API Source
The `api` source type polls any HTTP endpoint on a configurable interval, diffs the response against previously stored state in SQLite, and emits only the changes as standard pipeline messages. Any request/response API becomes a real-time event stream, with no webhooks, no changes to the upstream system, and no event infrastructure required.
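
To make the poll, diff, and emit cycle concrete, here is a minimal Python sketch under stated assumptions. It is not LiteJoin's implementation: the `state` table layout, the `apply_poll` function, and the `created` and `updated` event names are hypothetical. Only the `deleted` event and the `detect_deletes` flag are named in this document.

```python
import json
import sqlite3


def apply_poll(conn, records, key_field="id", detect_deletes=False):
    """Diff one poll's records against stored state and return change events."""
    # Hypothetical state table: one row per record key, holding its last payload.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, payload TEXT)"
    )
    stored = dict(conn.execute("SELECT key, payload FROM state"))
    events, seen = [], set()
    for record in records:
        key = str(record[key_field])
        seen.add(key)
        payload = json.dumps(record, sort_keys=True)
        if stored.get(key) == payload:
            continue  # unchanged since the last poll: emit nothing
        kind = "updated" if key in stored else "created"
        events.append((kind, record))
        conn.execute(
            "INSERT OR REPLACE INTO state (key, payload) VALUES (?, ?)",
            (key, payload),
        )
    if detect_deletes:
        # Keys that were stored earlier but are absent from this poll.
        for key in sorted(stored.keys() - seen):
            events.append(("deleted", json.loads(stored[key])))
            conn.execute("DELETE FROM state WHERE key = ?", (key,))
    conn.commit()
    return events
```

Calling `apply_poll` repeatedly with successive responses returns only the records that differ from what was stored on the previous call.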
Quick Start
Poll an OpenWeatherMap endpoint every 30 seconds and emit changes. The only required fields in the api block are url, interval, and key_path. Everything else has a sensible default.
Config Reference
Source-Level Fields
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | yes | Must be "api" |
| name | string | yes | Unique name for this source |
| topic | string | yes | Topic (table) to write events to |
api Block Fields
Core
| Field | Type | Default | Description |
|---|---|---|---|
| url | string | required | Full URL to poll. Supports ${ENV_VAR} expansion. |
| method | string | GET | HTTP method: GET, POST, or PUT. |
| interval | duration | required | How often to poll (e.g. 30s, 1m, 5m). |
| key_path | gjson path | required | Path within each record to the unique key. |
| response_path | gjson path | "" | Path to the array of records in the response. Empty means top-level array or single object. |
| change_detection | string | diff | "diff" (field-level) or "hash" (SHA-256). |
| detect_deletes | bool | false | Emit a deleted event when a record disappears between polls. |
| timeout | duration | 30s | HTTP request timeout per page. |
| initial_snapshot | bool | true | If false, suppresses emission on the first poll. |
| headers | map | {} | HTTP headers. Values support ${ENV_VAR} expansion. |
| body | string | "" | Request body for POST/PUT. |
| max_consecutive_failures | int | 5 | Failed cycles before exponential backoff. |
Watermark Block
Controls incremental polling: how the source avoids re-fetching data it has already seen.

| Field | Type | Default | Description |
|---|---|---|---|
| strategy | string | none | none, timestamp, cursor, or etag. |
| path | gjson path | "" | Path to extract the watermark value. |
| param | string | "" | Query parameter name to set on subsequent requests. |
| format | string | RFC3339 | Go time layout for timestamp strategy. |
| initial | string | "" | Seed watermark value for first run. |
| overlap | duration | 10s | Subtract from watermark to handle clock skew. |
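
As an illustration of how a timestamp watermark and its overlap might interact, the Python sketch below takes the newest timestamp seen in a poll and subtracts the overlap to produce the next request's watermark. The function name `next_watermark` and the `updated_at` field are assumptions made for the example, and the exact rule LiteJoin applies is not specified here.

```python
from datetime import datetime, timedelta


def next_watermark(records, previous, path="updated_at",
                   overlap=timedelta(seconds=10)):
    """Return the timestamp to send as the watermark parameter on the next poll."""
    # Assumes each record carries an ISO 8601 timestamp with an explicit offset.
    timestamps = [datetime.fromisoformat(r[path]) for r in records]
    if not timestamps:
        return previous  # nothing new: keep the existing watermark
    # Step back by the overlap so records written by a slightly slow clock
    # are still fetched on the next poll.
    return (max(timestamps) - overlap).isoformat()
```

Stepping back means some records are fetched twice. In this sketch's model that is harmless, because the re-fetched records are identical to stored state and the diff step emits nothing for them.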
Pagination Block
| Field | Type | Default | Description |
|---|---|---|---|
| strategy | string | none | none, link_header, cursor, or offset. |
| param | string | "" | Query parameter for cursor value. |
| path | gjson path | "" | Path to extract cursor from response. |
| has_more_path | gjson path | "" | Boolean path indicating more pages exist. |
| offset_param | string | offset | Query parameter for offset. |
| limit_param | string | limit | Query parameter for page size. |
| limit | int | (none) | Page size (required for offset strategy). |
| total_path | gjson path | "" | Path to total record count. |
| max_pages | int | 100 | Hard cap on pages per poll cycle. |
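
The offset fields above can be pictured with a short Python sketch. It assumes a hypothetical `fetch_page` callable that stands in for the HTTP request and returns a body shaped like `{"items": [...], "total": n}`. Those names are illustrative only, and the stopping rules are one plausible reading of limit, total_path, and max_pages.

```python
def fetch_all_pages(fetch_page, limit, max_pages=100):
    """Collect records page by page using offset and limit parameters."""
    records = []
    for page in range(max_pages):  # hard cap on pages per poll cycle
        body = fetch_page(offset=page * limit, limit=limit)
        items = body["items"]
        records.extend(items)
        if len(items) < limit:
            break  # a short page means there is nothing further to fetch
        total = body.get("total")
        if total is not None and len(records) >= total:
            break  # the reported total has been reached
    return records
```

The `max_pages` cap bounds the work done in a single poll even if the upstream API misreports its total or keeps returning full pages.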
Watermark Strategies
- none: full scan on every poll, diffed against stored state. Simple and correct for any API. Best for small result sets and APIs with no filtering support.
- timestamp
- cursor
- etag
Pagination Strategies
- link_header: follows RFC 8288 Link headers with rel="next". Best for GitHub, GitLab, and other standards-compliant REST APIs.
- cursor
- offset
Change Detection Modes
diff (default)
Field-level comparison. Emits changed records annotated with _change metadata.
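
As a rough illustration of field-level comparison, the Python sketch below diffs two flat records and attaches the result to the new record. The layout of `_change` used here (a `fields` map of old and new values) is a hypothetical stand-in, not LiteJoin's documented format, and the function names are invented for the example.

```python
def field_diff(old, new):
    """Return {field: {"old": ..., "new": ...}} for every field that differs."""
    changes = {}
    for field in sorted(old.keys() | new.keys()):
        # A field missing on one side is treated as None in this sketch.
        before, after = old.get(field), new.get(field)
        if before != after:
            changes[field] = {"old": before, "new": after}
    return changes


def annotate(old, new):
    """Return the new record with a _change entry, or None if nothing changed."""
    changes = field_diff(old, new)
    if not changes:
        return None
    return {**new, "_change": {"fields": changes}}
```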
hash
SHA-256 hash of the full payload. If the hash changes, the record is emitted. No field-level metadata. Lower CPU overhead for large payloads.
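
A minimal Python sketch of hash-based detection follows. How LiteJoin serializes the payload before hashing is not stated here, so the canonical JSON encoding (sorted keys, compact separators) is an assumption, as are the function names.

```python
import hashlib
import json


def payload_hash(record):
    """Hash a canonical JSON encoding so key order does not affect the result."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(stored_hash, record):
    """Compare the record's current hash with the one stored from the last poll."""
    return payload_hash(record) != stored_hash
```

Only one digest per record needs to be stored and compared, which is why this mode carries no information about which field changed.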
Cookbook Examples
GitHub Pull Requests
Poll open PRs with `Link` header pagination and ETag caching.
Stripe Charges
Cursor-based pagination and watermark.
Jira Issues
Timestamp watermark with offset pagination.
Operational Guidance
Choosing a Poll Interval
| Use case | Suggested interval |
|---|---|
| Near-real-time tracking | 10s–30s |
| Operational data (orders, tickets) | 1m–5m |
| Reference data (users, products) | 5m–30m |
| Slow-changing config data | 1h or more |
Rate Limiting
LiteJoin handles rate limiting automatically:
- `429 Too Many Requests`: retried after the delay given in the `Retry-After` header
- `X-RateLimit-Remaining: 0`: pauses until reset
- 5xx errors: retried with exponential backoff (max 3 retries)
- 4xx errors (except 429): not retried (configuration problem)
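
The retry rules above can be sketched as a single decision function in Python. This is an illustration only. The one-second base delay, the doubling schedule, and the name `retry_delay` are assumptions, the sketch handles only the seconds form of `Retry-After`, and it leaves out the `X-RateLimit-Remaining` pause.

```python
def retry_delay(status, headers, attempt, base=1.0, max_retries=3):
    """Return seconds to wait before retrying, or None if no retry should happen.

    attempt is the number of retries already made for this request.
    """
    if status == 429:
        # Honour the server's requested delay (seconds form only).
        return float(headers.get("Retry-After", base))
    if 500 <= status < 600:
        if attempt >= max_retries:
            return None  # retries exhausted
        return base * (2 ** attempt)  # 1s, 2s, 4s with the default base
    return None  # other 4xx responses point to a configuration problem
```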
Environment Variables
Header values and URLs support `${ENV_VAR}` expansion.
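
A minimal Python sketch of this kind of expansion is shown below. What LiteJoin does when a referenced variable is unset is not stated here. Raising an error is this sketch's assumption, and `expand_env` is a hypothetical name.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace each ${NAME} in value with the matching environment variable."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_REF.sub(substitute, value)
```

Keeping secrets such as API tokens in the environment means the config file itself can be committed without exposing them.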
gjson Paths
All path fields use gjson syntax:

| Pattern | Example | What it accesses |
|---|---|---|
| field | id | Top-level field |
| a.b | data.id | Nested field |
| #.field | #.id | Field from all array elements |
| data.@last.id | - | Last element’s id in data array |
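
To show what the first three patterns select, here is a tiny Python resolver. It is not the gjson library and covers only dotted fields, numeric array indexes, and `#`. Modifiers such as `@last` and the rest of gjson's syntax are out of scope for the sketch.

```python
def resolve(value, path):
    """Resolve a dotted path against parsed JSON (dicts and lists)."""
    if path == "":
        return value
    head, _, rest = path.partition(".")
    if head == "#":
        if rest == "":
            return len(value)  # a bare '#' yields the array length
        # '#.' maps the remaining path over every array element.
        return [resolve(item, rest) for item in value]
    if isinstance(value, list):
        return resolve(value[int(head)], rest)  # numeric array index
    return resolve(value[head], rest)
```

For example, `resolve(doc, "data.id")` walks into the `data` object, while `resolve(items, "#.id")` collects `id` from every element of a top-level array.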