Documentation Index
Fetch the complete documentation index at: https://docs.litejoin.io/llms.txt
Use this file to discover all available pages before exploring further.
Joins
Joins are the core of LiteJoin. They combine data from multiple topics using standard SQL and emit results to sinks every time new data arrives.
How Joins Work
- Data arrives from a source and is written to a topic (SQLite table).
- The joiner checks which join queries reference that topic.
- Those join queries are re-evaluated against the current state of all referenced topics.
- Results are grouped by a result key and emitted to the configured sink.
Joins are reactive — they fire on every write, not on a schedule.
Configuration
joins:
- name: order-user-join
query: |
SELECT
o.key as order_id,
json_extract(o.payload, '$.amount') as amount,
json_extract(u.payload, '$.name') as customer_name,
json_extract(u.payload, '$.email') as customer_email
FROM orders o
INNER JOIN users u
ON json_extract(o.payload, '$.user_id') = u.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
sink: webhook-out
Fields
| Field | Type | Required | Description |
|---|
name | string | yes | Unique name for the join. |
query | string | yes | SQL query joining one or more topics. |
sink | string | yes | Name of the sink to emit results to. |
key_column | string | no | Column to use as the result key for grouping. Defaults to the first column. |
result_key | string | no | Alias for the result key in the output. |
Join Types
LiteJoin supports all standard SQL join types:
INNER JOIN
Returns only rows where both sides match:
SELECT o.key, o.payload, u.payload
FROM orders o
INNER JOIN users u ON json_extract(o.payload, '$.user_id') = u.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
LEFT JOIN
Returns all rows from the left table, with NULL for unmatched right-side rows:
SELECT o.key, o.payload, u.payload
FROM orders o
LEFT JOIN users u ON json_extract(o.payload, '$.user_id') = u.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
Multi-way Joins
Join three or more topics:
SELECT
o.key as order_id,
json_extract(o.payload, '$.amount') as amount,
json_extract(u.payload, '$.name') as customer,
json_extract(p.payload, '$.name') as product
FROM orders o
LEFT JOIN users u ON json_extract(o.payload, '$.user_id') = u.key
LEFT JOIN products p ON json_extract(o.payload, '$.product_id') = p.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
Join results are emitted as JoinResult objects:
{
"query": "order-user-join",
"key": "order_123",
"rows": [
{
"order_id": "order_123",
"amount": 45.00,
"customer_name": "Alice",
"customer_email": "alice@example.com"
}
],
"emit_at": "2026-02-22T10:00:00Z"
}
Best Practices
Always include a time-bounded WHERE clause. Without it, join queries scan the entire topic, degrading performance as data grows. Use timestamp > (strftime('%s', 'now') - N) to limit to recent data.
- Index your join keys. Use the
key column for joins when possible — it’s indexed by default.
- Keep queries focused. Each join should produce one type of output. Use multiple joins for different result shapes.
- Use
json_extract() for access. All application data lives in the payload JSON column.
- Test with Studio. Use LiteJoin Studio to iterate on queries with live data before deploying.
Example: Payment Enrichment
sources:
- name: stripe_charges
type: api
topic: charges
api:
url: "https://api.stripe.com/v1/charges?limit=100"
interval: 10s
key_path: "id"
response_path: "data"
headers:
Authorization: "Bearer ${STRIPE_SECRET_KEY}"
- name: stripe_customers
type: api
topic: customers
api:
url: "https://api.stripe.com/v1/customers?limit=100"
interval: 5m
key_path: "id"
response_path: "data"
headers:
Authorization: "Bearer ${STRIPE_SECRET_KEY}"
joins:
- name: charge-customer-join
query: |
SELECT
c.key as charge_id,
json_extract(c.payload, '$.amount') as amount,
json_extract(c.payload, '$.currency') as currency,
json_extract(cust.payload, '$.email') as customer_email,
json_extract(cust.payload, '$.name') as customer_name
FROM charges c
LEFT JOIN customers cust
ON json_extract(c.payload, '$.customer') = cust.key
WHERE c.timestamp > (strftime('%s', 'now') - 60)
sink: webhook-out
sinks:
- type: http
name: webhook-out
config:
url: "http://localhost:9000/enriched-charges"