Joins
Joins are the core of LiteJoin. They combine data from multiple topics using standard SQL and emit results to sinks every time new data arrives.
How Joins Work
- Data arrives from a source and is written to a topic (SQLite table).
- The joiner checks which join queries reference that topic.
- Those join queries are re-evaluated against the current state of all referenced topics.
- Results are grouped by a result key and emitted to the configured sink.
Joins are reactive — they fire on every write, not on a schedule.
Configuration
joins:
- name: order-user-join
query: |
SELECT
o.key as order_id,
json_extract(o.payload, '$.amount') as amount,
json_extract(u.payload, '$.name') as customer_name,
json_extract(u.payload, '$.email') as customer_email
FROM orders o
INNER JOIN users u
ON json_extract(o.payload, '$.user_id') = u.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
sink: webhook-out
Fields
| Field | Type | Required | Description |
|---|
name | string | yes | Unique name for the join. |
query | string | yes | SQL query joining one or more topics. |
sink | string | yes | Name of the sink to emit results to. |
key_column | string | no | Column to use as the result key for grouping. Defaults to the first column. |
result_key | string | no | Alias for the result key in the output. |
Join Types
LiteJoin supports all standard SQL join types:
INNER JOIN
Returns only rows where both sides match:
SELECT o.key, o.payload, u.payload
FROM orders o
INNER JOIN users u ON json_extract(o.payload, '$.user_id') = u.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
LEFT JOIN
Returns all rows from the left table, with NULL for unmatched right-side rows:
SELECT o.key, o.payload, u.payload
FROM orders o
LEFT JOIN users u ON json_extract(o.payload, '$.user_id') = u.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
Multi-way Joins
Join three or more topics:
SELECT
o.key as order_id,
json_extract(o.payload, '$.amount') as amount,
json_extract(u.payload, '$.name') as customer,
json_extract(p.payload, '$.name') as product
FROM orders o
LEFT JOIN users u ON json_extract(o.payload, '$.user_id') = u.key
LEFT JOIN products p ON json_extract(o.payload, '$.product_id') = p.key
WHERE o.timestamp > (strftime('%s', 'now') - 3600)
Join results are emitted as JoinResult objects:
{
"query": "order-user-join",
"key": "order_123",
"rows": [
{
"order_id": "order_123",
"amount": 45.00,
"customer_name": "Alice",
"customer_email": "alice@example.com"
}
],
"emit_at": "2026-02-22T10:00:00Z"
}
Best Practices
Always include a time-bounded WHERE clause. Without it, join queries scan the entire topic, degrading performance as data grows. Use timestamp > (strftime('%s', 'now') - N) to limit to recent data.
- Index your join keys. Use the
key column for joins when possible — it’s indexed by default.
- Keep queries focused. Each join should produce one type of output. Use multiple joins for different result shapes.
- Use
json_extract() for access. All application data lives in the payload JSON column.
- Test with Studio. Use LiteJoin Studio to iterate on queries with live data before deploying.
Example: Payment Enrichment
sources:
- name: stripe_charges
type: api
topic: charges
api:
url: "https://api.stripe.com/v1/charges?limit=100"
interval: 10s
key_path: "id"
response_path: "data"
headers:
Authorization: "Bearer ${STRIPE_SECRET_KEY}"
- name: stripe_customers
type: api
topic: customers
api:
url: "https://api.stripe.com/v1/customers?limit=100"
interval: 5m
key_path: "id"
response_path: "data"
headers:
Authorization: "Bearer ${STRIPE_SECRET_KEY}"
joins:
- name: charge-customer-join
query: |
SELECT
c.key as charge_id,
json_extract(c.payload, '$.amount') as amount,
json_extract(c.payload, '$.currency') as currency,
json_extract(cust.payload, '$.email') as customer_email,
json_extract(cust.payload, '$.name') as customer_name
FROM charges c
LEFT JOIN customers cust
ON json_extract(c.payload, '$.customer') = cust.key
WHERE c.timestamp > (strftime('%s', 'now') - 60)
sink: webhook-out
sinks:
- type: http
name: webhook-out
config:
url: "http://localhost:9000/enriched-charges"