Replication

LiteJoin integrates with Litestream for continuous backup of SQLite shards to cloud object storage. Litestream ships WAL frames to S3, GCS, or Azure Blob in near-real-time, providing disaster recovery with minimal data loss.

Why Replicate?

  • Data loss on crash — LiteJoin targets containerized deployments where local disk is ephemeral. A pod restart means data is gone.
  • Restore-on-startup — Containers can boot, restore shards from S3, and resume processing automatically.
  • Point-in-time recovery — Roll back to any point within the retention window after a bad deployment.

The simplest approach: run Litestream alongside LiteJoin as a sidecar process.

1. Create litestream.yml

dbs:
  - path: ./data/shard_0.db
    replicas:
      - type: s3
        bucket: my-litejoin-backups
        path: shard_0
        region: us-east-1
        sync-interval: 1s

  - path: ./data/shard_1.db
    replicas:
      - type: s3
        bucket: my-litejoin-backups
        path: shard_1
        region: us-east-1
        sync-interval: 1s

  # ... one entry per shard (match your shard_count)
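
Because the entries differ only by shard index, the file can be generated rather than written by hand. The following is a minimal sketch, not part of LiteJoin: `gen_litestream_config` is a hypothetical helper, and the bucket, region, and `./data/shard_N.db` naming are simply copied from the example above.

```shell
#!/bin/sh
# Hypothetical helper: print a litestream.yml with one S3 replica per shard.
# Arguments: shard count, bucket name, region.
gen_litestream_config() {
  shard_count=$1 bucket=$2 region=$3
  echo "dbs:"
  i=0
  while [ "$i" -lt "$shard_count" ]; do
    cat <<EOF
  - path: ./data/shard_$i.db
    replicas:
      - type: s3
        bucket: $bucket
        path: shard_$i
        region: $region
        sync-interval: 1s

EOF
    i=$((i + 1))
  done
}

# Print a two-shard config to stdout; redirect to litestream.yml to use it.
gen_litestream_config 2 my-litejoin-backups us-east-1
```

Generating the file from the same number used for shard_count keeps the two from drifting apart.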

2. Startup Script

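#!/bin/sh
# run.sh: restore shards, then start Litestream with LiteJoin
# Uses POSIX sh because the Alpine runtime image does not ship bash.
set -eu

SHARD_COUNT=8   # must match your shard_count

# Restore each shard from its replica.
#   -if-db-not-exists   skip when the local file already exists
#   -if-replica-exists  exit 0 when no backup exists yet (first boot)
for i in $(seq 0 $((SHARD_COUNT - 1))); do
  litestream restore -if-db-not-exists -if-replica-exists \
    -o "./data/shard_$i.db" \
    "s3://my-litejoin-backups/shard_$i"
done

# Start Litestream replicate, exec LiteJoin as child process
exec litestream replicate -exec "litejoin run -c litejoin.yaml"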

The -exec flag makes Litestream the parent process: it starts replication, launches LiteJoin as a subprocess, and gracefully flushes WAL frames on shutdown.

3. Dockerfile

FROM golang:1.24-alpine AS build
RUN apk add --no-cache gcc musl-dev
WORKDIR /app
COPY . .
RUN CGO_ENABLED=1 go build -o litejoin ./cmd/litejoin

FROM alpine:3.21
RUN apk add --no-cache litestream ca-certificates
COPY --from=build /app/litejoin /usr/local/bin/
COPY litestream.yml /etc/litestream.yml
COPY run.sh /run.sh
RUN chmod +x /run.sh
ENTRYPOINT ["/run.sh"]

Configuration

replication:
  enabled: true
  mode: "sidecar"
  replica_url: "s3://my-bucket/litejoin"
  sync_interval: 1s
  restore_on_boot: true
  snapshot_interval: 1h
  retention: 72h

Field              Type      Default  Description
enabled            bool      false    Enable replication.
mode               string    sidecar  "sidecar" or "embedded".
replica_url        string    (none)   S3/GCS/Azure URL for replicas.
sync_interval      duration  1s       How often WAL frames are shipped.
restore_on_boot    bool      true     Restore shards from replica on startup.
snapshot_interval  duration  1h       How often full snapshots are created.
retention          duration  72h      How long to keep replica history.

Replica Layout

s3://my-bucket/litejoin/
├── shard_0/
│   └── generations/
│       └── a1b2c3d4/
│           ├── snapshots/
│           └── wal/
├── shard_1/
│   └── ...
├── ...
└── dlq/
    └── ...   # DLQ database also replicated
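
The layout above suggests that each shard is stored under its own `shard_N` prefix below `replica_url`. As a sketch under that assumption, a per-shard URL can be derived as follows; `shard_replica_url` is a hypothetical helper name, not a LiteJoin command.

```shell
#!/bin/sh
# Hypothetical helper: build the per-shard replica URL from the base replica_url.
# Assumes the <replica_url>/shard_<N> layout shown above.
shard_replica_url() {
  base=${1%/}                      # drop one trailing slash, if present
  printf '%s/shard_%s\n' "$base" "$2"
}

shard_replica_url "s3://my-bucket/litejoin" 0
shard_replica_url "s3://my-bucket/litejoin/" 1
```

A URL built this way can be passed to `litestream restore -o ./data/shard_0.db <url>` when restoring a single shard by hand.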

Restore Flow

On startup with restore_on_boot: true:
  1. For each shard, check if the local file exists.
  2. If missing, restore from the latest snapshot + replay WAL frames from the replica.
  3. If present, skip restore and start replication.
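
The decision in the steps above can be sketched as a dry run. This is an illustration only: `plan_restore` is a hypothetical function that reports what would happen for each shard and never calls Litestream, and the `shard_N.db` file naming is taken from the sidecar example.

```shell
#!/bin/sh
# Dry-run sketch of the restore decision; it only inspects the data directory.
# Arguments: data directory, shard count.
plan_restore() {
  data_dir=$1 shard_count=$2
  i=0
  while [ "$i" -lt "$shard_count" ]; do
    if [ -f "$data_dir/shard_$i.db" ]; then
      echo "shard_$i: present, skip restore"
    else
      echo "shard_$i: missing, restore from replica"
    fi
    i=$((i + 1))
  done
}

# Example: one shard already on disk, one missing.
tmp=$(mktemp -d)
: > "$tmp/shard_0.db"
plan_restore "$tmp" 2
rm -rf "$tmp"
```

Running a check like this before a deployment shows which shards a fresh container would have to pull from the replica.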

Performance Impact

Litestream reads WAL frames asynchronously from disk — it does not sit in the write path. Benchmarks show < 1% write latency impact in sidecar mode.

Supported Backends

Backend               URL Format
Amazon S3             s3://bucket-name/prefix
Google Cloud Storage  gcs://bucket-name/prefix
Azure Blob Storage    abs://container-name/prefix
SFTP                  sftp://user@host/path
Local file            file:///path/to/backup

Litestream uses standard cloud SDK credentials (AWS_ACCESS_KEY_ID, GOOGLE_APPLICATION_CREDENTIALS, etc.). See the Litestream documentation for details.
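
A deployment script may want to reject an unsupported replica_url before starting anything. The sketch below maps the URL schemes from the table above to their backends; `backend_for_url` is a hypothetical helper, and LiteJoin's own validation may differ.

```shell
#!/bin/sh
# Hypothetical helper: name the backend for a replica URL, or fail.
backend_for_url() {
  case $1 in
    s3://*)   echo "Amazon S3" ;;
    gcs://*)  echo "Google Cloud Storage" ;;
    abs://*)  echo "Azure Blob Storage" ;;
    sftp://*) echo "SFTP" ;;
    file://*) echo "Local file" ;;
    *)        echo "unsupported replica URL: $1" >&2; return 1 ;;
  esac
}

backend_for_url "s3://my-bucket/litejoin"
backend_for_url "ftp://example/backup" 2>/dev/null || echo "rejected"
```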

Important Considerations

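Litestream is a backup and recovery tool, not a high-availability solution. There is no automatic failover: if LiteJoin crashes, it must restart and restore its shards before it can resume. Writes made since the last successful sync may be lost, so the data-loss window is bounded by sync_interval, while downtime depends on how long the restore takes.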
  • Shard count changes require re-sharding. The FNV hash distribution changes when you modify shard_count, invalidating existing replicas.
  • DLQ replication — the dlq.db database should also be replicated for complete disaster recovery.
  • Encryption — Litestream does not encrypt WAL frames. Use server-side encryption (SSE-S3, SSE-KMS) for encryption at rest.
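
To illustrate why changing shard_count invalidates existing replicas, the sketch below routes keys with a hash-modulo scheme. The FNV variant and routing formula LiteJoin uses are not specified here, so this assumes 32-bit FNV-1a and `hash % shard_count`; `fnv1a32` and `shard_for` are hypothetical names.

```shell
#!/bin/sh
# 32-bit FNV-1a of a string, printed as an unsigned decimal.
# Assumption: LiteJoin's actual FNV variant may differ.
fnv1a32() {
  h=2166136261                                   # FNV offset basis
  for b in $(printf '%s' "$1" | od -An -v -tu1); do
    h=$(( ((h ^ b) * 16777619) & 4294967295 ))   # FNV prime, keep 32 bits
  done
  printf '%s\n' "$h"
}

# Shard index for a key under the assumed hash-modulo routing.
shard_for() {
  echo $(( $(fnv1a32 "$1") % $2 ))
}

# Compare shard assignments before and after doubling the shard count.
for key in a user:42 order:7; do
  echo "$key: shard $(shard_for "$key" 8) of 8, shard $(shard_for "$key" 16) of 16"
done
```

Under these assumptions the key a moves from shard 4 (of 8) to shard 12 (of 16), so a replica of the old shard_4 no longer holds the data that the new layout expects in shard_12.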