Backup & restore
Segments are immutable once written and the WAL is append-only, so backups are a plain filesystem-level operation. You don't need a snapshot API or any coordination step: copy the data directory while the server is running, and restore it by copying it back.
Hot backup
Segments never change after they land on disk, so a running rsync of the data directory is consistent per-segment. The only moving pieces are:
- The tail WAL file (may get appends during the copy).
- The active memtables (not on disk yet).
Before the rsync, call POST /v1/admin/flush to force a memtable flush so recent writes are persisted as segments:
$ curl -sXPOST -H "Authorization: ApiKey $XERJ_API_KEY" \
http://127.0.0.1:8080/v1/admin/flush
{"flushed_indices": 4, "segments_written": 4, "wal_checkpoint": 12847}
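Then rsync the data directory to your backup target, and repeat at whatever snapshot interval you want. Because segments are immutable and content-addressed, a repeat rsync into a destination that already holds earlier segments transfers only the new ones. Note that the example below writes to a fresh dated directory, so each run is a full copy unless you reuse the destination or pass rsync's --link-dest option pointing at the previous snapshot.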
$ rsync -aH --delete \
/var/lib/xerj/ \
backup.internal:/srv/backups/xerj/$(date +%F)/
This is compatible with ZFS snapshots, LVM snapshots, or any filesystem-level backup tool. If you use ZFS, the snapshot after a flush is genuinely point-in-time consistent and you don't need the rsync step.
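One way to keep dated snapshot directories without re-copying every segment is rsync's --link-dest option, which hard-links files that are unchanged relative to an earlier snapshot. The sketch below assumes the snapshots live under a local or mounted directory given as an absolute path, with subdirectories named YYYY-MM-DD. The function names latest_snapshot and snapshot are illustrative and not part of XERJ.

```shell
#!/bin/sh
# Sketch only: dated snapshots that share unchanged segment files via hard links.
# Assumption: $root is an absolute path on a filesystem that supports hard links.

# Print the newest YYYY-MM-DD directory under $1, or nothing if none exist.
latest_snapshot() {
    # ISO dates sort chronologically under a plain lexical sort.
    ls -1d "$1"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] 2>/dev/null | sort | tail -n 1
}

# Copy data directory $1 into $2/<today>, linking against the previous snapshot.
snapshot() {
    src=$1
    root=$2
    today=$(date +%F)
    prev=$(latest_snapshot "$root")
    if [ -n "$prev" ] && [ "$prev" != "$root/$today" ]; then
        # Files identical to those in $prev become hard links instead of new copies.
        rsync -aH --delete --link-dest="$prev" "$src"/ "$root/$today"/
    else
        # First snapshot, or today's snapshot already exists: plain sync.
        rsync -aH --delete "$src"/ "$root/$today"/
    fi
}
```

A run such as `snapshot /var/lib/xerj /srv/backups/xerj` after the flush would then cost roughly the size of the new segments plus the WAL tail, while every dated directory still looks like a complete copy.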
S3 / object store backup
XERJ exposes POST /v1/admin/backup which streams a full copy of the data directory to any S3-compatible endpoint:
$ curl -sXPOST -H "Authorization: ApiKey $XERJ_API_KEY" \
-H "Content-Type: application/json" \
http://127.0.0.1:8080/v1/admin/backup \
-d '{
"destination": "s3://my-backups/xerj/nightly",
"credentials": {
"endpoint": "https://s3.us-east-1.amazonaws.com",
"access_key": "AKIA...",
"secret_key": "..."
},
"flush_first": true
}'
{"backup_id":"bk-2026-04-15T03-00-00Z","bytes":142857600,"duration_ms":8421}
Backups are incremental by default — only segments the destination doesn't already hold are uploaded.
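For scheduled runs you may prefer not to embed the S3 keys in the command line. A minimal sketch, assuming the keys are supplied through the hypothetical environment variables XERJ_S3_ACCESS_KEY and XERJ_S3_SECRET_KEY: the helper below (backup_body is an illustrative name) prints the same request body shown above, and curl reads it from standard input.

```shell
#!/bin/sh
# Sketch only: build the /v1/admin/backup request body from the environment.
# XERJ_S3_ACCESS_KEY and XERJ_S3_SECRET_KEY are hypothetical variable names.
# Values are inserted verbatim, so they must not contain quotes or backslashes.

# $1 = destination URL, $2 = S3 endpoint
backup_body() {
    # Abort with a message if either key is missing.
    : "${XERJ_S3_ACCESS_KEY:?set XERJ_S3_ACCESS_KEY}" "${XERJ_S3_SECRET_KEY:?set XERJ_S3_SECRET_KEY}"
    printf '{"destination":"%s","credentials":{"endpoint":"%s","access_key":"%s","secret_key":"%s"},"flush_first":true}\n' \
        "$1" "$2" "$XERJ_S3_ACCESS_KEY" "$XERJ_S3_SECRET_KEY"
}

# Usage (curl's "-d @-" reads the request body from stdin):
#   backup_body s3://my-backups/xerj/nightly https://s3.us-east-1.amazonaws.com |
#     curl -sXPOST -H "Authorization: ApiKey $XERJ_API_KEY" \
#          -H "Content-Type: application/json" \
#          http://127.0.0.1:8080/v1/admin/backup -d @-
```

Because the body travels over a pipe, the secret key does not appear in curl's argument list.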
Restore
Stop the server, replace the data directory, start the server. No import step — the WAL replay on startup reconstructs memtable state from whatever files are on disk.
$ sudo systemctl stop xerj
$ sudo rm -rf /var/lib/xerj/*
$ sudo rsync -aH backup.internal:/srv/backups/xerj/2026-04-15/ /var/lib/xerj/
$ sudo chown -R xerj:xerj /var/lib/xerj
$ sudo systemctl start xerj
$ curl -sf http://127.0.0.1:8080/v1/health/ready
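Since startup includes WAL replay, the readiness check may plausibly fail for a short while right after the service starts. A small polling helper, sketched here under that assumption (retry is an illustrative name, and the attempt count and delay are arbitrary), avoids treating the first failed probe as a failed restore.

```shell
#!/bin/sh
# Sketch only: run a command until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command to run
retry() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Stop as soon as the command exits with status 0.
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage: probe readiness for up to about a minute after "systemctl start xerj".
#   retry 30 2 curl -sf http://127.0.0.1:8080/v1/health/ready
```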
Per-index backup
To back up a single index (e.g. before a destructive migration), copy its subdirectory after a flush:
$ curl -sXPOST -H "Authorization: ApiKey $XERJ_API_KEY" \
http://127.0.0.1:8080/v1/indices/events/flush
$ sudo tar czf events-$(date +%F).tar.gz -C /var/lib/xerj events
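Before running the destructive migration, it may be worth confirming that the archive is actually readable. The sketch below wraps the tar step with that check. archive_index is an illustrative name, and the function assumes it runs as a user that can read the index directory.

```shell
#!/bin/sh
# Sketch only: archive one index directory and confirm the archive can be read back.
# $1 = data directory, $2 = index name, $3 = output .tar.gz path
archive_index() {
    tar czf "$3" -C "$1" "$2" || return 1
    # Listing reads the whole compressed stream, so a truncated archive fails here.
    tar tzf "$3" >/dev/null
}

# Usage:
#   archive_index /var/lib/xerj events "events-$(date +%F).tar.gz"
```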
Clustered backup
In a cluster, each node holds a subset of shards — you need a consistent backup of every node. The simplest pattern is to drain one node at a time, snapshot it, and reactivate:
# Drain
$ curl -sXPOST .../v1/cluster/nodes/b/drain
# Wait for "shards_remaining": 0
$ rsync -aH /var/lib/xerj/ backup.internal:/srv/backups/xerj/b/
$ curl -sXPOST .../v1/cluster/nodes/b/activate
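The "wait" step can be scripted by checking the shards_remaining field in a status response. The sketch below covers only the parsing side and reads JSON from standard input. Which endpoint reports shards_remaining is not stated here, so STATUS_URL in the usage comment is a placeholder you would have to fill in, and both function names are illustrative.

```shell
#!/bin/sh
# Sketch only: decide from a JSON status document whether a node has finished draining.

# Print the integer value of "shards_remaining" found on stdin, if any.
shards_remaining() {
    sed -n 's/.*"shards_remaining"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Succeed only when the field is present and equal to 0.
is_drained() {
    [ "$(shards_remaining)" = "0" ]
}

# Usage (STATUS_URL is a placeholder for whichever endpoint reports the field):
#   until curl -s "$STATUS_URL" | is_drained; do sleep 5; done
```

A missing field counts as "not drained", so an error response keeps the loop waiting rather than letting the rsync start early.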
Alternatively, take per-node S3 backups nightly and trust that Raft replay will reconcile the metadata when a restored node rejoins.
Source · engine/crates/storage/src/backend.rs · engine/crates/api/src/native.rs