USE CASE · AI SEARCH & RETRIEVAL

AI-NATIVE
SEARCH &
RETRIEVAL.

Your RAG pipeline has a vector database that doesn't know about text, and a text search engine that doesn't know about vectors. XERJ runs BM25 and HNSW in one query tree, one cost model, one execution pass — so hybrid search is a feature, not an integration project.
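To make "hybrid" concrete: a hybrid query has to merge a lexical ranking (BM25) with a vector ranking (HNSW nearest neighbours) into one result list. This page does not describe how XERJ fuses the two internally, so the sketch below substitutes reciprocal rank fusion (RRF), a common generic technique, purely as an illustration. The function name, the document ids, and the constant `k=60` are assumptions, not part of XERJ.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of doc ids with reciprocal rank fusion.

    Each document earns 1 / (k + rank) from every list it appears in;
    documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; doc id breaks ties deterministically.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc9", "doc3"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

In a two-system setup this merge runs in application code after two network round trips. Running both retrievers in one query tree is what removes that step.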

RETRIEVAL INTENT CLUSTERS · 48H · UMAP 2D
Clusters: RAG RETRIEVAL · CODE ASSIST · DOC Q&A · EXTRACT JSON · CLASSIFY · AGENT TOOL
UMAP · 830 embeddings · 6 clusters

THE TWO-SYSTEM PROBLEM

THE XERJ ANSWER

MEMORY SAVINGS
SQ8 quantization: ~1–2% recall loss

1
QUERY FOR HYBRID SEARCH
BM25 + HNSW fusion in one execution pass

16,384
MAX DIMENSIONS
4× Elasticsearch's 4,096 limit
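For readers new to SQ8: scalar quantization stores each vector component as one byte instead of a four-byte float. The exact scheme XERJ uses is not specified on this page. The sketch below is a generic per-vector min/max variant, with hypothetical function names, meant only to show where the memory saving and the small accuracy loss come from.

```python
def sq8_encode(vec):
    """Map each float to an integer code in 0..255 (one byte per component)."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(min(255, round((x - lo) / scale)) for x in vec)
    return codes, lo, scale


def sq8_decode(codes, lo, scale):
    """Reconstruct approximate floats from the byte codes."""
    return [lo + c * scale for c in codes]


# Hypothetical 4-dimensional embedding.
vec = [0.12, -0.5, 0.33, 0.9]
codes, lo, scale = sq8_encode(vec)
restored = sq8_decode(codes, lo, scale)

# Rounding error per component is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vec, restored))
print(len(codes), "bytes;", "max error", max_error, "<= step/2 =", scale / 2)
```

The bounded per-component error is why distances computed on the codes stay close to the float32 distances, at the cost of a small recall loss.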
MEMORY · XERJ SQ8 vs ES + PINECONE (FLOAT32)
CORPUS            XERJ (SQ8)   ES + PINECONE   RATIO
1M × 768-dim      1.2 GB       4.6 GB          3.8×
5M × 768-dim      5.8 GB       23 GB           4.0×
10M × 1536-dim    18 GB        92 GB           5.1×
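As a sanity check on the table, the raw vector payload is simple arithmetic: vectors × dimensions × bytes per component. The sketch below computes it for the three corpus sizes above, using decimal gigabytes (1 GB = 10^9 bytes, an assumption about the table's units). The helper name is hypothetical.

```python
def raw_vector_gb(n_vectors, dims, bytes_per_component):
    """Size of the vector payload alone, in decimal GB (no index overhead)."""
    return n_vectors * dims * bytes_per_component / 1e9


corpora = [(1_000_000, 768), (5_000_000, 768), (10_000_000, 1536)]

# (vectors, dims, float32 GB, 8-bit GB) for each corpus size.
rows = [
    (n, d, raw_vector_gb(n, d, 4), raw_vector_gb(n, d, 1))
    for n, d in corpora
]

for n, d, f32_gb, sq8_gb in rows:
    print(f"{n:>10,} x {d:<5} float32: {f32_gb:6.2f} GB   8-bit: {sq8_gb:6.2f} GB")
```

Every figure in the table is larger than the corresponding raw payload, presumably because the totals also cover index structures and other overhead. The page does not break that down, so treat the payload numbers as a lower bound rather than a reproduction of the benchmark.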

SEE IT LIVE.

The playbook walks the full recipe — schema, ingest command, queries, and the dashboard. The playground runs on seeded data; benchmarks were measured against Elasticsearch 8.13 on 2026-04-14.

OPEN THE PLAYBOOK · OPEN THE PLAYGROUND
READY? · REQUEST ACCESS

RUN IT ON
YOUR DATA.

Send us your embedding model and a sample corpus. We'll run hybrid search benchmarks — recall@10, latency p95, memory per million vectors — and share the reproduction.
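For reference, two of the metrics named above can be computed as follows. This is a minimal sketch under stated assumptions, not the benchmark harness itself: recall@10 is taken as the share of the exact top-10 neighbours that the approximate search also returned, and p95 uses the nearest-rank percentile method. Function names and sample values are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k results found in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    return len(exact_top & set(approx_ids[:k])) / len(exact_top)


def percentile_nearest_rank(values, pct=95):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = -(-pct * len(ordered) // 100)  # integer ceil(pct * n / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical results for one query: 8 of the 10 true neighbours were found.
exact = list(range(10))
approx = [0, 1, 2, 3, 4, 5, 6, 7, 11, 12]
recall = recall_at_k(approx, exact)

# Hypothetical per-query latencies in milliseconds.
latencies_ms = [12, 15, 11, 40, 13, 14, 16, 12, 13, 90]
p95 = percentile_nearest_rank(latencies_ms)

print(f"recall@10 = {recall:.2f}, p95 = {p95} ms")
```

With only ten samples the p95 is simply the slowest query, which is why latency percentiles are only meaningful over a large query set.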

We only use this email to send you the binary. Ever.