High-performance BM25 + HNSW vector search using category theory, written in Rust
Project description
Vajra Search Engine (vajra-search)
Rust-backed search framework with a Python interface for:
- lexical BM25 search,
- vector ANN search (HNSW),
- hybrid BM25 + vector fusion.
The package ships with a compiled Rust extension (`vajra_search._vajra_search`) and Python orchestration layers for embeddings, vector indexing, and hybrid fusion.
Installation
Base install:
```bash
pip install vajra-search
```
Optional embedding-model dependency (for TextEmbeddingMorphism):
pip install "vajra-search[vector]"
Search Modes
1) Lexical Search (BM25)
```python
from vajra_search import Document, DocumentCorpus, VajraSearch

docs = [
    Document("1", "Rust for Search", "Rust enables predictable low-level performance."),
    Document("2", "BM25 Overview", "BM25 is a lexical ranking algorithm for keyword search."),
    Document("3", "Hybrid Retrieval", "Hybrid retrieval combines lexical and vector signals."),
]

corpus = DocumentCorpus(docs)
engine = VajraSearch(corpus, k1=1.5, b=0.75)

results = engine.search("bm25 keyword ranking", top_k=3)
for r in results:
    # BM25 rank from the Rust layer is zero-based; display as one-based.
    print(f"rank={r.rank + 1} id={r.doc_id} score={r.score:.4f} title={r.title}")
```
2) Vector Search (HNSW)
The example below uses a tiny deterministic embedder so it runs without external model downloads.
```python
from typing import List

import numpy as np

from vajra_search import (
    Document,
    NativeHNSWIndex,
    VajraVectorSearch,
)
from vajra_search.embeddings import EmbeddingMorphism


class TinyEmbedding(EmbeddingMorphism[str]):
    """Very small keyword-count embedder for demos/tests."""

    VOCAB = ("rust", "search", "bm25", "vector")

    @property
    def dimension(self) -> int:
        return len(self.VOCAB)

    def embed(self, text: str) -> np.ndarray:
        t = text.lower()
        vec = np.array([t.count(tok) for tok in self.VOCAB], dtype=np.float32)
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec

    def embed_batch(self, texts: List[str]) -> np.ndarray:
        return np.vstack([self.embed(t) for t in texts]).astype(np.float32)


docs = [
    Document("1", "Rust Search", "Rust vector search with HNSW."),
    Document("2", "Lexical BM25", "BM25 is strong for exact keyword matching."),
    Document("3", "Vector Retrieval", "Vector search captures semantic similarity."),
]

embedder = TinyEmbedding()
index = NativeHNSWIndex(dimension=embedder.dimension, metric="cosine", max_elements=100)
vsearch = VajraVectorSearch(embedder, index)
vsearch.index_documents(docs, show_progress=False)

results = vsearch.search("vector search in rust", top_k=3)
for r in results:
    print(f"rank={r.rank} id={r.id} score={r.score:.4f} title={r.document.title}")
```
3) Hybrid Search (BM25 + Vector)
```python
from typing import List

import numpy as np

from vajra_search import (
    Document,
    DocumentCorpus,
    HybridSearchEngine,
    NativeHNSWIndex,
    VajraSearch,
    VajraVectorSearch,
)
from vajra_search.embeddings import EmbeddingMorphism


class TinyEmbedding(EmbeddingMorphism[str]):
    VOCAB = ("rust", "search", "bm25", "vector")

    @property
    def dimension(self) -> int:
        return len(self.VOCAB)

    def embed(self, text: str) -> np.ndarray:
        t = text.lower()
        vec = np.array([t.count(tok) for tok in self.VOCAB], dtype=np.float32)
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec

    def embed_batch(self, texts: List[str]) -> np.ndarray:
        return np.vstack([self.embed(t) for t in texts]).astype(np.float32)


docs = [
    Document("1", "Rust HNSW", "Rust implementation of HNSW vector search."),
    Document("2", "BM25 Fundamentals", "BM25 ranks documents by lexical relevance."),
    Document("3", "Hybrid Ranking", "Hybrid ranking combines BM25 and vector signals."),
]

corpus = DocumentCorpus(docs)
bm25 = VajraSearch(corpus)

embedder = TinyEmbedding()
index = NativeHNSWIndex(dimension=embedder.dimension, metric="cosine", max_elements=100)
vector = VajraVectorSearch(embedder, index)
vector.index_documents(docs, show_progress=False)

hybrid = HybridSearchEngine(bm25, vector, alpha=0.5, method="rrf")
results = hybrid.search("rust vector search ranking", top_k=3)
for r in results:
    print(f"rank={r.rank} id={r.id} score={r.score:.4f} title={r.document.title}")
```
Runnable Examples
These scripts are included in-repo:
- examples/lexical_search.py
- examples/vector_search.py
- examples/hybrid_search.py
Run:
```bash
python examples/lexical_search.py
python examples/vector_search.py
python examples/hybrid_search.py
```
API Surface (Python)
Main exports:
- BM25: `Document`, `DocumentCorpus`, `BM25Params`, `VajraSearch`, `VajraSearchParallel`
- HNSW: `HnswIndex` (raw Rust binding), `NativeHNSWIndex` (Python wrapper)
- Vector layer: `VajraVectorSearch`, `VectorSearchResult`
- Hybrid layer: `HybridSearchEngine`
- Embeddings: `TextEmbeddingMorphism`, `PrecomputedEmbeddingMorphism`, `IdentityEmbeddingMorphism`
Persistence
Vector index persistence is exposed via:
- `NativeHNSWIndex.save(path)` / `NativeHNSWIndex.load(path)`
- `VajraVectorSearch.save(path)` / `VajraVectorSearch.load(path, embedder, index_class)`
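A minimal round trip using these methods, continuing the vector-search example above (`vsearch`, `TinyEmbedding`, and `NativeHNSWIndex` already defined); the path is illustrative and the positional argument order follows the signature listed above.

```python
# Persist the index and its document metadata (path is illustrative).
vsearch.save("artifacts/vector_index")

# Later, or in another process: rebind the same embedder implementation and index class.
restored = VajraVectorSearch.load("artifacts/vector_index", TinyEmbedding(), NativeHNSWIndex)
for r in restored.search("vector search in rust", top_k=3):
    print(r.id, round(r.score, 4), r.document.title)
```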
Reproducibility and Benchmarks
- Repro steps: reproduction.md
- Benchmark harness and datasets are documented in the companion benchmark repos referenced from the project documentation.
Benchmark Snapshot (Python Interface)
Measured on 2026-03-02 on Darwin arm64, Python 3.13.7 using:
- query protocol: 10 warmup + 100 measured queries (`top_k=10`)
- corpus: deterministic synthetic topic-keyword documents with mixed-selectivity queries (broad + selective)
- modes benchmarked through the Python API (`VajraSearch`, `VajraSearchParallel`, `VajraVectorSearch`, `HybridSearchEngine`)
- `lexical_parallel` measures per-query latency from batched `search_batch` execution
- `vector` numbers use a tiny deterministic embedder, so they represent index-path latency (not transformer inference latency)
| Size | Mode | Build (s) | p50 (ms) | p95 (ms) | p99 (ms) | QPS |
|---|---|---|---|---|---|---|
| 1,000 | lexical | 0.006 | 0.019 | 0.113 | 0.114 | 25799.8 |
| 1,000 | lexical_parallel | 0.004 | 0.020 | 0.024 | 0.024 | 49387.8 |
| 1,000 | vector | 0.055 | 0.016 | 0.019 | 0.019 | 62366.4 |
| 1,000 | hybrid | 0.060 | 0.057 | 0.158 | 0.223 | 11873.4 |
| 10,000 | lexical | 0.048 | 0.244 | 1.410 | 1.865 | 2018.7 |
| 10,000 | lexical_parallel | 0.044 | 0.157 | 0.171 | 0.171 | 6664.5 |
| 10,000 | vector | 0.508 | 0.017 | 0.018 | 0.019 | 61373.2 |
| 10,000 | hybrid | 0.566 | 0.279 | 1.651 | 1.740 | 2039.0 |
| 20,000 | lexical | 0.091 | 0.517 | 4.436 | 6.013 | 759.2 |
| 20,000 | lexical_parallel | 0.089 | 0.281 | 0.331 | 0.331 | 3512.9 |
| 20,000 | vector | 1.005 | 0.017 | 0.039 | 0.060 | 47738.4 |
| 20,000 | hybrid | 1.192 | 0.659 | 4.042 | 4.402 | 785.4 |
| 50,000 | lexical | 0.238 | 1.859 | 12.796 | 13.658 | 237.9 |
| 50,000 | lexical_parallel | 0.276 | 1.115 | 1.257 | 1.257 | 882.2 |
| 50,000 | vector | 2.625 | 0.017 | 0.026 | 0.035 | 55942.5 |
| 50,000 | hybrid | 2.815 | 1.837 | 12.550 | 13.674 | 237.9 |
Re-run this benchmark:
```bash
./.venv/bin/python scripts/benchmark_python_modes.py --sizes 1000 10000 20000 50000
```
Raw outputs are written to:
- scripts/benchmark_python_modes_latest.json
- scripts/benchmark_python_modes_latest.md
Wikipedia Vector Benchmark Snapshot (Companion Harness)
For production-style vector benchmarking against Wikipedia embeddings (1k/10k/20k/50k) and ZVec comparison, use the companion harness documented in reproduction.md.
Important for reproducible build-time comparisons: install the local extension with native CPU flags enabled:
```bash
RUSTFLAGS="-C target-cpu=native" pip install -e ~/Github/vajra_search_engine --no-build-isolation
```
One 50k snapshot from that track (fresh run on 2026-03-04):
| Engine/Profile | Build (s) | p50 (ms) | QPS | Recall@10 |
|---|---|---|---|---|
| ZVec | 2.724 | 0.764 | 1312.7 | 0.998 |
| Vajra quality | 51.083 | 0.208 | 4753.0 | 0.998 |
| Vajra fast | 17.340 | 0.170 | 5682.1 | 0.908 |
| Vajra instant | 4.194 | 0.075 | 10787.2 | 0.718 |
Release Checks
Before publishing to PyPI:
```bash
./scripts/release_check.sh
```
This validates:
- Python tests
- coverage threshold (>=80%)
- Rust workspace tests (with pinned PyO3 interpreter)
- wheel build + clean-venv install/import smoke test
Release Process
- Tag push (`v*`) builds cross-platform wheels/sdist and creates a GitHub Release.
- PyPI upload is a separate manual action via the GitHub workflow (`Publish PyPI`), using the chosen tag.
- Full runbook: RELEASING.md
License
MIT
Download files
File details
Details for the file vajra_search-0.2.1.tar.gz.
File metadata
- Download URL: vajra_search-0.2.1.tar.gz
- Upload date:
- Size: 77.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `d77a0a99483918268400ab973c8206c41444fb6f330add69f4eb093261b6d7da` |
| MD5 | `0e8bf6dbb9417fe11d89e293733a6aeb` |
| BLAKE2b-256 | `0955250e3f9e4d41966fdc7754ea9ea22433677c3a3350ee16b2e314e9f5d4e0` |
File details
Details for the file vajra_search-0.2.1-cp314-cp314-macosx_11_0_arm64.whl.
File metadata
- Download URL: vajra_search-0.2.1-cp314-cp314-macosx_11_0_arm64.whl
- Upload date:
- Size: 955.4 kB
- Tags: CPython 3.14, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `4e51e7c7861acc4b7bb83437febfbd399ad530bbe78cc277a7e94911527fd068` |
| MD5 | `ba51d1e55ddbf1531de6f87bf40e1696` |
| BLAKE2b-256 | `62dae9c1806034f22a3b791d439c7836a844d3ced755ee12f3d0c0b6b5801ef4` |
File details
Details for the file vajra_search-0.2.1-cp312-cp312-win_amd64.whl.
File metadata
- Download URL: vajra_search-0.2.1-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 914.3 kB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `03b8f5167a82692590e8cc8b2765ebe99221dda58977bd1d2387cbf74a85dee6` |
| MD5 | `459217ff7cd1165df0891d5299c1e970` |
| BLAKE2b-256 | `0f958bef0b763260feded4ec04516ce4872269a04b8030cdb0f88675a1a0fc96` |
File details
Details for the file vajra_search-0.2.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: vajra_search-0.2.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 1.1 MB
- Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `9e5d9af8d908c34e0360c9081ab1be2ce1f43222c85189eefb653eba70e59734` |
| MD5 | `bcd014f40de548dffb832b9518075870` |
| BLAKE2b-256 | `6a1032673edf79cf94d7caa67a9d61d52f83f4018abb9a35f1256b4a02b7c410` |