
High-performance BM25 + HNSW vector search using category theory, written in Rust


Vajra Search Engine (vajra-search)

Rust-backed search framework with a Python interface for:

  • lexical BM25 search,
  • vector ANN search (HNSW),
  • hybrid BM25 + vector fusion.

The package ships with a compiled Rust extension (vajra_search._vajra_search) and Python orchestration layers for embeddings, vector indexing, and hybrid fusion.

Installation

Base install:

pip install vajra-search

Optional embedding-model dependency (for TextEmbeddingMorphism):

pip install "vajra-search[vector]"

Search Modes

1) Lexical Search (BM25)

from vajra_search import Document, DocumentCorpus, VajraSearch

docs = [
    Document("1", "Rust for Search", "Rust enables predictable low-level performance."),
    Document("2", "BM25 Overview", "BM25 is a lexical ranking algorithm for keyword search."),
    Document("3", "Hybrid Retrieval", "Hybrid retrieval combines lexical and vector signals."),
]

corpus = DocumentCorpus(docs)
engine = VajraSearch(corpus, k1=1.5, b=0.75)

results = engine.search("bm25 keyword ranking", top_k=3)
for r in results:
    # BM25 rank from the Rust layer is zero-based; display as one-based.
    print(f"rank={r.rank + 1} id={r.doc_id} score={r.score:.4f} title={r.title}")
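The k1 and b arguments above are the standard Okapi BM25 parameters: k1 controls term-frequency saturation and b controls document-length normalization. For reference, the standard BM25 score these parameters plug into is (the exact IDF variant is implementation-specific):

```latex
\mathrm{score}(D, Q) \;=\; \sum_{q \in Q} \mathrm{IDF}(q) \cdot
  \frac{f(q, D)\,(k_1 + 1)}
       {f(q, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}
```

where f(q, D) is the frequency of term q in document D, |D| is the document length in tokens, and avgdl is the average document length over the corpus. Larger k1 lets repeated terms keep contributing; b = 1 fully normalizes by length, b = 0 ignores length.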

2) Vector Search (HNSW)

The example below uses a tiny deterministic embedder so it runs without external model downloads.

from typing import List

import numpy as np

from vajra_search import (
    Document,
    NativeHNSWIndex,
    VajraVectorSearch,
)
from vajra_search.embeddings import EmbeddingMorphism


class TinyEmbedding(EmbeddingMorphism[str]):
    """Very small keyword-count embedder for demos/tests."""

    VOCAB = ("rust", "search", "bm25", "vector")

    @property
    def dimension(self) -> int:
        return len(self.VOCAB)

    def embed(self, text: str) -> np.ndarray:
        t = text.lower()
        vec = np.array([t.count(tok) for tok in self.VOCAB], dtype=np.float32)
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec

    def embed_batch(self, texts: List[str]) -> np.ndarray:
        return np.vstack([self.embed(t) for t in texts]).astype(np.float32)


docs = [
    Document("1", "Rust Search", "Rust vector search with HNSW."),
    Document("2", "Lexical BM25", "BM25 is strong for exact keyword matching."),
    Document("3", "Vector Retrieval", "Vector search captures semantic similarity."),
]

embedder = TinyEmbedding()
index = NativeHNSWIndex(dimension=embedder.dimension, metric="cosine", max_elements=100)
vsearch = VajraVectorSearch(embedder, index)
vsearch.index_documents(docs, show_progress=False)

results = vsearch.search("vector search in rust", top_k=3)
for r in results:
    print(f"rank={r.rank} id={r.id} score={r.score:.4f} title={r.document.title}")
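HNSW returns approximate nearest neighbors under the chosen metric. For intuition, here is the exact brute-force ranking it approximates for cosine similarity, written in plain Python with no dependency on the library (the doc IDs and vectors are illustrative, mirroring the four-token vocabulary above):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def brute_force_search(query, vectors, top_k=3):
    """Exact nearest-neighbor search: score every vector, sort, truncate."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Counts over the VOCAB ("rust", "search", "bm25", "vector") for each document.
vectors = {
    "1": [1.0, 1.0, 0.0, 1.0],  # "Rust vector search with HNSW."
    "2": [0.0, 0.0, 1.0, 0.0],  # "BM25 is strong for exact keyword matching."
    "3": [0.0, 1.0, 0.0, 1.0],  # "Vector search captures semantic similarity."
}
query = [1.0, 1.0, 0.0, 1.0]    # "vector search in rust"
print(brute_force_search(query, vectors))
```

Brute force is O(n) per query; HNSW trades a small amount of recall for logarithmic-ish query cost, which is why the vector rows in the benchmark table below stay flat as the corpus grows.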

3) Hybrid Search (BM25 + Vector)

from typing import List

import numpy as np

from vajra_search import (
    Document,
    DocumentCorpus,
    HybridSearchEngine,
    NativeHNSWIndex,
    VajraSearch,
    VajraVectorSearch,
)
from vajra_search.embeddings import EmbeddingMorphism


class TinyEmbedding(EmbeddingMorphism[str]):
    VOCAB = ("rust", "search", "bm25", "vector")

    @property
    def dimension(self) -> int:
        return len(self.VOCAB)

    def embed(self, text: str) -> np.ndarray:
        t = text.lower()
        vec = np.array([t.count(tok) for tok in self.VOCAB], dtype=np.float32)
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec

    def embed_batch(self, texts: List[str]) -> np.ndarray:
        return np.vstack([self.embed(t) for t in texts]).astype(np.float32)


docs = [
    Document("1", "Rust HNSW", "Rust implementation of HNSW vector search."),
    Document("2", "BM25 Fundamentals", "BM25 ranks documents by lexical relevance."),
    Document("3", "Hybrid Ranking", "Hybrid ranking combines BM25 and vector signals."),
]

corpus = DocumentCorpus(docs)
bm25 = VajraSearch(corpus)

embedder = TinyEmbedding()
index = NativeHNSWIndex(dimension=embedder.dimension, metric="cosine", max_elements=100)
vector = VajraVectorSearch(embedder, index)
vector.index_documents(docs, show_progress=False)

hybrid = HybridSearchEngine(bm25, vector, alpha=0.5, method="rrf")
results = hybrid.search("rust vector search ranking", top_k=3)
for r in results:
    print(f"rank={r.rank} id={r.id} score={r.score:.4f} title={r.document.title}")
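The method="rrf" argument suggests reciprocal rank fusion. As a sketch of the standard RRF formula (not necessarily the library's exact implementation; the constant k=60 is the common default from the original RRF paper and is an assumption here):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank(d)).

    `rankings` is a list of ranked doc-id lists, best first. Documents that
    rank well in several lists accumulate the highest fused score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical per-mode orderings for the three documents above.
bm25_ranking = ["2", "3", "1"]
vector_ranking = ["1", "3", "2"]
fused = rrf_fuse([bm25_ranking, vector_ranking])
```

Because RRF uses only ranks, it needs no score normalization between the BM25 and vector scales, which is why it is a popular default for hybrid fusion.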

Runnable Examples

These scripts are included in the repository:

  • examples/lexical_search.py
  • examples/vector_search.py
  • examples/hybrid_search.py

Run:

python examples/lexical_search.py
python examples/vector_search.py
python examples/hybrid_search.py

API Surface (Python)

Main exports:

  • BM25: Document, DocumentCorpus, BM25Params, VajraSearch, VajraSearchParallel
  • HNSW: HnswIndex (raw Rust binding), NativeHNSWIndex (Python wrapper)
  • Vector layer: VajraVectorSearch, VectorSearchResult
  • Hybrid layer: HybridSearchEngine
  • Embeddings: TextEmbeddingMorphism, PrecomputedEmbeddingMorphism, IdentityEmbeddingMorphism

Persistence

  • Vector index persistence is exposed via:
    • NativeHNSWIndex.save(path)
    • NativeHNSWIndex.load(path)
    • VajraVectorSearch.save(path) / VajraVectorSearch.load(path, embedder, index_class)
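A hypothetical round trip built only from the method names listed above; it continues from the vector-search example (reusing its `vsearch`, `index`, and `embedder`), and the paths and `load(...)` argument order are illustrative assumptions, not confirmed signatures:

```python
# Sketch only: continues from the vector-search example above.
from vajra_search import NativeHNSWIndex, VajraVectorSearch

# Persist just the raw HNSW index (path is an illustrative assumption).
index.save("hnsw_index.bin")
restored_index = NativeHNSWIndex.load("hnsw_index.bin")

# Persist the full vector-search layer; restoring it needs the same
# embedder and index class, per the signature listed above.
vsearch.save("vsearch_state")
restored = VajraVectorSearch.load("vsearch_state", embedder, NativeHNSWIndex)
```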

Reproducibility and Benchmarks

  • Reproduction steps: reproduction.md
  • Benchmark harness and datasets are documented in the companion benchmark repos referenced from the project documentation.

Benchmark Snapshot (Python Interface)

Measured on 2026-03-02 on Darwin arm64, Python 3.13.7 using:

  • query protocol: 10 warmup + 100 measured queries (top_k=10)
  • corpus: deterministic synthetic topic-keyword documents with mixed selectivity queries (broad + selective)
  • modes benchmarked through the Python API (VajraSearch, VajraSearchParallel, VajraVectorSearch, HybridSearchEngine)
  • lexical_parallel measures per-query latency from batched search_batch execution
  • vector numbers use a tiny deterministic embedder, so they represent index-path latency (not transformer inference latency)
Size     Mode              Build (s)   p50 (ms)   p95 (ms)   p99 (ms)   QPS
1,000    lexical           0.006       0.019      0.113      0.114      25799.8
1,000    lexical_parallel  0.004       0.020      0.024      0.024      49387.8
1,000    vector            0.055       0.016      0.019      0.019      62366.4
1,000    hybrid            0.060       0.057      0.158      0.223      11873.4
10,000   lexical           0.048       0.244      1.410      1.865      2018.7
10,000   lexical_parallel  0.044       0.157      0.171      0.171      6664.5
10,000   vector            0.508       0.017      0.018      0.019      61373.2
10,000   hybrid            0.566       0.279      1.651      1.740      2039.0
20,000   lexical           0.091       0.517      4.436      6.013      759.2
20,000   lexical_parallel  0.089       0.281      0.331      0.331      3512.9
20,000   vector            1.005       0.017      0.039      0.060      47738.4
20,000   hybrid            1.192       0.659      4.042      4.402      785.4
50,000   lexical           0.238       1.859      12.796     13.658     237.9
50,000   lexical_parallel  0.276       1.115      1.257      1.257      882.2
50,000   vector            2.625       0.017      0.026      0.035      55942.5
50,000   hybrid            2.815       1.837      12.550     13.674     237.9

Re-run this benchmark:

./.venv/bin/python scripts/benchmark_python_modes.py --sizes 1000 10000 20000 50000

Raw outputs are written to:

  • scripts/benchmark_python_modes_latest.json
  • scripts/benchmark_python_modes_latest.md

Wikipedia Vector Benchmark Snapshot (Companion Harness)

For production-style vector benchmarking against Wikipedia embeddings (1k/10k/20k/50k) and ZVec comparison, use the companion harness documented in reproduction.md.

Important for reproducible build-time comparisons: install the local extension with native CPU flags enabled:

RUSTFLAGS="-C target-cpu=native" pip install -e ~/Github/vajra_search_engine --no-build-isolation

One 50k snapshot from that track (fresh run on 2026-03-04):

Engine/Profile   Build (s)   p50 (ms)   QPS       Recall@10
ZVec             2.724       0.764      1312.7    0.998
Vajra quality    51.083      0.208      4753.0    0.998
Vajra fast       17.340      0.170      5682.1    0.908
Vajra instant    4.194       0.075      10787.2   0.718

Release Checks

Before publishing to PyPI:

./scripts/release_check.sh

This validates:

  • Python tests
  • coverage threshold (>=80%)
  • Rust workspace tests (with pinned PyO3 interpreter)
  • wheel build + clean-venv install/import smoke test

Release Process

  • Tag push (v*) builds cross-platform wheels/sdist and creates a GitHub Release.
  • PyPI upload is a separate manual action via GitHub workflow (Publish PyPI), using the chosen tag.
  • Full runbook: RELEASING.md

License

MIT

Project details


Download files


Source Distribution

vajra_search-0.2.1.tar.gz (77.5 kB)

Uploaded: Source

Built Distributions


vajra_search-0.2.1-cp314-cp314-macosx_11_0_arm64.whl (955.4 kB)

Uploaded: CPython 3.14, macOS 11.0+ ARM64

vajra_search-0.2.1-cp312-cp312-win_amd64.whl (914.3 kB)

Uploaded: CPython 3.12, Windows x86-64

vajra_search-0.2.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)

Uploaded: CPython 3.8, manylinux: glibc 2.17+ x86-64

File details

Details for the file vajra_search-0.2.1.tar.gz.

File metadata

  • Download URL: vajra_search-0.2.1.tar.gz
  • Upload date:
  • Size: 77.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for vajra_search-0.2.1.tar.gz
Algorithm     Hash digest
SHA256        d77a0a99483918268400ab973c8206c41444fb6f330add69f4eb093261b6d7da
MD5           0e8bf6dbb9417fe11d89e293733a6aeb
BLAKE2b-256   0955250e3f9e4d41966fdc7754ea9ea22433677c3a3350ee16b2e314e9f5d4e0


File details

Details for the file vajra_search-0.2.1-cp314-cp314-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for vajra_search-0.2.1-cp314-cp314-macosx_11_0_arm64.whl
Algorithm     Hash digest
SHA256        4e51e7c7861acc4b7bb83437febfbd399ad530bbe78cc277a7e94911527fd068
MD5           ba51d1e55ddbf1531de6f87bf40e1696
BLAKE2b-256   62dae9c1806034f22a3b791d439c7836a844d3ced755ee12f3d0c0b6b5801ef4


File details

Details for the file vajra_search-0.2.1-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for vajra_search-0.2.1-cp312-cp312-win_amd64.whl
Algorithm     Hash digest
SHA256        03b8f5167a82692590e8cc8b2765ebe99221dda58977bd1d2387cbf74a85dee6
MD5           459217ff7cd1165df0891d5299c1e970
BLAKE2b-256   0f958bef0b763260feded4ec04516ce4872269a04b8030cdb0f88675a1a0fc96


File details

Details for the file vajra_search-0.2.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for vajra_search-0.2.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm     Hash digest
SHA256        9e5d9af8d908c34e0360c9081ab1be2ce1f43222c85189eefb653eba70e59734
MD5           bcd014f40de548dffb832b9518075870
BLAKE2b-256   6a1032673edf79cf94d7caa67a9d61d52f83f4018abb9a35f1256b4a02b7c410

