A lightweight library for nearest neighbor search in Rust.
Overview
NilVec is a high-performance, memory-efficient vector search library designed to handle both embeddings and associated metadata without compromising query accuracy or speed. By decoupling metadata from the core embedding data during distance calculations, NilVec ensures that search accuracy remains high while keeping memory overhead minimal.
In our benchmarks, NilVec achieved a 95.5% improvement in query latency compared to Chroma, making it an excellent choice for real-time applications and large-scale search deployments.
Key Features
- Memory Efficiency: NilVec stores vectors in a contiguous block of memory and tracks metadata separately, avoiding unnecessary duplication and overhead.
- High Performance: Benchmarked to deliver a 95.5% improvement in query latency over comparable systems, ensuring rapid search responses.
- Flexible and Ergonomic API: Built in Rust with a Python interface, NilVec supports simple operations for inserting vectors, searching, and bulk index creation, all while handling metadata seamlessly.
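To make the contiguous layout mentioned under Memory Efficiency concrete, here is a minimal pure-Python sketch. It is not NilVec's implementation (which is written in Rust); the class name `FlatVectorStore` and its methods are hypothetical and exist only for illustration. Vector `i` occupies the slice `[i * dim, (i + 1) * dim)` of a single flat buffer, while metadata is kept in a separate list.

```python
from array import array


class FlatVectorStore:
    """Hypothetical illustration: all vectors share one flat buffer."""

    def __init__(self, dim):
        self.dim = dim
        self.data = array("d")  # one contiguous block of float64 values
        self.metadata = []      # kept apart from the vector data

    def add(self, vector, meta=None):
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} values, got {len(vector)}")
        self.data.extend(vector)
        self.metadata.append(meta)
        return len(self.metadata) - 1  # id of the newly added vector

    def get(self, i):
        # Vector i occupies the slice [i * dim, (i + 1) * dim).
        start = i * self.dim
        return self.data[start:start + self.dim]

    def __len__(self):
        return len(self.metadata)


store = FlatVectorStore(dim=3)
first = store.add([1.0, 2.0, 3.0], {"color": "blue"})
second = store.add([4.0, 5.0, 6.0], {"color": "red"})
```

Because every vector has the same length, an id alone is enough to locate its values, so no per-vector container or pointer is needed.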
How It Works
NilVec separates the embedding components from metadata so that only the core vector elements contribute to distance calculations. Metadata is stored in parallel and associated via a schema that maps attribute names (as Strings) to their corresponding positions in the metadata array. This design guarantees that metadata does not interfere with the accuracy of nearest neighbor searches.
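The idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not NilVec's code: the names `schema`, `metadata`, and `filtered_search` are hypothetical, and the search is an exhaustive scan rather than the graph-based index that the `PyHNSW` class name suggests. The point it shows is that the schema only resolves an attribute name to a position in the metadata row, and the distance function never sees metadata.

```python
def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


# Schema: attribute name -> position in each metadata row (hypothetical layout).
schema = {"color": 0, "size": 1}

vectors = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
# Metadata rows run parallel to `vectors`; row i describes vector i.
metadata = [
    ["blue", 42],
    ["red", 7],
    ["blue", 13],
]


def filtered_search(query, k, attr, value):
    """Exhaustive top-k by inner product over rows whose `attr` equals `value`."""
    pos = schema[attr]
    scored = [
        (inner_product(query, vec), i)
        for i, (vec, row) in enumerate(zip(vectors, metadata))
        if row[pos] == value
    ]
    # A larger inner product means a closer match, so sort in descending order.
    scored.sort(reverse=True)
    return scored[:k]


hits = filtered_search([1.0, 0.2], k=2, attr="color", value="blue")
```

Here `hits` holds `(score, vector_id)` pairs for the two "blue" vectors only; the "red" vector is excluded by the filter before any scoring takes place.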
Benchmarks
Our benchmarks compare NilVec with Chroma using the following setup:
- Configuration:
  - Dimension: 10
  - Number of insertions: 100 vectors
  - Number of queries: 10 queries with metadata filtering
- Results:
  - NilVec demonstrated a 95.5% improvement in query latency compared to Chroma.
  - Insertion latency is also highly optimized, ensuring minimal overhead during data ingestion.
Below is an excerpt from our benchmark script:
import time
import random
import numpy as np
import nilvec
import chromadb
# Configuration
dim = 10
num_inserts = 100
num_queries = 10
categories = ["news", "blog", "report"]
# --- Chroma Benchmark ---
chroma_query_times = []
for i in range(num_queries):
    query = [random.random() for _ in range(dim)]
    filter_category = random.choice(categories)
    start_time = time.perf_counter()
    # Execute query on Chroma...
    elapsed = time.perf_counter() - start_time
    chroma_query_times.append(elapsed)
# --- NilVec Benchmark ---
nilvec_query_times = []
hnsw = nilvec.PyHNSW(dim, None, None, None, None, "inner_product", ["category"])
for i in range(num_queries):
    query = [random.random() for _ in range(dim)]
    filter_category = random.choice(categories)
    start_time = time.perf_counter()
    results = hnsw.search(query, 5, ("category", filter_category))
    elapsed = time.perf_counter() - start_time
    nilvec_query_times.append(elapsed)
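The excerpt stops once the two timing lists are collected. One common way to reduce such lists to a single percentage is the relative reduction in mean latency. The sketch below assumes that definition; it is not a statement of how the 95.5% figure was actually computed. The helper name `percent_improvement` is hypothetical, and the sample timings are made up purely to exercise the function.

```python
from statistics import mean


def percent_improvement(baseline_times, candidate_times):
    """Relative reduction in mean latency, as a percentage of the baseline."""
    baseline = mean(baseline_times)
    candidate = mean(candidate_times)
    return (baseline - candidate) / baseline * 100.0


# Made-up timings in seconds, for illustration only (not benchmark results).
example_baseline = [0.020, 0.030, 0.010]
example_candidate = [0.002, 0.003, 0.001]
improvement = percent_improvement(example_baseline, example_candidate)
print(f"{improvement:.1f}% lower mean query latency")
```

With the lists from the script, the call would be `percent_improvement(chroma_query_times, nilvec_query_times)`. For small samples such as 10 queries, reporting the median alongside the mean helps guard against a single slow outlier.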
Usage
Installation
NilVec is distributed as a Python package via its PyO3 bindings. You can install it using pip:
pip install nilvec
Examples
Below is a quick example of how to use NilVec in your Python project:
import nilvec
# Create an index with dimension 128 using inner product as the metric.
# Optionally, you can provide a schema for metadata.
index = nilvec.PyHNSW(128, None, None, None, None, "inner_product", ["color", "size"])
# Insert a vector with associated metadata.
vector = [0.1] * 128
metadata = [("color", "blue"), ("size", 42)]
index.insert(vector, metadata)
# Perform a search query with metadata filtering.
query = [0.1] * 128
results = index.search(query, k=5, filter=("color", "blue"))
for distance, vector in results:
    print("Distance:", distance, "Vector:", vector)
# Alternatively, bulk-create an index from a list of vectors.
vectors = [
    [0.1] * 128,
    [0.2] * 128,
    [0.3] * 128
]
index.create(vectors)
Testing
To run the NilVec test suite, execute:
cargo test
Download files
File details
Details for the file nilvec-0.1.5.tar.gz.
File metadata
- Download URL: nilvec-0.1.5.tar.gz
- Upload date:
- Size: 491.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 17965dd83f7488c90b603ffdeb650e5ce4901d471738454d1aa7508018cb7257 |
| MD5 | 3991ed510239e99517b5edfecaed2e85 |
| BLAKE2b-256 | 007c680c7b198baf7ebed3e3b8c4562708e2c8cf4b6828386955f069048daedf |
File details
Details for the file nilvec-0.1.5-cp312-cp312-macosx_11_0_arm64.whl.
File metadata
- Download URL: nilvec-0.1.5-cp312-cp312-macosx_11_0_arm64.whl
- Upload date:
- Size: 287.3 kB
- Tags: CPython 3.12, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 97744158fe4be852d90400ee9adc11d56a3992e56a67430e0f9072ec42d10983 |
| MD5 | 171f3b6c9e2c7a3a4b4dde547c3728ab |
| BLAKE2b-256 | 21de8b6e056007bb742bb8f463d70f0060cf1275363d4d66873f854fa543fc67 |