In-memory nearest neighbor search engine for Python, implemented in Rust.
nndex
A high-performance Rust library with Python bindings for nearest-neighbor vector search. It needs zero configuration and is so fast, it's faster than numpy! This crate leverages a computational trick: if the source vectors and query vectors are all unit-normalized, their dot product (an operation faster than vector distance calculations) equals their cosine similarity.
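The identity behind that trick can be checked in a few lines. The following is a minimal pure-Python sketch, not code from nndex; the helper names `dot`, `normalize`, and `cosine_similarity` are illustrative only.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a non-zero vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Textbook cosine similarity: dot product divided by both norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

full = cosine_similarity(a, b)              # divides by both norms at query time
shortcut = dot(normalize(a), normalize(b))  # norms paid once, up front

print(round(full, 6), round(shortcut, 6))
```

Both values agree up to floating-point rounding, which is why normalizing the matrix once at build time lets every later query use a bare dot product.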
Features:
- CPU backend using rayon parallelism + SIMD (via simsimd), along with highly bespoke compute profiles for maximum CPU performance
- GPU backend using wgpu compute shaders, supporting Vulkan, Metal, D3D12, and OpenGL graphics APIs
- Approximate nearest-neighbor (ANN) mode with exact reranking by building an IVF index for even faster lookups
- Batch search for multiple queries at once
- Python bindings via PyO3 with numpy, pandas, and polars support
- Load embeddings directly from .npy, .npz, and .parquet files
- Internal query-result caching for repeated searches
Disclosure: This library was mostly coded with the assistance of Claude Opus 4.6 and GPT-5.3-Codex as research into the discovery that those models can now successfully hyperoptimize Rust code. However, I personally have reviewed all code to ensure it is accurate, have added numerous tests and benchmarks to ensure it works as both intended and advertised, and have edited documentation and comments to provide greater signal as to how the package operates. I have given this project the same care and attention as I would give a project I have written from scratch.
Installation
Python
pip install nndex
uv pip install nndex
Rust
Add to your Cargo.toml:
[dependencies]
nndex = "0.2.1"
Python Usage
See also the demo notebooks for more interactive examples.
Building an Index
import numpy as np
from nndex import NNdex
rng = np.random.default_rng(42)
rows, dims = 50_000, 128  # dims is reused by the query examples below
matrix = rng.normal(size=(rows, dims)).astype(np.float32)
# Build an index (uses the CPU backend by default; see Backend Selection)
index = NNdex(matrix)
print(index.backend) # "cpu"
print(index.rows, index.dims) # 50000 128
Single Query
search() returns a tuple of (indices, similarities) as numpy arrays, ordered from most similar to least similar.
query = rng.normal(size=(dims,)).astype(np.float32)
indices, scores = index.search(query, k=5)
print(indices) # [43576 14100 15993 35409 38916]
print(scores) # [0.360 0.353 0.335 0.332 0.323]
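For intuition about the shape and ordering of that return value, here is a brute-force reference in pure Python. It is a sketch under stated assumptions, not nndex's implementation: `top_k_cosine` is a hypothetical helper, it assumes no all-zero vectors, and it scores every row on a single thread.

```python
import heapq
import math

def top_k_cosine(matrix, query, k):
    """Return (indices, similarities) for the k rows most similar to query,
    ordered from most similar to least similar."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    q = unit(query)
    # One cosine similarity per row: dot product of unit-normalized vectors.
    scores = [sum(x * y for x, y in zip(unit(row), q)) for row in matrix]
    # nlargest returns the row indices sorted by descending score.
    best = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return best, [scores[i] for i in best]

toy_matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
toy_indices, toy_scores = top_k_cosine(toy_matrix, [1.0, 0.1], k=2)
print(toy_indices)  # [0, 2]
```

A reference like this is handy for spot-checking results on a small matrix.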
Batch Query
Pass a 2D array to search multiple queries at once, which is more efficient than querying one at a time. Returns 2D numpy arrays.
queries = rng.normal(size=(4, dims)).astype(np.float32)
batch_indices, batch_scores = index.search(queries, k=5)
print(batch_indices.shape) # (4, 5)
print(batch_scores.shape) # (4, 5)
Approximate Nearest Neighbors
Set approx=True for sub-millisecond queries on large matrices. This mode uses a dimensionality-reduced prefilter followed by exact reranking.
index_ann = NNdex(matrix, approx=True)
indices, scores = index_ann.search(query, k=5)
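The general "cheap prefilter, then exact rerank" pattern can be sketched in pure Python. This is an illustration of the idea only: it assumes plain truncation to the first few coordinates as a stand-in for dimensionality reduction, and the names `prefilter_then_rerank`, `reduced_dims`, and `shortlist` are hypothetical, not part of nndex.

```python
import heapq
import math
import random

def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prefilter_then_rerank(rows, query, k, reduced_dims, shortlist):
    """rows and query are assumed to be unit-normalized already."""
    # Stage 1: cheap approximate scores from the first reduced_dims coordinates.
    cheap = [dot(r[:reduced_dims], query[:reduced_dims]) for r in rows]
    candidates = heapq.nlargest(shortlist, range(len(rows)), key=cheap.__getitem__)
    # Stage 2: exact cosine similarity, computed only for the shortlist.
    exact = {i: dot(rows[i], query) for i in candidates}
    best = heapq.nlargest(k, candidates, key=exact.__getitem__)
    return best, [exact[i] for i in best]

random.seed(0)
rows = [unit([random.gauss(0, 1) for _ in range(32)]) for _ in range(500)]
query = unit([random.gauss(0, 1) for _ in range(32)])

approx_ids, approx_scores = prefilter_then_rerank(rows, query, k=5, reduced_dims=8, shortlist=50)
# Using every dimension and every row as the shortlist degenerates to exact search.
exact_ids, exact_scores = prefilter_then_rerank(rows, query, k=5, reduced_dims=32, shortlist=500)

recall = len(set(approx_ids) & set(exact_ids)) / 5
print(f"recall@5 against exact search: {recall:.2f}")
```

The similarities returned for the approximate result are still exact; what can be lost is recall, because a true neighbor may not survive the prefilter.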
Backend Selection
# CPU (default)
cpu_index = NNdex(matrix, backend="cpu")
# Force GPU
gpu_index = NNdex(matrix, backend="gpu")
Pre-Normalized Data
If your embeddings are already unit-normalized, set normalized=True to skip the internal normalization step and save compute:
normalized_matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
index = NNdex(normalized_matrix, normalized=True)
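Before passing normalized=True, it can be worth confirming that every row really has unit length. A minimal sketch of such a check follows; `is_unit_normalized` and its tolerance are assumptions for illustration, not part of the nndex API.

```python
import math

def is_unit_normalized(rows, tol=1e-4):
    """True if every row has an L2 norm within tol of 1.0.

    The default tolerance is an assumption chosen to leave headroom
    for float32 rounding; tighten or loosen it for your data.
    """
    return all(
        math.isclose(math.sqrt(sum(x * x for x in row)), 1.0, abs_tol=tol)
        for row in rows
    )

print(is_unit_normalized([[0.6, 0.8], [1.0, 0.0]]))  # True
print(is_unit_normalized([[3.0, 4.0]]))              # False
```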
DataFrame Output
As a convenience, pass a pandas or polars DataFrame to dataframe= to get results back as DataFrames with a similarity column appended. The DataFrame must have the same number of rows as the index.
import pandas as pd
df = pd.DataFrame(matrix, columns=[f"d{i}" for i in range(dims)])
index = NNdex(matrix, backend="cpu")
# Single query: returns a DataFrame
result = index.search(query, k=3, dataframe=df)
# Batch query: returns a list of DataFrames
results = index.search(queries, k=3, dataframe=df)
This also works with polars DataFrames:
import polars as pl
pldf = pl.DataFrame({f"d{i}": matrix[:, i] for i in range(dims)})
result = index.search(query, k=3, dataframe=pldf)
Loading from Disk
NNdex.from_file() loads embeddings directly from .npy, .npz, or .parquet files in Rust, avoiding Python-side deserialization overhead. .npz and .parquet require a key argument.
# .npy (single 2D array)
index = NNdex.from_file("embeddings.npy")
# .npz (keyed archive)
index = NNdex.from_file("embeddings.npz", key="matrix_store")
# .parquet (list/fixed-size-list column of f32)
index = NNdex.from_file("embeddings.parquet", key="embedding")
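To try the loaders you first need files on disk. The sketch below writes .npy and .npz inputs with numpy and reads them back as a sanity check. The temporary directory is an assumption for illustration, the key name matrix_store simply mirrors the example above, and the .parquet case is left out.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 16)).astype(np.float32)

out_dir = tempfile.mkdtemp()
npy_path = os.path.join(out_dir, "embeddings.npy")
npz_path = os.path.join(out_dir, "embeddings.npz")

# .npy holds a single array; .npz is an archive of named arrays.
np.save(npy_path, embeddings)
np.savez(npz_path, matrix_store=embeddings)

# Round-trip check: the array name used in savez is the key to look up later.
loaded = np.load(npy_path)
with np.load(npz_path) as archive:
    keys = list(archive.files)
    loaded_npz = archive["matrix_store"]

print(loaded.shape, loaded.dtype, keys)
```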
Rust Usage
use nndex::{NNdex, IndexOptions, BackendPreference, Neighbor};

fn main() -> Result<(), nndex::NNdexError> {
    let matrix = vec![
        1.0_f32, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    ];
    let index = NNdex::new(&matrix, 3, 8, IndexOptions {
        normalized: false,
        approx: false,
        backend: BackendPreference::Cpu,
        ..IndexOptions::default()
    })?;
    let query = vec![0.8_f32, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0];
    let neighbors = index.search(&query, 2)?;
    for neighbor in &neighbors {
        println!("index: {}, similarity: {:.4}", neighbor.index, neighbor.similarity);
    }
    Ok(())
}
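The expected result for this small example can be worked out by hand. The Python sketch below redoes the arithmetic on the same flattened row-major data, assuming the similarity is plain cosine similarity as documented; it is an independent check, not a call into nndex.

```python
import math

rows, dims = 3, 8
flat = [
    1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
]
query = [0.8, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Row r of a row-major matrix occupies flat[r * dims : (r + 1) * dims].
matrix = [flat[r * dims:(r + 1) * dims] for r in range(rows)]
scored = sorted(((cosine(row, query), i) for i, row in enumerate(matrix)), reverse=True)
top2 = scored[:2]

for similarity, index in top2:
    print(f"index: {index}, similarity: {similarity:.4f}")
# index: 2, similarity: 0.9899
# index: 0, similarity: 0.8000
```

Row 2 wins because the query points between the first two axes: its dot product with (1, 1, 0, ...) is 1.4, and dividing by that row's norm of √2 gives about 0.9899.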
Benchmarks
[BENCHMARK IMAGES TO BE ADDED]
Notes
- nndex is NOT a vector store/database (which would imply that vectors can be created, updated, or deleted in the matrix), and it is not intended to be one. It's intended to be used with a fixed matrix of data, although this crate is so fast that you could reinitialize the NNdex without much overhead if needed.
- For Apple Silicon in particular, the GPU backend (Metal) is not recommended below 100k rows because the dispatch overhead of wgpu is greater than the time spent on the computation itself. This is not the case with discrete GPUs.
- BLAS is only supported on macOS because the underlying BLAS library (Accelerate) is included by default. There are tradeoffs for Linux/Windows and I am still determining what to do there.
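The Apple Silicon guidance above can be turned into a small helper that picks the string to pass as backend=. This is a hypothetical sketch: `suggest_backend`, `is_apple_silicon`, and the `has_gpu` flag are illustrative names, and the 100k threshold is taken from the note rather than measured on your hardware.

```python
import platform

def is_apple_silicon():
    """Best-effort check for a native Python build on Apple Silicon."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"

def suggest_backend(rows, apple_silicon, has_gpu=True, min_gpu_rows=100_000):
    """Return "cpu" or "gpu" following the rule of thumb in the notes."""
    if not has_gpu:
        return "cpu"
    # On Apple Silicon, small matrices are dominated by GPU dispatch overhead.
    if apple_silicon and rows < min_gpu_rows:
        return "cpu"
    return "gpu"

print(suggest_backend(50_000, apple_silicon=True))   # cpu
print(suggest_backend(200_000, apple_silicon=True))  # gpu
```

Treat the result as a starting point and benchmark both backends on your own data.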
API Reference
Python
NNdex(data, ...)
| Parameter | Type | Default | Description |
|---|---|---|---|
| data | array-like | required | 2D numpy array, list, or pandas/polars DataFrame |
| normalized | bool | False | Skip internal normalization if data is already unit-norm |
| approx | bool | False | Enable ANN prefiltering with exact reranking |
| backend | str | "cpu" | "cpu" or "gpu" |
| enable_cache | bool | True | Cache repeated query results |
NNdex.from_file(path, ...)
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | str | required | Path to .npy, .npz, or .parquet file |
| key | str/None | None | Array key (.npz) or column name (.parquet) |
| normalized | bool | False | Skip internal normalization if data is already unit-norm |
| approx | bool | False | Enable ANN prefiltering with exact reranking |
| backend | str | "cpu" | "cpu" or "gpu" |
| enable_cache | bool | True | Cache repeated query results |
index.search(query, k=10, dataframe=None)
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | array-like | required | 1D vector or 2D matrix of queries |
| k | int | 10 | Number of neighbors to return per query |
| dataframe | DataFrame/None | None | Source DataFrame for output; must match index row count |
Returns: Without dataframe: tuple of (indices, similarities) as numpy arrays (1D for single query, 2D for batch). With dataframe: a DataFrame (single query) or list of DataFrames (batch) with a similarity column.
Properties
| Property | Type | Description |
|---|---|---|
| backend | str | Active backend ("cpu"/"gpu") |
| rows | int | Number of indexed rows |
| dims | int | Number of dimensions per row |
Rust
NNdex::new(matrix, rows, dims, options)
Constructs an index from a flattened row-major &[f32] matrix.
index.search(query, k) -> Result<Vec<Neighbor>>
Returns top-k neighbors sorted by descending cosine similarity.
index.search_batch(queries, query_rows, k) -> Result<Vec<Vec<Neighbor>>>
Batch search over multiple query vectors.
IndexOptions
| Field | Type | Default | Description |
|---|---|---|---|
| normalized | bool | false | Skip normalization for pre-normalized data |
| approx | bool | false | Enable ANN prefiltering |
| backend | BackendPreference | Cpu | Cpu or Gpu |
| enable_cache | bool | true | Cache constructor and query results |
Maintainer/Creator
Max Woolf (@minimaxir)
Max's open-source projects are supported by his Patreon and GitHub Sponsors. If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.
License
MIT
Download files
File details
Details for the file nndex-0.2.1.tar.gz.
File metadata
- Download URL: nndex-0.2.1.tar.gz
- Upload date:
- Size: 100.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9bf8cd99d86fb364d88f51572eb3d5a024dd9a38e8e8ea2575eb3273cf68e111 |
| MD5 | ee57f358132bc8c881e092dbb8130abc |
| BLAKE2b-256 | 5be74e6ea90cfabbfb550ea56c3c1be7612302a0626bd1a64c950a0b4287a069 |
File details
Details for the file nndex-0.2.1-cp310-abi3-win_amd64.whl.
File metadata
- Download URL: nndex-0.2.1-cp310-abi3-win_amd64.whl
- Upload date:
- Size: 4.9 MB
- Tags: CPython 3.10+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ec6f666c81b0106bed3fee0e7e0e3e5d709caf7f4a0168da8c1ac8207cc78b1d |
| MD5 | 963be8a48f8f153c808872185655778f |
| BLAKE2b-256 | 11c4cca8f11434f1222ccf677a4c6fde865a72d5576fce6e385a48efc3f52772 |
File details
Details for the file nndex-0.2.1-cp310-abi3-manylinux_2_35_x86_64.whl.
File metadata
- Download URL: nndex-0.2.1-cp310-abi3-manylinux_2_35_x86_64.whl
- Upload date:
- Size: 5.2 MB
- Tags: CPython 3.10+, manylinux: glibc 2.35+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0ce7137fdcf6c31b574f1b379ac1716d384b477e20dd0e478009e7a26e13ade0 |
| MD5 | 0a0f13631f11051d8d0e4cf7fad75412 |
| BLAKE2b-256 | d8fc89b32f2ef52615eeed9fb93c64d999ee8b5f1fea6ffa7714ace95290882f |
File details
Details for the file nndex-0.2.1-cp310-abi3-manylinux_2_35_aarch64.whl.
File metadata
- Download URL: nndex-0.2.1-cp310-abi3-manylinux_2_35_aarch64.whl
- Upload date:
- Size: 4.9 MB
- Tags: CPython 3.10+, manylinux: glibc 2.35+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 948fe2713d34caf7c9e1b1f9366d49932857244ec2636001ae6a2dbbc07cf603 |
| MD5 | cd15435307ba422016c0725681668dc0 |
| BLAKE2b-256 | 749309c5872e3284ab37c7eb231c1458847a74578b2580dc92bc20ff22b1c257 |
File details
Details for the file nndex-0.2.1-cp310-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: nndex-0.2.1-cp310-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 4.0 MB
- Tags: CPython 3.10+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6bf0054921e3add0165f541e61c93d2ebc263d6abd2484c79d3b6e5b431a4295 |
| MD5 | 4e8078382db09a7c3a623f848e745b9e |
| BLAKE2b-256 | bc37abac4a1c1a416ba3e558679643598ee33b1f50c1e2234142f83fc5d078a8 |
File details
Details for the file nndex-0.2.1-cp310-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: nndex-0.2.1-cp310-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 4.5 MB
- Tags: CPython 3.10+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c627d3b90806a46d5aa29d1b1e0d71e4216afe9f40de520286cb50a49202e4b |
| MD5 | fb1cc327a9c76f03eb0eb216c0a2502c |
| BLAKE2b-256 | 2e4071f65438257fa64a1a94c4bffc8fe42c985ed0015057c16b4bbdea76651b |