In-memory nearest neighbor search engine for Python, implemented in Rust.

Project description

nndex

PyPI | Crates.io

A high-performance Rust library with Python bindings for nearest-neighbor vector search. It needs no configuration, and it is fast enough to beat numpy! This crate leverages a computational trick: if the source vectors and query vectors are all unit-normalized, their dot product (an operation faster than vector distance calculations) equals their cosine similarity.
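
The identity behind that trick can be checked in a few lines. The following is a minimal standard-library sketch, not code from nndex; the helper names (`dot`, `normalize`, `cosine_similarity`) are made up for illustration.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Textbook cosine similarity: dot product divided by both norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

full = cosine_similarity(a, b)
# After normalizing once, a single dot product gives the same value.
shortcut = dot(normalize(a), normalize(b))
print(math.isclose(full, shortcut))  # True
```

The source vectors only need to be normalized once at build time, so each query costs one normalization plus dot products.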

Features:

  • CPU backend using rayon parallelism + SIMD (via simsimd), along with highly bespoke compute profiles for maximum CPU performance
  • GPU backend using wgpu compute shaders, supporting Vulkan, Metal, D3D12, and OpenGL graphics APIs
  • Approximate nearest-neighbor (ANN) mode with exact reranking by building an IVF index for even faster lookups
  • Batch search for multiple queries at once
  • Python bindings via PyO3 with numpy, pandas, and polars support
  • Load embeddings directly from .npy, .npz, and .parquet files
  • Internal query-result caching for repeated searches

Disclosure: This library was mostly coded with the assistance of Claude Opus 4.6 and GPT-5.3-Codex as research into the discovery that those models can now successfully hyperoptimize Rust code. However, I personally have reviewed all code to ensure it is accurate, have added numerous tests and benchmarks to ensure it works as both intended and advertised, and have edited documentation and comments to provide greater signal as to how the package operates. I have given this project the same care and attention as I would give a project I have written from scratch.

Installation

Python

pip install nndex
uv pip install nndex

Rust

Add to your Cargo.toml:

[dependencies]
nndex = "0.2.0"

Python Usage

See also the demo notebooks for more interactive examples.

Building an Index

import numpy as np
from nndex import NNdex

rng = np.random.default_rng(42)
dims = 128  # embedding dimensionality, reused by the examples below
matrix = rng.normal(size=(50_000, dims)).astype(np.float32)

# Build an index (uses the CPU backend unless backend="gpu" is passed)
index = NNdex(matrix)
print(index.backend)  # "gpu" or "cpu"
print(index.rows, index.dims)  # 50000 128

Single Query

search() returns a tuple of (indices, similarities) as numpy arrays, ordered from most similar to least similar.

query = rng.normal(size=(dims,)).astype(np.float32)

indices, scores = index.search(query, k=5)
print(indices)  # [43576 14100 15993 35409 38916]
print(scores)   # [0.360 0.353 0.335 0.332 0.323]
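
For sanity-checking results, it can help to compare against a plain numpy brute-force search with the same return shape. This is a reference sketch only, not how nndex computes results; `brute_force_search` is a hypothetical helper.

```python
import numpy as np

def brute_force_search(matrix, query, k):
    """Exact search: cosine similarity of the query against every row."""
    unit_rows = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    unit_query = query / np.linalg.norm(query)
    scores = unit_rows @ unit_query        # one dot product per row
    top = np.argsort(-scores)[:k]          # most similar first
    return top, scores[top]

rng = np.random.default_rng(0)
matrix = rng.normal(size=(1_000, 16)).astype(np.float32)
query = matrix[7] * 2.5                    # a scaled copy of row 7

indices, scores = brute_force_search(matrix, query, k=5)
print(indices[0])                          # 7
```

Because cosine similarity ignores vector length, the scaled copy of row 7 still matches row 7 with a similarity of about 1.0.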

Batch Query

Pass a 2D array to search multiple queries at once, which is more efficient than querying one at a time. Returns 2D numpy arrays.

queries = rng.normal(size=(4, dims)).astype(np.float32)

batch_indices, batch_scores = index.search(queries, k=5)
print(batch_indices.shape)  # (4, 5)
print(batch_scores.shape)   # (4, 5)
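
One reason batching tends to be cheaper: with unit-normalized data, every query-to-row similarity comes out of a single matrix product. The numpy sketch below illustrates that idea under this assumption. It is not the library's internal code, and `normalize_rows` and `batch_search` are hypothetical names.

```python
import numpy as np

def normalize_rows(a):
    """Scale every row to unit length."""
    return a / np.linalg.norm(a, axis=1, keepdims=True)

def batch_search(unit_rows, unit_queries, k):
    """Top-k per query from one (n_queries, n_rows) similarity matrix."""
    scores = unit_queries @ unit_rows.T
    top = np.argsort(-scores, axis=1)[:, :k]
    return top, np.take_along_axis(scores, top, axis=1)

rng = np.random.default_rng(0)
unit_rows = normalize_rows(rng.normal(size=(1_000, 16)).astype(np.float32))
unit_queries = normalize_rows(rng.normal(size=(4, 16)).astype(np.float32))

batch_indices, batch_scores = batch_search(unit_rows, unit_queries, k=5)
print(batch_indices.shape, batch_scores.shape)  # (4, 5) (4, 5)
```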

Approximate Nearest Neighbors

Enable approx=True for sub-millisecond queries on large matrices (more than 10,000 rows). On smaller matrices, the setting may be ignored because the ANN overhead would slow queries down rather than speed them up. ANN mode uses a dimensionality-reduced prefilter followed by exact reranking.

index_ann = NNdex(matrix, approx=True)

indices, scores = index_ann.search(query, k=5)
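
To make "prefilter followed by exact reranking" concrete, here is a toy numpy sketch. It assumes a random projection as the dimensionality reduction and a fixed shortlist size. Both are illustrative assumptions, not nndex's actual algorithm or parameters, and `prefilter_then_rerank` is a hypothetical name.

```python
import numpy as np

def prefilter_then_rerank(unit_rows, low_rows, projection, unit_query, k, n_candidates):
    """Score all rows cheaply in a reduced space, then rerank a shortlist exactly."""
    low_query = unit_query @ projection
    approx_scores = low_rows @ low_query
    # Shortlist the n_candidates rows with the highest approximate score.
    shortlist = np.argpartition(-approx_scores, n_candidates - 1)[:n_candidates]
    # Exact cosine similarity, computed only for the shortlist.
    exact_scores = unit_rows[shortlist] @ unit_query
    order = np.argsort(-exact_scores)[:k]
    return shortlist[order], exact_scores[order]

rng = np.random.default_rng(0)
rows, dims, low_dims = 5_000, 128, 16
matrix = rng.normal(size=(rows, dims)).astype(np.float32)
unit_rows = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)

# Assumed reduction: a random projection, computed once at build time.
projection = (rng.normal(size=(dims, low_dims)) / np.sqrt(low_dims)).astype(np.float32)
low_rows = unit_rows @ projection

query = rng.normal(size=(dims,)).astype(np.float32)
unit_query = query / np.linalg.norm(query)

ann_indices, ann_scores = prefilter_then_rerank(
    unit_rows, low_rows, projection, unit_query, k=5, n_candidates=500
)
```

The returned similarities are exact for the rows that come back. The approximation lies in which rows make the shortlist, so a true neighbor can be missed if its reduced-space score is too low.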

Backend Selection

# CPU (default)
cpu_index = NNdex(matrix, backend="cpu")

# Force GPU
gpu_index = NNdex(matrix, backend="gpu")

Pre-Normalized Data

If your embeddings are already unit-normalized, set normalized=True to skip the internal normalization step and save computation:

normalized_matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
index = NNdex(normalized_matrix, normalized=True)
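
Before passing normalized=True, it may be worth verifying that the data really is unit-norm, and guarding against all-zero rows, which would turn into NaN under a plain division. A small numpy sketch follows; the helper names and the tolerance are assumptions, not part of nndex.

```python
import numpy as np

def is_unit_normalized(matrix, atol=1e-4):
    """True if every row has an L2 norm of 1 within the given tolerance."""
    norms = np.linalg.norm(matrix, axis=1)
    return bool(np.allclose(norms, 1.0, atol=atol))

def safe_normalize(matrix):
    """Row-normalize, leaving all-zero rows as zeros instead of NaN."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.where(norms == 0.0, 1.0, norms)

rng = np.random.default_rng(0)
raw = rng.normal(size=(1_000, 32)).astype(np.float32)

print(is_unit_normalized(raw))                  # False
print(is_unit_normalized(safe_normalize(raw)))  # True
```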

DataFrame Output

A convenience option: pass a pandas or polars DataFrame as dataframe= to get results as DataFrames with a similarity column appended. The DataFrame must have the same number of rows as the index.

import pandas as pd

df = pd.DataFrame(matrix, columns=[f"d{i}" for i in range(dims)])
index = NNdex(matrix, backend="cpu")

# Single query: returns a DataFrame
result = index.search(query, k=3, dataframe=df)

# Batch query: returns a list of DataFrames
results = index.search(queries, k=3, dataframe=df)

This also works with polars DataFrames:

import polars as pl

pldf = pl.DataFrame({f"d{i}": matrix[:, i] for i in range(dims)})
result = index.search(query, k=3, dataframe=pldf)

Loading from Disk

NNdex.from_file() loads embeddings directly from .npy, .npz, or .parquet files in Rust, avoiding Python-side deserialization overhead. .npz and .parquet require a key argument.

# .npy (single 2D array)
index = NNdex.from_file("embeddings.npy")

# .npz (keyed archive)
index = NNdex.from_file("embeddings.npz", key="matrix_store")

# .parquet (list/fixed-size-list column of f32)
index = NNdex.from_file("embeddings.parquet", key="embedding")
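
For completeness, this is one way to produce the .npy and .npz inputs with numpy. The array contents and the temporary directory are placeholders. The key `matrix_store` matches the example above only because it is used as the keyword argument to `np.savez`.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 8)).astype(np.float32)

out_dir = tempfile.mkdtemp()
npy_path = os.path.join(out_dir, "embeddings.npy")
npz_path = os.path.join(out_dir, "embeddings.npz")

# .npy holds a single array.
np.save(npy_path, embeddings)

# .npz holds named arrays; the keyword name becomes the key.
np.savez(npz_path, matrix_store=embeddings)

with np.load(npz_path) as archive:
    print(archive.files)  # ['matrix_store']
```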

Rust Usage

use nndex::{NNdex, IndexOptions, BackendPreference};

fn main() -> Result<(), nndex::NNdexError> {
    let matrix = vec![
        1.0_f32, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    ];

    let index = NNdex::new(&matrix, 3, 8, IndexOptions {
        normalized: false,
        approx: false,
        backend: BackendPreference::Cpu,
        ..IndexOptions::default()
    })?;

    let query = vec![0.8_f32, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0];
    let neighbors = index.search(&query, 2)?;

    for neighbor in &neighbors {
        println!("index: {}, similarity: {:.4}", neighbor.index, neighbor.similarity);
    }

    Ok(())
}
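
As a cross-check on the Rust example, the same similarities can be worked out independently. The Python sketch below recomputes them from the flattened row-major layout. The values come from the cosine formula, not from captured nndex output.

```python
import math

ROWS, DIMS = 3, 8
flat = [
    1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
]
query = [0.8, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

def cosine(a, b):
    """Cosine similarity of two equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Row i of a row-major matrix occupies flat[i * DIMS : (i + 1) * DIMS].
rows = [flat[i * DIMS:(i + 1) * DIMS] for i in range(ROWS)]
ranked = sorted(((cosine(row, query), i) for i, row in enumerate(rows)), reverse=True)
top2 = [(i, round(score, 4)) for score, i in ranked[:2]]
print(top2)  # [(2, 0.9899), (0, 0.8)]
```

Row 2 wins because [1, 1, 0, ...] points closest to the query direction (1.4 / sqrt(2) ≈ 0.9899), followed by row 0 at 0.8.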

Benchmarks

[BENCHMARK IMAGES TO BE ADDED]

Notes

  • nndex is NOT a vector store/database, which would imply that vectors can be created, updated, or deleted in the matrix, and it is not intended to be one. It is meant to be used with a fixed matrix of data, although this crate is fast enough that you could reinitialize the NNdex without much overhead if needed.
  • For Apple Silicon in particular, the GPU backend (Metal) is not recommended because the dispatch overhead of wgpu outweighs the time the search itself takes. This is not the case with discrete GPUs.
  • BLAS is only supported for macOS because the underlying BLAS library (accelerate) is included by default. There are tradeoffs for Linux/Windows and I am still determining what to do there.

API Reference

Python

NNdex(data, ...)

| Parameter | Type | Default | Description |
|---|---|---|---|
| data | array-like | required | 2D numpy array, list, or pandas/polars DataFrame |
| normalized | bool | False | Skip internal normalization if data is already unit-norm |
| approx | bool | False | Enable ANN prefiltering with exact reranking |
| backend | str | "cpu" | "cpu" or "gpu" |
| enable_cache | bool | True | Cache repeated query results |

NNdex.from_file(path, ...)

| Parameter | Type | Default | Description |
|---|---|---|---|
| path | str | required | Path to .npy, .npz, or .parquet file |
| key | str/None | None | Array key (.npz) or column name (.parquet) |
| normalized | bool | False | Skip internal normalization if data is already unit-norm |
| approx | bool | False | Enable ANN prefiltering with exact reranking |
| backend | str | "cpu" | "cpu" or "gpu" |
| enable_cache | bool | True | Cache repeated query results |

index.search(query, k=10, dataframe=None)

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | array-like | required | 1D vector or 2D matrix of queries |
| k | int | 10 | Number of neighbors to return per query |
| dataframe | DataFrame/None | None | Source DataFrame for output; must match index row count |

Returns: Without dataframe: tuple of (indices, similarities) as numpy arrays (1D for single query, 2D for batch). With dataframe: a DataFrame (single query) or list of DataFrames (batch) with a similarity column.

Properties

| Property | Type | Description |
|---|---|---|
| backend | str | Active backend ("cpu"/"gpu") |
| rows | int | Number of indexed rows |
| dims | int | Number of dimensions per row |

Rust

NNdex::new(matrix, rows, dims, options)

Constructs an index from a flattened row-major &[f32] matrix.

index.search(query, k) -> Result<Vec<Neighbor>>

Returns top-k neighbors sorted by descending cosine similarity.

index.search_batch(queries, query_rows, k) -> Result<Vec<Vec<Neighbor>>>

Batch search over multiple query vectors.

IndexOptions

| Field | Type | Default | Description |
|---|---|---|---|
| normalized | bool | false | Skip normalization for pre-normalized data |
| approx | bool | false | Enable ANN prefiltering |
| backend | BackendPreference | Cpu | Cpu or Gpu |
| enable_cache | bool | true | Cache constructor and query results |

Maintainer/Creator

Max Woolf (@minimaxir)

Max's open-source projects are supported by his Patreon and GitHub Sponsors. If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.

License

MIT

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nndex-0.2.0.tar.gz (99.9 kB)

Uploaded: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

nndex-0.2.0-py3-none-win_amd64.whl (64.9 kB)

Uploaded: Python 3, Windows x86-64

nndex-0.2.0-py3-none-manylinux_2_34_x86_64.whl (161.4 kB)

Uploaded: Python 3, manylinux: glibc 2.34+ x86-64

nndex-0.2.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (9.1 kB)

Uploaded: Python 3, manylinux: glibc 2.17+ ARM64

nndex-0.2.0-py3-none-macosx_11_0_arm64.whl (7.5 kB)

Uploaded: Python 3, macOS 11.0+ ARM64

nndex-0.2.0-py3-none-macosx_10_12_x86_64.whl (7.2 kB)

Uploaded: Python 3, macOS 10.12+ x86-64

File details

Details for the file nndex-0.2.0.tar.gz.

File metadata

  • Download URL: nndex-0.2.0.tar.gz
  • Upload date:
  • Size: 99.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nndex-0.2.0.tar.gz
Algorithm Hash digest
SHA256 2f6204eab1514b1e53cfa11bf6a7895d83577ce15510b1ef06ecdb1e7efd1c15
MD5 d31b0d41d3a5a56742cedde7bd5e477b
BLAKE2b-256 1685976fb13f7fd68e9afd24a961cf847b4ba3ea17d6528ae189ba477b655721

See more details on using hashes here.

File details

Details for the file nndex-0.2.0-py3-none-win_amd64.whl.

File metadata

  • Download URL: nndex-0.2.0-py3-none-win_amd64.whl
  • Upload date:
  • Size: 64.9 kB
  • Tags: Python 3, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nndex-0.2.0-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 acd1d55497801e0aae2877a4c995eea8ec66c2386ceb2be30727c6640bc06321
MD5 921b92238693f377b7094b0a0bbebc80
BLAKE2b-256 d26c262d3485e8c1bec0e5edb139ad636e69f4cf7073631a2e50d99b289230e6

File details

Details for the file nndex-0.2.0-py3-none-manylinux_2_34_x86_64.whl.

File hashes

Hashes for nndex-0.2.0-py3-none-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 dba1762e3221b999a7e8c8cbb8f0512e5caa4c3837ffc435df1a9cd86b516601
MD5 f29c6c4eb8f925704363070a2fcf15d2
BLAKE2b-256 25376156655dfce84d9b61b654861e6e98f07a4bc58cc10d2b854c5c25c39b47

File details

Details for the file nndex-0.2.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for nndex-0.2.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 447a1e72dd91a8893108446cd2908c506085065e60336d3cc3fe914ff76c578d
MD5 bacd8125b3c97394c8f678efe1cf57b0
BLAKE2b-256 655abaa430e733194a6817893bc7d58849418595939c61bcfd08abb881bd77a5

File details

Details for the file nndex-0.2.0-py3-none-macosx_11_0_arm64.whl.

File metadata

  • Download URL: nndex-0.2.0-py3-none-macosx_11_0_arm64.whl
  • Upload date:
  • Size: 7.5 kB
  • Tags: Python 3, macOS 11.0+ ARM64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nndex-0.2.0-py3-none-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 e6c942e0007f6fb5ae04d38014b84eca81e8147bd63edaf6b5055a70c98c9899
MD5 57b0041887df423377957b258ff0181d
BLAKE2b-256 4c088e13e69804dab9b1424dd204ec1332fbf9ab8facd01ff301c4e84ad76d47

File details

Details for the file nndex-0.2.0-py3-none-macosx_10_12_x86_64.whl.

File hashes

Hashes for nndex-0.2.0-py3-none-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 ac484203ef84ac68c7fde0d700f05aef3e73aa7119606db6f4a390b0c03f17c1
MD5 325a0eb68b79043a92466d06a57ef9c2
BLAKE2b-256 c6dd3123b5f68bb01b26d1655f65ca03008cef6affdc5feb83d45d9c126abad7
