In-memory nearest neighbor search engine for Python, implemented in Rust.

nndex

PyPI | Crates.io

A high-performance Rust library with Python bindings for nearest-neighbor vector search. It needs zero configuration and is so fast, it's faster than numpy! This crate leverages a computational trick: if the source vectors and query vectors are all unit-normalized, their dot product (an operation faster than vector distance calculations) equals their cosine similarity.
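
The identity behind that trick can be checked in a few lines. This is a dependency-free sketch using only the Python standard library, not nndex's implementation; the helper names `dot`, `normalize`, and `cosine_similarity` are illustrative and do not exist in the package.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Textbook cosine similarity: dot(a, b) / (|a| * |b|)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

full = cosine_similarity(a, b)          # divides by both norms on every call
fast = dot(normalize(a), normalize(b))  # norms paid once, then only a dot product

print(math.isclose(full, fast))  # True
```

Normalizing the source matrix once up front means each later query only pays for dot products.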

Features:

  • CPU backend using rayon parallelism + SIMD (via simsimd), along with highly bespoke compute profiles for maximum CPU performance
  • GPU backend using wgpu compute shaders, supporting Vulkan, Metal, D3D12, and OpenGL graphics APIs
  • Approximate nearest-neighbor (ANN) mode with exact reranking by building an IVF index for even faster lookups
  • Batch search for multiple queries at once
  • Python bindings via PyO3 with numpy, pandas, and polars support
  • Load embeddings directly from .npy, .npz, and .parquet files
  • Internal query-result caching for repeated searches

Disclosure: This library was mostly coded with the assistance of Claude Opus 4.6 and GPT-5.3-Codex as research into the discovery that those models can now successfully hyperoptimize Rust code. However, I personally have reviewed all code to ensure it is accurate, have added numerous tests and benchmarks to ensure it works as both intended and advertised, and have edited documentation and comments to provide greater signal as to how the package operates. I have given this project the same care and attention as I would give a project I have written from scratch.

Installation

Python

pip install nndex
uv pip install nndex

Rust

Add to your Cargo.toml:

[dependencies]
nndex = "0.2.1"

Python Usage

See also the demo notebooks for more interactive examples.

Building an Index

import numpy as np
from nndex import NNdex

rng = np.random.default_rng(42)
matrix = rng.normal(size=(50_000, 128)).astype(np.float32)

# Build an index (uses the CPU backend unless backend="gpu" is passed)
index = NNdex(matrix)
print(index.backend)  # "cpu"
print(index.rows, index.dims)  # 50000 128

Single Query

search() returns a tuple of (indices, similarities) as numpy arrays, ordered from most similar to least similar.

dims = index.dims  # 128, reused by the examples below
query = rng.normal(size=(dims,)).astype(np.float32)

indices, scores = index.search(query, k=5)
print(indices)  # [43576 14100 15993 35409 38916]
print(scores)   # [0.360 0.353 0.335 0.332 0.323]
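
As a mental model for what an exact search returns, here is a brute-force reference written with only the standard library. It is a sketch for checking results on small inputs, not how nndex computes them; `brute_force_top_k` is a hypothetical helper, and its tie-breaking (higher row index first) is an artifact of this sketch.

```python
import heapq
import math

def brute_force_top_k(rows, query, k):
    """Return (indices, similarities) of the k rows most similar to query."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    q = unit(query)
    # Cosine similarity of every row against the query, paired with the row index.
    scored = [
        (sum(x * y for x, y in zip(unit(row), q)), i)
        for i, row in enumerate(rows)
    ]
    # nlargest returns the best k pairs, already ordered from most to least similar.
    top = heapq.nlargest(k, scored)
    return [i for _, i in top], [s for s, _ in top]

rows = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
indices, similarities = brute_force_top_k(rows, [0.8, 0.6, 0.0], k=2)
print(indices)  # [2, 0]
```

Row 2 wins because it points in nearly the same direction as the query; row 0 comes second with a similarity of about 0.8.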

Batch Query

Pass a 2D array to search multiple queries at once, which is more efficient than querying one at a time. Returns 2D numpy arrays.

queries = rng.normal(size=(4, dims)).astype(np.float32)

batch_indices, batch_scores = index.search(queries, k=5)
print(batch_indices.shape)  # (4, 5)
print(batch_scores.shape)   # (4, 5)

Approximate Nearest Neighbors

Set approx=True for sub-millisecond queries on large matrices. This mode uses a dimensionality-reduced prefilter followed by exact reranking.

index_ann = NNdex(matrix, approx=True)

indices, scores = index_ann.search(query, k=5)
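
The general shape of "cheap prefilter, then exact rerank" can be sketched as follows. This is a generic illustration and not nndex's algorithm: keeping only the first `short_dims` coordinates stands in for whatever dimensionality reduction the library actually uses, and `prefilter_then_rerank`, `short_dims`, and `n_candidates` are hypothetical names. Rows and query are assumed to be unit-normalized already.

```python
import heapq

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prefilter_then_rerank(rows, query, k, short_dims, n_candidates):
    """Two-stage search: rough scores pick candidates, exact scores rank them."""
    # Stage 1: cheap approximate scores using only the leading coordinates.
    rough = [
        (dot(row[:short_dims], query[:short_dims]), i)
        for i, row in enumerate(rows)
    ]
    candidates = [i for _, i in heapq.nlargest(n_candidates, rough)]
    # Stage 2: exact full-width scores, computed only for the survivors.
    exact = [(dot(rows[i], query), i) for i in candidates]
    top = heapq.nlargest(k, exact)
    return [i for _, i in top], [s for s, _ in top]

rows = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.6, 0.0, 0.8, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
query = [0.6, 0.0, 0.8, 0.0]

approx_indices, approx_scores = prefilter_then_rerank(
    rows, query, k=2, short_dims=2, n_candidates=3
)
print(approx_indices)  # [2, 0]
```

In this sketch the candidate budget controls the speed/recall tradeoff: with `n_candidates=1` the prefilter keeps only row 0, so the true best match (row 2) never reaches the reranking stage.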

Backend Selection

# CPU (default)
cpu_index = NNdex(matrix, backend="cpu")

# Force GPU
gpu_index = NNdex(matrix, backend="gpu")

Pre-Normalized Data

If your embeddings are already unit-normalized, set normalized=True to skip the internal normalization step for more computational speed:

normalized_matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
index = NNdex(normalized_matrix, normalized=True)
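
Because normalized=True skips the normalization step, it can be worth confirming that the data really is unit-norm first. The check below is a standard-library sketch; `is_unit_normalized` and its `tol` default are assumptions for illustration, not part of nndex.

```python
import math

def is_unit_normalized(rows, tol=1e-5):
    """True if every row has a Euclidean norm within tol of 1.0."""
    return all(
        abs(math.sqrt(sum(x * x for x in row)) - 1.0) <= tol
        for row in rows
    )

raw = [[3.0, 4.0], [1.0, 0.0]]   # first row has norm 5
unit = [[0.6, 0.8], [1.0, 0.0]]  # both rows have norm 1

print(is_unit_normalized(raw))   # False
print(is_unit_normalized(unit))  # True
```

If the check fails, leaving normalized at its default of False is the safer choice.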

DataFrame Output

A convenience feature: pass a pandas or polars DataFrame to dataframe= to get results as DataFrames with a similarity column appended. The DataFrame must have the same number of rows as the index.

import pandas as pd

df = pd.DataFrame(matrix, columns=[f"d{i}" for i in range(dims)])
index = NNdex(matrix, backend="cpu")

# Single query: returns a DataFrame
result = index.search(query, k=3, dataframe=df)

# Batch query: returns a list of DataFrames
results = index.search(queries, k=3, dataframe=df)

This also works with polars DataFrames:

import polars as pl

pldf = pl.DataFrame({f"d{i}": matrix[:, i] for i in range(dims)})
result = index.search(query, k=3, dataframe=pldf)

Loading from Disk

NNdex.from_file() loads embeddings directly from .npy, .npz, or .parquet files in Rust, avoiding Python-side deserialization overhead. .npz and .parquet require a key argument.

# .npy (single 2D array)
index = NNdex.from_file("embeddings.npy")

# .npz (keyed archive)
index = NNdex.from_file("embeddings.npz", key="matrix_store")

# .parquet (list/fixed-size-list column of f32)
index = NNdex.from_file("embeddings.parquet", key="embedding")
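
For reference, files in the first two formats can be produced with numpy itself (a third-party dependency that must be installed). This sketch only writes and re-reads the files and does not call nndex; the temporary directory is an arbitrary choice, and the `matrix_store` key simply mirrors the example above.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(42)
matrix = rng.normal(size=(1_000, 128)).astype(np.float32)

out_dir = tempfile.mkdtemp()
npy_path = os.path.join(out_dir, "embeddings.npy")
npz_path = os.path.join(out_dir, "embeddings.npz")

# .npy holds exactly one array, so no key is involved.
np.save(npy_path, matrix)

# .npz is a keyed archive: the keyword name becomes the key to pass later.
np.savez(npz_path, matrix_store=matrix)

with np.load(npz_path) as archive:
    print(archive.files)  # ['matrix_store']
```

Listing `archive.files` is a quick way to find the key to pass when the archive was written by someone else.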

Rust Usage

use nndex::{NNdex, IndexOptions, BackendPreference, Neighbor};

fn main() -> Result<(), nndex::NNdexError> {
    let matrix = vec![
        1.0_f32, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    ];

    let index = NNdex::new(&matrix, 3, 8, IndexOptions {
        normalized: false,
        approx: false,
        backend: BackendPreference::Cpu,
        ..IndexOptions::default()
    })?;

    let query = vec![0.8_f32, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0];
    let neighbors = index.search(&query, 2)?;

    for neighbor in &neighbors {
        println!("index: {}, similarity: {:.4}", neighbor.index, neighbor.similarity);
    }

    Ok(())
}

Benchmarks

[BENCHMARK IMAGES TO BE ADDED]

Notes

  • nndex is NOT a vector store/database, which would imply that vectors can be created, updated, or deleted in the matrix, and it is not intended to be one. It's intended to be used with a fixed matrix of data, although this crate is so fast that you could reinitialize the NNdex without much overhead if needed.
  • For Apple Silicon in particular, the GPU backend (Metal) is not recommended below 100k rows because the dispatch overhead of wgpu is greater than the time spent on the search itself. This is not the case with discrete GPUs.
  • BLAS is only supported for macOS because the underlying BLAS library (accelerate) is included by default. There are tradeoffs for Linux/Windows and I am still determining what to do there.
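
The Apple Silicon rule of thumb above could be encoded in a small helper like the one below. This is a sketch under stated assumptions: `suggest_backend` and `is_apple_silicon` are hypothetical names, the 100,000-row threshold is taken from the note, and the platform check assumes a native (non-Rosetta) Python that reports `arm64`.

```python
import platform

APPLE_SILICON_GPU_MIN_ROWS = 100_000  # threshold taken from the note above

def is_apple_silicon():
    """Best-effort detection; a Python running under Rosetta reports x86_64."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"

def suggest_backend(rows, want_gpu, apple_silicon=None):
    """Return "cpu" or "gpu" following the rule of thumb in the notes."""
    if apple_silicon is None:
        apple_silicon = is_apple_silicon()
    if not want_gpu:
        return "cpu"
    if apple_silicon and rows < APPLE_SILICON_GPU_MIN_ROWS:
        return "cpu"  # dispatch overhead likely dominates at this size
    return "gpu"

print(suggest_backend(50_000, want_gpu=True, apple_silicon=True))   # cpu
print(suggest_backend(500_000, want_gpu=True, apple_silicon=True))  # gpu
print(suggest_backend(50_000, want_gpu=True, apple_silicon=False))  # gpu
```

The returned string can then be passed as the backend argument when constructing the index.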

API Reference

Python

NNdex(data, ...)

Parameter | Type | Default | Description
data | array-like | required | 2D numpy array, list, or pandas/polars DataFrame
normalized | bool | False | Skip internal normalization if data is already unit-norm
approx | bool | False | Enable ANN prefiltering with exact reranking
backend | str | "cpu" | "cpu" or "gpu"
enable_cache | bool | True | Cache repeated query results

NNdex.from_file(path, ...)

Parameter | Type | Default | Description
path | str | required | Path to .npy, .npz, or .parquet file
key | str/None | None | Array key (.npz) or column name (.parquet)
normalized | bool | False | Skip internal normalization if data is already unit-norm
approx | bool | False | Enable ANN prefiltering with exact reranking
backend | str | "cpu" | "cpu" or "gpu"
enable_cache | bool | True | Cache repeated query results

index.search(query, k=10, dataframe=None)

Parameter | Type | Default | Description
query | array-like | required | 1D vector or 2D matrix of queries
k | int | 10 | Number of neighbors to return per query
dataframe | DataFrame/None | None | Source DataFrame for output; must match index row count

Returns: Without dataframe: tuple of (indices, similarities) as numpy arrays (1D for single query, 2D for batch). With dataframe: a DataFrame (single query) or list of DataFrames (batch) with a similarity column.

Properties

Property | Type | Description
backend | str | Active backend ("cpu"/"gpu")
rows | int | Number of indexed rows
dims | int | Number of dimensions per row

Rust

NNdex::new(matrix, rows, dims, options)

Constructs an index from a flattened row-major &[f32] matrix.
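
"Flattened row-major" means the rows are laid end to end in one buffer, so element (r, c) sits at offset r * dims + c. The Python sketch below illustrates that layout with the 3 x 8 matrix from the Rust usage example; `flatten_row_major` and `element` are hypothetical helpers, not part of either API.

```python
def flatten_row_major(rows):
    """Concatenate rows into one flat list; returns (flat, n_rows, n_dims)."""
    n_rows = len(rows)
    n_dims = len(rows[0]) if rows else 0
    if any(len(row) != n_dims for row in rows):
        raise ValueError("all rows must have the same length")
    flat = [value for row in rows for value in row]
    return flat, n_rows, n_dims

def element(flat, n_dims, r, c):
    """Look up row r, column c in a flattened row-major buffer."""
    return flat[r * n_dims + c]

rows = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
flat, n_rows, n_dims = flatten_row_major(rows)

print(len(flat), n_rows, n_dims)    # 24 3 8
print(element(flat, n_dims, 2, 1))  # 1.0
```

The buffer length must equal rows * dims, which matches the `3, 8` arguments passed alongside the 24-element vector in the Rust example.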

index.search(query, k) -> Result<Vec<Neighbor>>

Returns top-k neighbors sorted by descending cosine similarity.

index.search_batch(queries, query_rows, k) -> Result<Vec<Vec<Neighbor>>>

Batch search over multiple query vectors.

IndexOptions

Field | Type | Default | Description
normalized | bool | false | Skip normalization for pre-normalized data
approx | bool | false | Enable ANN prefiltering
backend | BackendPreference | Cpu | Cpu or Gpu
enable_cache | bool | true | Cache constructor and query results

Maintainer/Creator

Max Woolf (@minimaxir)

Max's open-source projects are supported by his Patreon and GitHub Sponsors. If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.

License

MIT
