HNSW Approximate Nearest Neighbors in Rust, based on LMDB and optimized for memory usage

hannoy 🗼

hannoy is a key-value backed HNSW implementation based on arroy.

Motivation

Many popular HNSW libraries keep the whole index in memory, meaning you need enough RAM to store all the vectors you're indexing. Instead, hannoy uses LMDB, a memory-mapped KV store, as its storage backend. This is better suited to machines running multiple programs, or to cases where the dataset you're indexing won't fit in memory. LMDB also supports non-blocking concurrent reads by design, meaning it's safe to query the index in multi-threaded environments.

Features

  • Supported metrics: euclidean, cosine, manhattan, hamming, as well as quantized counterparts.
  • Python bindings with maturin and pyo3
  • Multithreaded builds using rayon
  • Disk-backed storage to enable indexing datasets that won't fit in RAM using LMDB
  • Compressed bitmaps to store graph edges with minimal overhead, adding ~200 bytes per vector
  • Dynamic document insertions and deletions without full re-indexing
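
To get a feel for what the ~200 bytes of edge overhead means in practice, here is a rough back-of-the-envelope sketch. It assumes unquantized vectors stored as 4-byte floats and ignores LMDB page overhead and any other metadata, so treat the numbers as an order-of-magnitude estimate only. The function name and its parameters are made up for this illustration and are not part of hannoy.

```python
def estimate_index_bytes(
    num_vectors: int,
    dim: int,
    bytes_per_component: int = 4,      # assumption: f32 components, no quantization
    edge_bytes_per_vector: int = 200,  # the "~200 bytes per vector" figure quoted above
) -> tuple[int, int]:
    """Return (vector_bytes, edge_bytes) for a rough size estimate."""
    vector_bytes = num_vectors * dim * bytes_per_component
    edge_bytes = num_vectors * edge_bytes_per_vector
    return vector_bytes, edge_bytes


# Hypothetical workload: one million 768-dimensional embeddings.
vector_bytes, edge_bytes = estimate_index_bytes(num_vectors=1_000_000, dim=768)
print(f"vectors: {vector_bytes / 2**30:.2f} GiB")
print(f"edges:   {edge_bytes / 2**30:.2f} GiB")
print(f"graph overhead relative to vectors: {edge_bytes / vector_bytes:.1%}")
```

Under these assumptions the graph edges are a small fraction of the raw vector data, so the vectors themselves dominate the on-disk size.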

Missing Features

  • GPU-accelerated indexing

Usage

Rust 🦀

use hannoy::{distances::Cosine, Database, Reader, Result, Writer};
use heed::EnvOpenOptions;
use rand::{rngs::StdRng, SeedableRng};

fn main() -> Result<()> {
    const DIM: usize = 3;
    let vecs: Vec<[f32; DIM]> = vec![[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];

    let env = unsafe {
        EnvOpenOptions::new()
            .map_size(1024 * 1024 * 1024 * 1) // 1GiB
            .open("./")
    }
    .unwrap();

    let mut wtxn = env.write_txn().unwrap();
    let db: Database<Cosine> = env.create_database(&mut wtxn, None)?;
    let writer: Writer<Cosine> = Writer::new(db, 0, DIM);

    // insert into lmdb
    writer.add_item(&mut wtxn, 0, &vecs[0])?;
    writer.add_item(&mut wtxn, 1, &vecs[1])?;
    writer.add_item(&mut wtxn, 2, &vecs[2])?;

    // ...and build hnsw
    let mut rng = StdRng::seed_from_u64(42);

    let mut builder = writer.builder(&mut rng);
    builder.ef_construction(100).build::<16,32>(&mut wtxn)?;
    wtxn.commit()?;

    // search hnsw using a new lmdb read transaction
    let rtxn = env.read_txn()?;
    let reader = Reader::<Cosine>::open(&rtxn, 0, db)?;

    let query = vec![0.0, 1.0, 0.0];
    let nns = reader.nns(1).ef_search(10).by_vector(&rtxn, &query)?;

    // dbg! takes expressions, not a format string
    dbg!(&nns);
    Ok(())
}

Python 🐍

import hannoy
from hannoy import Metric
import tempfile

tmp_dir = tempfile.gettempdir()
db = hannoy.Database(tmp_dir, Metric.COSINE)

with db.writer(3, m=4, ef=10) as writer:
    writer.add_item(0, [1.0, 0.0, 0.0])
    writer.add_item(1, [0.0, 1.0, 0.0])
    writer.add_item(2, [0.0, 0.0, 1.0])

reader = db.reader()
nns = reader.by_vec([0.0, 1.0, 0.0], n=2)

(closest, dist) = nns[0]
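
When experimenting with a tiny dataset like the one above, it can help to cross-check the approximate results against an exact brute-force search. The sketch below is plain Python and does not use hannoy at all. It uses the textbook cosine distance (1 minus cosine similarity); the library may scale its distances differently, so it is safer to compare item ids than raw distance values. The helper names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Textbook cosine distance: 1 - cos(a, b). Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_nns(items, query, n):
    """Exact nearest neighbours: score every item, sort by distance, keep n."""
    scored = [(item_id, cosine_distance(vec, query)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1])  # stable sort, ties keep insertion order
    return scored[:n]


# Same three unit vectors and query as in the example above.
items = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}
expected = brute_force_nns(items, [0.0, 1.0, 0.0], n=2)
expected_ids = [item_id for item_id, _ in expected]
print(expected_ids)
```

With only three orthogonal vectors, the exact nearest neighbour of the query is the item that equals it, which is what the index is expected to rank first as well.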

Tips and tricks

Reducing cold start latencies

Search in an HNSW graph always traverses from the top layer down to the bottom layer, so we know a priori that some vectors will be needed. We can use madvise to hint to the kernel that these vectors (and their neighbours) should be loaded into RAM, which speeds up search.

Doing so can reduce cold-start latencies by several milliseconds, and is configured through the HANNOY_READER_PREFETCH_MEMORY environment variable.
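
For readers unfamiliar with madvise, the following sketch shows the general mechanism on an ordinary memory-mapped file using Python's standard mmap module. It is not how hannoy implements prefetching internally: the function name, the temporary file, and the choice to hint a prefix of the file are all assumptions made for illustration. The hint is advisory, and the call is skipped on platforms that do not expose it.

```python
import mmap
import os
import tempfile


def advise_willneed(path: str, nbytes: int) -> int:
    """Map `path` read-only and ask the kernel to page in its first `nbytes` bytes.

    Returns the number of bytes covered by the hint, or 0 if the file is empty
    or the platform does not expose madvise / MADV_WILLNEED.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            if not (hasattr(mm, "madvise") and hasattr(mmap, "MADV_WILLNEED")):
                return 0
            length = min(nbytes, size)
            # Advisory only: the kernel may start reading these pages ahead of use.
            mm.madvise(mmap.MADV_WILLNEED, 0, length)
            return length


# 1 MiB stand-in for a data file; a real index would live in the LMDB data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (1 << 20))
hinted = advise_willneed(tmp.name, 256 * 1024)
os.remove(tmp.name)
print(f"hinted {hinted} bytes")
```

Because the hint only warms the page cache, it changes latency on first access and has no effect on search results.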

For example, to prefetch 10 MiB of vectors into RAM:

export HANNOY_READER_PREFETCH_MEMORY=10485760
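
To choose a budget, it can help to translate bytes into a number of vectors. The sketch below assumes unquantized vectors stored as 4-byte floats with no per-item overhead, and a hypothetical dimension of 768. It also sets the variable from within Python, which only has an effect if the library reads the environment after this point; that timing is an assumption here, so exporting the variable before starting the process is the safer route.

```python
import os


def vectors_that_fit(budget_bytes: int, dim: int, bytes_per_component: int = 4) -> int:
    """How many whole vectors fit in the budget (assumes f32, no per-item overhead)."""
    return budget_bytes // (dim * bytes_per_component)


budget = 10 * 1024 * 1024  # 10 MiB, the same value as the export above
os.environ["HANNOY_READER_PREFETCH_MEMORY"] = str(budget)

fit = vectors_that_fit(budget, dim=768)  # dim=768 is a hypothetical embedding size
print(f"{budget} bytes covers roughly {fit} vectors of dimension 768")
```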
