HNSW Approximate Nearest Neighbors in Rust, based on LMDB and optimized for memory usage

hannoy 🗼

hannoy is a key-value backed HNSW implementation based on arroy.

Motivation

Many popular HNSW libraries are built in memory, meaning you need enough RAM to store all the vectors you're indexing. Instead, hannoy uses LMDB, a memory-mapped KV store, as a storage backend. This is better suited to machines running multiple programs, or to cases where the dataset you're indexing won't fit in memory. LMDB also supports non-blocking concurrent reads by design, meaning it's safe to query the index in multi-threaded environments.

Features

  • Supported metrics: euclidean, cosine, manhattan, hamming, as well as quantized counterparts.
  • Python bindings with maturin and pyo3
  • Multithreaded builds using rayon
  • Disk-backed storage using LMDB, to enable indexing datasets that won't fit in RAM
  • Compressed bitmaps to store graph edges with minimal overhead, adding ~200 bytes per vector
  • Dynamic document insertions and deletions without full re-indexing
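
For readers unfamiliar with the metrics listed above, here is a minimal pure-Python sketch of their textbook definitions. It is an illustration only, not hannoy's implementation: the function names are made up for this example, and hannoy may scale or normalize its distances differently (for instance, some libraries return squared euclidean distance).

```python
import math

# Textbook definitions, for illustration only (not hannoy's code).

def euclidean(a, b):
    # Square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # One minus the cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def hamming(a, b):
    # Number of positions at which the two vectors differ.
    return sum(1 for x, y in zip(a, b) if x != y)

a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
print(euclidean(a, b), manhattan(a, b), cosine_distance(a, b), hamming(a, b))
```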

Missing Features

  • GPU-accelerated indexing

Usage

Rust 🦀

use hannoy::{distances::Cosine, Database, Reader, Result, Writer};
use heed::EnvOpenOptions;
use rand::{rngs::StdRng, SeedableRng};

fn main() -> Result<()> {
    const DIM: usize = 3;
    let vecs: Vec<[f32; DIM]> = vec![[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];

    let env = unsafe {
        EnvOpenOptions::new()
            .map_size(1024 * 1024 * 1024 * 1) // 1GiB
            .open("./")
    }
    .unwrap();

    let mut wtxn = env.write_txn().unwrap();
    let db: Database<Cosine> = env.create_database(&mut wtxn, None)?;
    let writer: Writer<Cosine> = Writer::new(db, 0, DIM);

    // insert into lmdb
    writer.add_item(&mut wtxn, 0, &vecs[0])?;
    writer.add_item(&mut wtxn, 1, &vecs[1])?;
    writer.add_item(&mut wtxn, 2, &vecs[2])?;

    // ...and build hnsw
    let mut rng = StdRng::seed_from_u64(42);

    let mut builder = writer.builder(&mut rng);
    builder.ef_construction(100).build::<16,32>(&mut wtxn)?;
    wtxn.commit()?;

    // search hnsw using a new lmdb read transaction
    let rtxn = env.read_txn()?;
    let reader = Reader::<Cosine>::open(&rtxn, 0, db)?;

    let query = vec![0.0, 1.0, 0.0];
    let nns = reader.nns(1).ef_search(10).by_vector(&rtxn, &query)?;

    // dbg! takes expressions, not a format string
    dbg!(&nns);
    Ok(())
}

Python 🐍

import hannoy
from hannoy import Metric
import tempfile

tmp_dir = tempfile.gettempdir()
db = hannoy.Database(tmp_dir, Metric.COSINE)

with db.writer(3, m=4, ef=10) as writer:
    writer.add_item(0, [1.0, 0.0, 0.0])
    writer.add_item(1, [0.0, 1.0, 0.0])
    writer.add_item(2, [0.0, 0.0, 1.0])

reader = db.reader()
nns = reader.by_vec([0.0, 1.0, 0.0], n=2)

(closest, dist) = nns[0]
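
On a dataset this small, the approximate result can be sanity-checked against an exhaustive scan. The sketch below does not use hannoy; it assumes results are `(id, distance)` pairs as in the unpacking above, and uses the textbook cosine distance (one minus cosine similarity), which may differ from hannoy's scaling. `brute_force_nns` is a hypothetical helper written for this example.

```python
import math

def cosine_distance(a, b):
    # Textbook definition; hannoy's exact scaling is not assumed here.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def brute_force_nns(items, query, n):
    # Score every item against the query and keep the n closest.
    scored = [(item_id, cosine_distance(vec, query)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:n]

# Same three vectors and query as the example above.
items = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}
query = [0.0, 1.0, 0.0]

expected = brute_force_nns(items, query, n=2)
expected_closest, expected_dist = expected[0]
print(expected_closest, expected_dist)
```

The query is identical to item 1, so an exact search ranks it first; the other two items are equidistant from the query, so either may come second.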

Tips and tricks

Reducing cold start latencies

Search in an HNSW graph always traverses from the top layer down to the bottom layer, so we know a priori that some vectors will be needed. Using madvise, we can hint to the kernel that these vectors (and their neighbours) should be loaded into RAM, which speeds up search.

Doing so can reduce cold-start latencies by several milliseconds, and is configured through the HANNOY_READER_PREFETCH_MEMORY environment variable.
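
To make the idea concrete, here is a small sketch of what an madvise hint looks like, using Python's standard `mmap` module on a scratch file. It is not hannoy's code: the `prefetch` helper is hypothetical, and hannoy issues its hints internally over the LMDB memory map. `mmap.madvise` needs Python 3.8+ and a platform that provides madvise, so the sketch falls back to doing nothing elsewhere.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

def prefetch(mapped, nbytes):
    """Hint that the first nbytes of the mapping will be needed soon.

    Returns True if a hint was issued, False if the platform lacks madvise.
    """
    if not hasattr(mapped, "madvise") or not hasattr(mmap, "MADV_WILLNEED"):
        return False
    length = min(nbytes, len(mapped))
    # MADV_WILLNEED asks the kernel to start reading these pages in now.
    mapped.madvise(mmap.MADV_WILLNEED, 0, length)
    return True

with tempfile.TemporaryFile() as f:
    # A four-page scratch file stands in for the on-disk index.
    f.write(b"\0" * (4 * PAGE))
    f.flush()
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        size = len(mapped)
        hinted = prefetch(mapped, 2 * PAGE)
        # Reads behave the same either way; the hint only affects when pages load.
        first_byte = mapped[0]

print(size, hinted, first_byte)
```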

For example, to prefetch 10 MiB of vectors into RAM:

export HANNOY_READER_PREFETCH_MEMORY=10485760
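
The value is a byte count (10 MiB = 10 × 1024 × 1024 = 10485760). When driving hannoy from Python, the same setting could be computed and placed in the process environment, as sketched below. This assumes the variable is read when a reader is opened, so it has to be set beforehand; `mib_to_bytes` is a hypothetical helper, not part of hannoy.

```python
import os

def mib_to_bytes(mib):
    # 1 MiB = 1024 * 1024 bytes.
    return mib * 1024 * 1024

prefetch_bytes = mib_to_bytes(10)

# Assumption: hannoy reads this variable when a reader is opened,
# so set it before creating the reader.
os.environ["HANNOY_READER_PREFETCH_MEMORY"] = str(prefetch_bytes)
print(os.environ["HANNOY_READER_PREFETCH_MEMORY"])
```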
