Python bindings for Dynamic Learned Index - efficient approximate nearest neighbor search using learned models

py-dynamic-learned-index

Python bindings for Dynamic Learned Index - a high-performance approximate nearest neighbor search library that uses learned neural network models to efficiently index and query unstructured data.

Overview

The Dynamic Learned Index uses a multi-level hierarchical structure in which each level contains a neural network model trained to predict the bucket most likely to contain the requested data. By narrowing each query to the predicted buckets rather than comparing it against every stored vector, this learned approach is intended to improve search performance over traditional indexing methods.
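
To make the routing idea concrete, here is a minimal, self-contained sketch. It is not the library's implementation: a nearest-centroid rule stands in for the trained neural network, there is only one level, and the names `route`, `search`, `buckets`, and `centroids` are hypothetical.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two buckets of (id, vector) records, each summarized by a centroid.
# Assumption: in the real index a trained model plays the role of `route`.
buckets = {
    0: [(1, [0.0, 0.1]), (2, [0.2, 0.0])],
    1: [(3, [5.0, 5.1]), (4, [4.9, 5.2])],
}
centroids = {0: [0.1, 0.05], 1: [4.95, 5.15]}

def route(query):
    """Stand-in for the learned model: pick the bucket with the closest centroid."""
    return min(centroids, key=lambda b: l2(query, centroids[b]))

def search(query, k):
    """Scan only the predicted bucket and return the IDs of its k closest records."""
    bucket = buckets[route(query)]
    ranked = sorted(bucket, key=lambda rec: l2(query, rec[1]))
    return [rec_id for rec_id, _ in ranked[:k]]

print(search([5.0, 5.0], k=1))  # prints [3]
```

The point of the sketch is that only one bucket is scanned per query. The result is approximate because a true neighbor stored in a bucket the router did not pick would be missed.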

Installation

Install from PyPI:

pip install py-dynamic-learned-index

Quick Start

from py_dynamic_learned_index import Index

# Create an index
index = Index(
    input_shape=768,           # Dimension of your vectors
    buffer_size=5000,          # Write buffer size
    bucket_size=5000,          # Max records per bucket
    arity=3,                   # Fanout for compaction
    device="cpu"               # Use "cpu" or "cuda"
)

# Insert vectors with IDs
vector = [0.1, 0.2, 0.3] + [0.0] * 765  # 768-dimensional vector (zero-padded for illustration)
index.insert(vector, id=1)

# Search for k nearest neighbors
query = [0.15, 0.25, 0.35] + [0.0] * 765  # Must have the same dimension as the indexed vectors
results = index.search(query, k=10)  # Returns top-10 nearest neighbors

# Delete an entry
index.delete(id=1)

Configuration

The index supports various configuration options to optimize for your use case:

  • input_shape: Vector dimension (must match your data)
  • buffer_size: Size of write buffer before flushing to index
  • bucket_size: Maximum records per bucket before splitting
  • arity: Fanout for tree compaction
  • device: Compute device ("cpu" or "cuda")
  • distance_fn: Distance metric ("dot" for dot product, which equals cosine similarity only when vectors are unit-normalized; "l2" for Euclidean)

For more details on configuration, see the main project README.
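
The choice of distance_fn matters mostly when vectors are not normalized. The sketch below uses plain Python (no library calls) to show how dot product, cosine similarity, and Euclidean distance relate. The helper names are illustrative and are not part of the package.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity: dot product divided by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [3.0, 4.0], [4.0, 3.0]
ua, ub = normalize(a), normalize(b)

print(dot(a, b))      # raw dot product grows with vector length
print(cosine(a, b))   # length-independent
print(dot(ua, ub))    # same value as the cosine once vectors are unit length
# For unit vectors, squared L2 distance equals 2 - 2 * dot, so both
# metrics rank neighbors identically on normalized data.
print(l2(ua, ub) ** 2, 2 - 2 * dot(ua, ub))
```

If your embeddings are not already unit length and you want cosine behaviour from "dot", normalizing them before insertion and before querying is the usual approach.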

Features

  • Learned Indexing: Uses neural networks, not just the data distribution, to guide search
  • Dynamic: Supports online insertions and deletions
  • Multi-level: Hierarchical structure for better scalability
  • Fast: Optimized Rust implementation with Python bindings
  • Flexible: Configurable models, distance functions, and compaction strategies

Examples

Full example with data loading and querying:

import numpy as np
from py_dynamic_learned_index import Index

# Create index for 768-dimensional vectors (e.g., embeddings)
index = Index(input_shape=768, buffer_size=5000, bucket_size=5000)

# Generate sample data
n_vectors = 100000
vectors = np.random.randn(n_vectors, 768).astype(np.float32)

# Insert vectors
for i, vector in enumerate(vectors):
    index.insert(vector, id=i)

# Search
query = vectors[0]  # Use first vector as query
neighbors = index.search(query, k=10)
print(f"Top 10 neighbors: {neighbors}")

For more examples, check the example.py file in the repository.
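
Because the search is approximate, it can be useful to compare results against exact brute-force neighbors on a small sample. The sketch below computes recall@k in plain Python. It assumes IDs are the insertion positions (as in the example above, where `id=i`) and that the approximate result can be reduced to a list of IDs; the return format of `index.search` is not documented here, so `approx` is a hard-coded placeholder.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_neighbors(vectors, query, k):
    """Brute-force ground truth: positions of the k vectors closest to the query."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return order[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true neighbors that the approximate result recovered."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
truth = exact_neighbors(vectors, [0.1, 0.0], k=2)
approx = [0, 2]  # placeholder for IDs returned by an approximate search
print(truth, recall_at_k(approx, truth))  # prints [0, 1] 0.5
```

Averaging this value over a set of held-out queries gives a single quality number to track while tuning parameters such as bucket_size.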

Performance

The Dynamic Learned Index is optimized for:

  • High-dimensional vector search (e.g., embeddings from language/vision models)
  • Large-scale datasets (millions of vectors)
  • Both batch and online query scenarios

Performance depends on your data distribution, vector dimensionality, and hardware configuration.
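
Since results vary with data and hardware, measuring throughput on your own workload is more informative than any general figure. The helper below is a minimal timing sketch. `queries_per_second` and `dummy_search` are hypothetical names, and `search_fn` can be any callable taking `(query, k)`.

```python
import time

def queries_per_second(search_fn, queries, k=10):
    """Run every query once and return the measured throughput."""
    start = time.perf_counter()
    for query in queries:
        search_fn(query, k)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")

def dummy_search(query, k):
    """Placeholder workload; swap in a call to a real index."""
    return sorted(query)[:k]

queries = [[float(i), 2.0, 1.0] for i in range(1000)]
print(f"{queries_per_second(dummy_search, queries):.0f} queries/s")
```

With a real index, something like `queries_per_second(lambda q, k: index.search(q, k=k), queries)` should work, assuming the `search` signature shown in Quick Start.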

Advanced Users: Building from Source

Simple Development Build

For a quick development build:

git clone https://github.com/simonplhak/DynamicLearnedIndex.git
cd DynamicLearnedIndex/py_dynamic_learned_index
pip install maturin
maturin develop --release

Sdist Installation with Custom Features

For advanced users building from source distribution with specific features:

# Set up environment (assumes SCRATCHDIR already points to a writable scratch directory)
export CARGO_TARGET_DIR="$SCRATCHDIR/cargo-target"
mkdir -p "$CARGO_TARGET_DIR"

# Install Rust (if not already installed)
if ! command -v rustup &> /dev/null; then
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
    source "$HOME/.cargo/env"
fi

# Install and set Rust nightly
rustup install nightly
rustup default nightly

# Install build dependencies
pip install maturin meson-python meson ninja cython

# Set compiler flags (optional)
export CC=gcc
export CXX=g++
export UV_LINK_MODE=copy

# Build with custom features (e.g., measure_time, mix)
export MATURIN_BUILD_ARGS="--features measure_time,mix"

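# Install from the source distribution, skipping binary wheels for this package only
# (--no-binary :all: would also force every dependency to build from source)
pip install --force-reinstall --no-binary py-dynamic-learned-index --no-build-isolation py-dynamic-learned-index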

# Verify installation
python -c "import py_dynamic_learned_index; print(py_dynamic_learned_index.__version__)"

Note: Set MATURIN_BUILD_ARGS with the features you need (e.g., measure_time, mix, tch, candle, mkl).
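
The verification command above relies on the module exposing `__version__`. As an alternative that does not depend on that attribute, the installed version can be read from the distribution metadata using only the standard library. The helper name `installed_version` is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# Prints the version string if the package is installed, otherwise None.
print(installed_version("py-dynamic-learned-index"))
```

This only confirms that the distribution is installed. It does not show which Cargo features the build was compiled with.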

Requirements

  • Python >= 3.9
  • For GPU support (cuda feature): NVIDIA GPU and CUDA toolkit

Documentation

For detailed configuration and usage documentation, see the main project repository.

License

Licensed under the GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later).

See the LICENSE file in the main repository for full license text.

Contributing

Contributions are welcome! Please see the main repository for contribution guidelines.

Download files

Source Distribution

py_dynamic_learned_index-0.1.12.tar.gz (121.9 kB)

Uploaded Source

Built Distributions

py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_x86_64.whl (1.4 MB)

Uploaded CPython 3.12, manylinux: glibc 2.39+ x86-64

py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_aarch64.whl (1.4 MB)

Uploaded CPython 3.12, manylinux: glibc 2.39+ ARM64

py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_11_0_arm64.whl (1.1 MB)

Uploaded CPython 3.12, macOS 11.0+ ARM64

py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_10_12_x86_64.whl (1.3 MB)

Uploaded CPython 3.12, macOS 10.12+ x86-64

File details

Details for the file py_dynamic_learned_index-0.1.12.tar.gz.

File hashes

Hashes for py_dynamic_learned_index-0.1.12.tar.gz
Algorithm Hash digest
SHA256 3130cdd610c4e97546befa5ff059c635ecd661beb52ea52e4eef79ce906fba93
MD5 95f583ef511d134318df0b7d74f18000
BLAKE2b-256 91dab8246433fcbb64e4f5067467ba3311e5d14200d9a88e43d772de14316117
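
A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below is illustrative (`sha256_of` and `matches` are hypothetical helper names) and runs its self-check on a throwaway temporary file so that it works without a download.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """True if the file's SHA256 equals the published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()

# Self-check on a temporary file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
print(matches(tmp.name, hashlib.sha256(b"hello").hexdigest()))
os.remove(tmp.name)
```

For the source distribution of this release, the call would be `matches("py_dynamic_learned_index-0.1.12.tar.gz", "3130cdd610c4e97546befa5ff059c635ecd661beb52ea52e4eef79ce906fba93")`, using the SHA256 value listed above.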

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.12.tar.gz:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_x86_64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_x86_64.whl
Algorithm Hash digest
SHA256 b1c070bbb19497b5c89dc67333e3f971afb0e7dd5ddb3dd71f73291edc5fa376
MD5 781ba822e344c9970b95827259eff01b
BLAKE2b-256 269a2ca4cc7eef87d05fac9ebe3e2294715f57381166bec0d1947903144fc0fb

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_x86_64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

File details

Details for the file py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_aarch64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_aarch64.whl
Algorithm Hash digest
SHA256 5c472d657f8917b10c13cc4814184fafb029435d3cd2b49adbfa40fb5981c3e1
MD5 53fc27b8c430a6892af2d46f0aff574a
BLAKE2b-256 5e8f19974c73b106f3d70142e465000c1723422cbd6181f4a697bacd56eb0f9a

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.12-cp312-cp312-manylinux_2_39_aarch64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

File details

Details for the file py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 89dc88d0ad1729db0c5603a5f030bf2bb68251d01e1ffe6bbdf618fd4e8edc28
MD5 f12cad4b5c3ac1182fb1a5f4e104a0a9
BLAKE2b-256 54d25a3fce8e92a7ecf63bcef4390a70c27d0d02e786ebdbb4feb1527d6e67fb

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

File details

Details for the file py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_10_12_x86_64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 72daba5659f8330715a644c447dffc62d56e04aaa81951bdcfd6df8755c0615e
MD5 a4623b6185f56d15828651d403e760b2
BLAKE2b-256 6b3e1dd70bb8de64183751cbdf40f74ead4b7f5fb9ce4f5a15500ae46cd3ba40

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.12-cp312-cp312-macosx_10_12_x86_64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex
