Python bindings for Dynamic Learned Index - efficient approximate nearest neighbor search using learned models

py-dynamic-learned-index

Python bindings for Dynamic Learned Index - a high-performance approximate nearest neighbor search library that uses learned neural network models to efficiently index and query unstructured data.

Overview

The Dynamic Learned Index uses a multi-level hierarchical structure in which each level contains a neural network model trained to predict the bucket most likely to contain the requested data. The intent is that a query examines only the predicted buckets rather than every record; how much this improves on traditional indexing methods depends on the data (see Performance).
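
The routing idea can be illustrated with a small, dependency-free sketch. It is not the library's implementation: a nearest-centroid rule stands in for the neural network model, there is only one level, and the names `route`, `search`, `buckets`, and `centroids` are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def route(query, centroids, n_probe):
    """Stand-in for the learned model: rank buckets by centroid distance.

    The real index uses a neural network here; the nearest-centroid rule is
    only a placeholder so that the sketch stays dependency-free.
    """
    order = sorted(range(len(centroids)), key=lambda b: l2(query, centroids[b]))
    return order[:n_probe]


def search(query, buckets, centroids, k, n_probe=1):
    """Scan only the predicted buckets; return the k closest (id, distance) pairs."""
    candidates = []
    for b in route(query, centroids, n_probe):
        for vec_id, vec in buckets[b]:
            candidates.append((vec_id, l2(query, vec)))
    candidates.sort(key=lambda pair: pair[1])
    return candidates[:k]


# Two toy buckets of 2-dimensional vectors, each summarized by a centroid.
buckets = [
    [(0, [0.0, 0.0]), (1, [0.1, 0.2])],
    [(2, [5.0, 5.0]), (3, [5.2, 4.9])],
]
centroids = [[0.05, 0.1], [5.1, 4.95]]

hits = search([5.1, 5.0], buckets, centroids, k=1)
print(hits[0][0])  # ID of the closest vector found in the probed bucket
```

Raising `n_probe` scans more buckets per query, which trades speed for a better chance of finding the true nearest neighbors.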

Installation

Install from PyPI:

pip install py-dynamic-learned-index

Quick Start

from py_dynamic_learned_index import Index

# Create an index
index = Index(
    input_shape=768,           # Dimension of your vectors
    buffer_size=5000,          # Write buffer size
    bucket_size=5000,          # Max records per bucket
    arity=3,                   # Fanout for compaction
    device="cpu"               # Use "cpu" or "cuda"
)

# Insert vectors with IDs
vector = [0.1, 0.2, 0.3] + [0.0] * 765  # 768-dimensional vector (placeholder values)
index.insert(vector, id=1)

# Search for k nearest neighbors
query = [0.15, 0.25, 0.35] + [0.0] * 765  # Same dimension as the indexed vectors
results = index.search(query, k=10)  # Returns top-10 nearest neighbors

# Delete an entry
index.delete(id=1)
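
Because input_shape has to match the data, it can help to check each vector before inserting it. The helper below is a hypothetical sketch, not part of the package; how the library itself reacts to a vector of the wrong length is not stated here.

```python
def check_vector(vector, input_shape):
    """Return the vector as a list of floats, or raise if its length is wrong."""
    values = [float(x) for x in vector]
    if len(values) != input_shape:
        raise ValueError(f"expected {input_shape} values, got {len(values)}")
    return values


# A 768-dimensional placeholder vector passes the check.
vector = check_vector([0.1] * 768, input_shape=768)
print(len(vector))
```

The checked list can then be passed to index.insert(vector, id=...) as in the Quick Start.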

Configuration

The index supports various configuration options to optimize for your use case:

  • input_shape: Vector dimension (must match your data)
  • buffer_size: Size of write buffer before flushing to index
  • bucket_size: Maximum records per bucket before splitting
  • arity: Fanout for tree compaction
  • device: Compute device ("cpu" or "cuda")
  • distance_fn: Distance metric ("dot" for dot product, which matches cosine similarity when vectors are L2-normalized; "l2" for Euclidean)
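
The two distance_fn values correspond to standard metrics, and a short sketch shows how they relate. This is plain Python rather than the library's code, and whether the library normalizes vectors internally is not stated here, so the explicit normalization step is an assumption.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
print(dot(a, b))                        # raw dot product depends on vector length
print(cosine(a, b))                     # cosine similarity ignores vector length
print(dot(normalize(a), normalize(b)))  # dot product of unit vectors equals cosine
print(l2(a, b))                         # Euclidean distance
```

In practice this means that embeddings should be normalized before insertion if "dot" is meant to behave like cosine similarity.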

For more details on configuration, see the main project README.

Features

  • Learned Indexing: Neural network models guide the search instead of relying on the data distribution alone
  • Dynamic: Supports online insertions and deletions
  • Multi-level: Hierarchical structure for better scalability
  • Fast: Optimized Rust implementation with Python bindings
  • Flexible: Configurable models, distance functions, and compaction strategies

Examples

Full example with data loading and querying:

import numpy as np
from py_dynamic_learned_index import Index

# Create index for 768-dimensional vectors (e.g., embeddings)
index = Index(input_shape=768, buffer_size=5000, bucket_size=5000)

# Generate sample data
n_vectors = 100000
vectors = np.random.randn(n_vectors, 768).astype(np.float32)

# Insert vectors
for i, vector in enumerate(vectors):
    index.insert(vector, id=i)

# Search
query = vectors[0]  # Use first vector as query
neighbors = index.search(query, k=10)
print(f"Top 10 neighbors: {neighbors}")

For more examples, check the example.py file in the repository.
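
Since the search is approximate, it is often useful to compare its results with an exact brute-force search on a small sample. The sketch below is independent of the package: `brute_force_knn` and `recall_at_k` are hypothetical helpers, and it assumes the approximate results can be reduced to a list of IDs (the exact return format of index.search is not described here).

```python
import math
import random


def brute_force_knn(query, vectors, k):
    """Exact k nearest neighbors by Euclidean distance; returns vector positions."""
    dists = [(math.dist(query, v), i) for i, v in enumerate(vectors)]
    dists.sort()
    return [i for _, i in dists[:k]]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbors that the approximate result recovered."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
exact = brute_force_knn(vectors[0], vectors, k=10)

# `approx` stands in for the IDs returned by an approximate search:
# eight true neighbors plus two IDs that are not in the dataset.
approx = exact[:8] + [-1, -2]
print(recall_at_k(approx, exact))  # 0.8
```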

Performance

The Dynamic Learned Index is optimized for:

  • High-dimensional vector search (e.g., embeddings from language/vision models)
  • Large-scale datasets (millions of vectors)
  • Both batch and online query scenarios

Performance depends on your data distribution, vector dimensionality, and hardware configuration.
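
Because results vary with data and hardware, measuring latency on your own workload is more informative than any general figure. The following is a minimal timing sketch; `time_queries`, `summarize`, and `dummy_search` are hypothetical names, and `dummy_search` only stands in for a real search call.

```python
import statistics
import time


def time_queries(search_fn, queries, k=10):
    """Return per-query latencies in milliseconds for search_fn(query, k)."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query, k)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Mean, median, and an approximate 95th percentile of the latencies."""
    ordered = sorted(latencies)
    p95_index = min(len(ordered) - 1, int(0.95 * len(ordered)))
    return {
        "mean_ms": statistics.mean(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
    }


def dummy_search(query, k):
    """Placeholder workload so the sketch runs without the package installed."""
    return sorted(query)[:k]


queries = [[float(i + j) for j in range(32)] for i in range(50)]
latencies = time_queries(dummy_search, queries)
stats = summarize(latencies)
print(stats)
```

With the package installed, the placeholder can be swapped for `lambda q, k: index.search(q, k=k)`.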

Advanced Users: Building from Source

Simple Development Build

For a quick development build:

git clone https://github.com/plhis/DynamicLearnedIndex.git
cd DynamicLearnedIndex/py_dynamic_learned_index
pip install maturin
maturin develop --release

Sdist Installation with Custom Features

For advanced users building from source distribution with specific features:

# Set up environment (SCRATCHDIR must already point to a writable directory)
export CARGO_TARGET_DIR="$SCRATCHDIR/cargo-target"
mkdir -p "$CARGO_TARGET_DIR"

# Install Rust (if not already installed)
if ! command -v rustup &> /dev/null; then
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
    source "$HOME/.cargo/env"
fi

# Install and set Rust nightly
rustup install nightly
rustup default nightly

# Install build dependencies
pip install maturin meson-python meson ninja cython

# Set compiler flags (optional)
export CC=gcc
export CXX=g++
export UV_LINK_MODE=copy

# Build with custom features (e.g., measure_time, mix)
export MATURIN_BUILD_ARGS="--features measure_time,mix"

# Install from source distribution without binary wheels
pip install --force-reinstall --no-binary :all: py-dynamic-learned-index --no-build-isolation

# Verify installation
python -c "import py_dynamic_learned_index; print(py_dynamic_learned_index.__version__)"

Note: Set MATURIN_BUILD_ARGS with the features you need (e.g., measure_time, mix, tch, candle, mkl).
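
As an alternative check after a source build, the installed version can be read from the distribution metadata, which does not rely on the module exposing a `__version__` attribute. This is a sketch using only the standard library; `installed_version` is a hypothetical helper, and the distribution and module names are taken from the install and import lines above.

```python
from importlib import metadata, util


def installed_version(dist_name, module_name):
    """Return the installed distribution version, or None if it is unavailable."""
    if util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("py-dynamic-learned-index", "py_dynamic_learned_index")
print(version if version else "py-dynamic-learned-index is not installed")
```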

Requirements

  • Python >= 3.9
  • For GPU support (cuda feature): NVIDIA GPU and CUDA toolkit

Documentation

For detailed configuration and usage documentation, see the main project repository.

License

Licensed under the GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later).

See the LICENSE file in the main repository for full license text.

Contributing

Contributions are welcome! Please see the main repository for contribution guidelines.

Download files

Source Distribution

py_dynamic_learned_index-0.1.11.tar.gz (121.9 kB), source

Built Distributions

py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_x86_64.whl (1.4 MB), CPython 3.12, manylinux: glibc 2.39+ x86-64

py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_aarch64.whl (1.4 MB), CPython 3.12, manylinux: glibc 2.39+ ARM64

py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_11_0_arm64.whl (1.1 MB), CPython 3.12, macOS 11.0+ ARM64

py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_10_12_x86_64.whl (1.3 MB), CPython 3.12, macOS 10.12+ x86-64

File details

Details for the file py_dynamic_learned_index-0.1.11.tar.gz.

File hashes

Hashes for py_dynamic_learned_index-0.1.11.tar.gz
Algorithm Hash digest
SHA256 f4e7691d1fde53307a20e4d05f57527a0648b3cb84a5185a5c62f28b2c050f88
MD5 91c3c79b08985e6dc36711476f6038bc
BLAKE2b-256 d06ab81c4ec1be40d8e14349ea2fb33017ff043a3bfe2e71ca00828bc550f7fa
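
A downloaded file can be checked against the SHA256 digest listed for it. The sketch below uses only the standard library; `sha256_of` and `matches` are hypothetical helper names, and the demonstration runs on a throwaway file rather than a real download.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file; for a real check, pass the path of the
# downloaded archive and the SHA256 value listed for that file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name
demo_ok = matches(demo_path, hashlib.sha256(b"example payload").hexdigest())
os.remove(demo_path)
print(demo_ok)
```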

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.11.tar.gz:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_x86_64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_x86_64.whl
Algorithm Hash digest
SHA256 993a253c2735d0c3f25d72159d10710735d0ab010301106f0bdbac48407194e8
MD5 18e573a6a950e84e59c8ef654a325524
BLAKE2b-256 6626d75d277ba4cc53ac7899f25ce5a7ed3a947f9fd773b3b0f804b54487fa41

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_x86_64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

File details

Details for the file py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_aarch64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_aarch64.whl
Algorithm Hash digest
SHA256 868d8ba0d3a67a74849dd8fefd17b7a2a8d54ff80368177a9f6ef7407ca216e6
MD5 a3579c6899852a5270b6f1f5ac7634fc
BLAKE2b-256 df39b2511562c7394c65b1f44a1fc1958a07da7f2a2838fdb3eeb14957f0f666

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.11-cp312-cp312-manylinux_2_39_aarch64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

File details

Details for the file py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 07ad6cb8287b227e78d8c016f0f2273b656f7c10956e8f87d2db71f9cdb456f9
MD5 6374eed64feae334e468df6f441557c4
BLAKE2b-256 b37cb78be741561e5bec5550e2e23b3a4d17b10c71fdda822a3d25ebdfbdf36b

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex

File details

Details for the file py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_10_12_x86_64.whl.

File hashes

Hashes for py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 971723641ce97a981fa677a287440a8690de1063bb5be21b95c09968e016060f
MD5 53b8e6d8824210a3faddf49cf2fece07
BLAKE2b-256 0b2e6aa0a3eac593e9e1fbcfaa49c0230bf3f9e2734b9428dc799f9d1fc8802e

Provenance

The following attestation bundles were made for py_dynamic_learned_index-0.1.11-cp312-cp312-macosx_10_12_x86_64.whl:

Publisher: publish-pypi.yml on simonplhak/DynamicLearnedIndex
