SAGE ANNS: Approximate Nearest Neighbor Search algorithms with unified Python interface

Python 3.10+ | License: MIT

Overview

isage-anns provides high-performance C++ implementations of state-of-the-art Approximate Nearest Neighbor Search (ANNS) algorithms with a unified Python interface. This package is part of the SAGE ecosystem.

Features

  • 🚀 High Performance: C++ implementations with pybind11 bindings
  • 🎯 Multiple Algorithms: FAISS, VSAG HNSW, GTI, PLSH, DiskANN, CANDY, PUCK, SPTAG
  • 🔧 Unified Interface: Single API for all algorithms
  • 📦 Easy Installation: Pre-built wheels for major platforms
  • 🔌 Plug-and-Play: Works standalone or with SAGE framework

Supported Algorithms

Algorithm    Type          Index             Features
FAISS        Graph/IVF     In-memory         GPU support, multiple index types
VSAG HNSW    Graph         In-memory         Fast search, high recall
GTI          Graph+Tree    In-memory         Dynamic insertion/deletion, logarithmic complexity
PLSH         Hash          In-memory         Parallel LSH, optimized for sparse vectors
DiskANN      Graph         Disk-based        Large-scale datasets, memory efficient
CANDY        Hybrid        In-memory/Disk    Optimized for diverse workloads
PUCK         Graph         In-memory         Chinese-origin, high performance
SPTAG        Tree/Graph    In-memory         Microsoft implementation

Installation

From PyPI (Recommended)

pip install isage-anns

From Source

# Clone the repository with its submodules (third-party libraries)
git clone --recursive https://github.com/intellistream/sage-anns.git
cd sage-anns

# Install dependencies
pip install -r requirements.txt

# Build and install
pip install -e .

Requirements

  • Python >= 3.10
  • CMake >= 3.10
  • C++17 compiler (g++ or clang++)
  • System libraries:
    # Ubuntu/Debian
    sudo apt-get install build-essential cmake libopenblas-dev
    
    # macOS
    brew install cmake libomp
    

Quick Start

import numpy as np
from sage_anns import ANNSIndex

# Create an index
index = ANNSIndex(
    algorithm="faiss_hnsw",
    dimension=128,
    metric="l2"
)

# Build index with data
data = np.random.randn(10000, 128).astype('float32')
index.build(data)

# Search
query = np.random.randn(10, 128).astype('float32')
distances, indices = index.search(query, k=10)

print(f"Top-10 nearest neighbors: {indices}")
print(f"Distances: {distances}")
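For orientation, here is a minimal sketch of the exact search that an ANNS index approximates. It assumes, based only on the unpacking in the Quick Start, that `search` returns a `(distances, indices)` pair with one row per query; the row layout and the use of plain (not squared) L2 distance are assumptions. `brute_force_search` is a hypothetical helper and not part of isage-anns.

```python
import math

def brute_force_search(data, queries, k):
    """Exact k-NN by scanning every vector (hypothetical helper, not part of isage-anns)."""
    all_distances, all_indices = [], []
    for q in queries:
        # Rank every stored vector by Euclidean (L2) distance to the query.
        ranked = sorted((math.dist(q, v), i) for i, v in enumerate(data))[:k]
        all_distances.append([d for d, _ in ranked])
        all_indices.append([i for _, i in ranked])
    return all_distances, all_indices

data = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
queries = [[0.9, 0.1]]
distances, indices = brute_force_search(data, queries, k=2)
print(indices)  # [[1, 0]]
```

A scan like this costs one distance computation per stored vector per query, which is the cost that graph, hash, and tree indexes trade a little recall to avoid.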

Usage Examples

FAISS HNSW

from sage_anns import ANNSIndex

# Reuses data and query from Quick Start
index = ANNSIndex(
    algorithm="faiss_hnsw",
    dimension=128,
    metric="l2",
    M=32,  # HNSW parameter
    ef_construction=200
)
index.build(data)
distances, indices = index.search(query, k=10)
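As a rough sizing aid for the `M` parameter, the sketch below applies the rule of thumb from the FAISS index guidelines for a flat HNSW index: about `4 * d + M * 2 * 4` bytes per vector. Whether the `faiss_hnsw` backend in isage-anns stores vectors this way is an assumption, and `hnsw_bytes_per_vector` is a hypothetical helper, not part of the package.

```python
def hnsw_bytes_per_vector(dimension, M):
    """Rule-of-thumb size of one vector in a flat HNSW index.

    Assumes float32 vectors and 4-byte neighbor ids.
    """
    vector_bytes = 4 * dimension  # raw float32 vector
    link_bytes = M * 2 * 4        # roughly 2*M links on the base layer, 4 bytes each
    return vector_bytes + link_bytes

per_vector = hnsw_bytes_per_vector(dimension=128, M=32)
total_mb = per_vector * 10_000 / 1e6  # the 10,000-vector dataset from Quick Start
print(per_vector, total_mb)  # 768 7.68
```

Under these assumptions, doubling `M` adds 256 bytes per vector at this dimension, so link storage grows linearly with `M` while the raw vector cost stays fixed.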

DiskANN

from sage_anns import ANNSIndex

# Reuses data and query from Quick Start
index = ANNSIndex(
    algorithm="diskann",
    dimension=128,
    metric="l2",
    index_path="./diskann_index"  # Disk storage
)
index.build(data)
distances, indices = index.search(query, k=10)

VSAG HNSW

from sage_anns import ANNSIndex

# Reuses data and query from Quick Start
index = ANNSIndex(
    algorithm="vsag_hnsw",
    dimension=128,
    metric="cosine",
    M=16,
    ef_construction=100
)
index.build(data)
distances, indices = index.search(query, k=10)

GTI (Graph-based Tree Index)

import numpy as np
from sage_anns import ANNSIndex

# Reuses data and query from Quick Start
index = ANNSIndex(
    algorithm="gti",
    dimension=128,
    metric="l2",
    m=16,  # Max graph connections per node
    L=100  # Search depth parameter
)
index.build(data)

# GTI supports efficient dynamic insertions and deletions;
# only insertion via add() is shown here
new_vectors = np.random.randn(100, 128).astype('float32')
index.add(new_vectors)

# Search after insertions
distances, indices = index.search(query, k=10)

PLSH (Parallel Locality-Sensitive Hashing)

from sage_anns import ANNSIndex

# PLSH is optimized for sparse vectors and high-dimensional data
# Reuses data and query from Quick Start
index = ANNSIndex(
    algorithm="plsh",
    dimension=128,
    metric="l2",
    k=10,  # Hash functions per table (not the k passed to search)
    m=10,  # Number of hash tables
    num_threads=4
)
index.build(data)
distances, indices = index.search(query, k=10)
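To make the roles of `k` (hash functions per table) and `m` (number of tables) concrete, here is a minimal sketch of generic locality-sensitive hashing for L2 distance, using hash functions of the form h(x) = floor((a·x + b) / w). This is a textbook construction, not the PLSH implementation shipped in this package; every name in it (`W`, `make_tables`, `bucket_key`, `build_buckets`, `candidates`) is hypothetical, and the bucket width is an arbitrary illustrative value.

```python
import random

W = 4.0  # bucket width (illustrative value, not a package default)

def make_tables(dim, k, m, seed=0):
    """Draw m tables, each holding k random projections (a, b)."""
    rng = random.Random(seed)
    return [
        [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, W))
         for _ in range(k)]
        for _ in range(m)
    ]

def bucket_key(table, x):
    """Concatenate the k hash values h(x) = floor((a.x + b) / W) into one key."""
    return tuple(
        int((sum(ai * xi for ai, xi in zip(a, x)) + b) // W)
        for a, b in table
    )

def build_buckets(tables, data):
    """Insert every vector index into its bucket in each table."""
    buckets = [{} for _ in tables]
    for idx, x in enumerate(data):
        for table, table_buckets in zip(tables, buckets):
            table_buckets.setdefault(bucket_key(table, x), []).append(idx)
    return buckets

def candidates(tables, buckets, q):
    """Union of the query's bucket across all tables, to be re-ranked exactly."""
    found = set()
    for table, table_buckets in zip(tables, buckets):
        found.update(table_buckets.get(bucket_key(table, q), []))
    return found

rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
tables = make_tables(dim=8, k=4, m=6)
buckets = build_buckets(tables, data)
cands = candidates(tables, buckets, data[0])
print(0 in cands)  # True: a stored vector always lands in its own buckets
```

In this construction a larger `k` makes each bucket more selective (fewer false candidates, more missed neighbors), while a larger `m` gives more chances for a true neighbor to collide with the query at the cost of memory.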

API Reference

ANNSIndex

Parameters:

  • algorithm (str): Algorithm name (faiss_hnsw, diskann, vsag_hnsw, gti, plsh, etc.)
  • dimension (int): Vector dimension
  • metric (str): Distance metric (l2, cosine, inner_product)
  • **kwargs: Algorithm-specific parameters

Methods:

  • build(data): Build the index from a NumPy array
  • search(query, k): Search for the k nearest neighbors; returns (distances, indices)
  • add(vectors): Add vectors to an existing index
  • save(path): Save the index to disk
  • load(path): Load an index from disk
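The three `metric` values listed under Parameters correspond to standard definitions, sketched below in plain Python. This shows the usual math only: whether isage-anns reports squared or plain L2, or cosine as a similarity or a distance, is not stated here and should be checked against the backend in use. The function names are hypothetical.

```python
import math

def l2(a, b):
    """Euclidean distance: smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Inner product of the two vectors after normalizing each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

a, b = [1.0, 0.0], [0.0, 2.0]
print(l2(a, b))                 # sqrt(5), about 2.236
print(inner_product(a, b))      # 0.0
print(cosine_similarity(a, b))  # 0.0
```

For unit-length vectors the three metrics rank neighbors identically, which is why normalizing data up front is a common way to get cosine behavior from an L2 or inner-product index.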

Integration with SAGE

This package is designed to work seamlessly with the SAGE framework:

from sage.libs.anns import create_index

# SAGE will automatically use isage-anns if installed
index = create_index("faiss_hnsw", dimension=128)
index.build(data)
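"Automatically use isage-anns if installed" implies some form of optional-dependency detection. The sketch below shows one common way to do that with the standard library; it is an illustration, not SAGE's actual code, and `backend_available` is a hypothetical helper. The module name `sage_anns` is taken from the import statements above.

```python
import importlib.util

def backend_available(module_name="sage_anns"):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None

print(backend_available("json"))                             # True (stdlib module)
print(backend_available("surely_not_installed_module_xyz"))  # False
```

A caller can use the same check before choosing `algorithm="faiss_hnsw"` and fall back to a slower pure-Python path when the package is missing.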

Development

Building from Source

# Clone with submodules (contains third-party libraries)
git clone --recursive https://github.com/intellistream/sage-anns.git
cd sage-anns

# Build all algorithms
./build_all.sh

# Or build specific algorithm
cd implementations/<algorithm>
mkdir build && cd build
cmake .. && make -j$(nproc)

Running Tests

pip install pytest
pytest tests/

Performance

Benchmarks on 1M SIFT vectors (128-dim):

Algorithm     Build Time    Query Time (10-NN)    Recall@10
FAISS HNSW    45 s          0.8 ms                0.95
VSAG HNSW     42 s          0.9 ms                0.94
DiskANN       120 s         1.2 ms                0.93
CANDY         50 s          1.0 ms                0.92

Benchmarks were run on an Intel Xeon Silver 4214R @ 2.40GHz.
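The Recall@10 column can be read with the usual definition: the fraction of the true 10 nearest neighbors that the index returns, averaged over queries. The sketch below implements that common definition; how the figures above were actually measured is not documented here, and `recall_at_k` is a hypothetical helper.

```python
def recall_at_k(approx_indices, exact_indices, k):
    """Average overlap between approximate and exact top-k id lists."""
    hits = 0
    for approx, exact in zip(approx_indices, exact_indices):
        hits += len(set(approx[:k]) & set(exact[:k]))
    return hits / (k * len(exact_indices))

approx = [[1, 2, 3, 9], [4, 5, 6, 7]]  # ids returned by an approximate index
exact = [[1, 2, 3, 4], [4, 5, 6, 7]]   # ids from an exact (brute-force) search
print(recall_at_k(approx, exact, k=4))  # 0.875
```

Under this definition, a Recall@10 of 0.95 means that on average 9.5 of the 10 true neighbors appear in each result list.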

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Code Structure

sage-anns/
├── implementations/      # C++ source code
│   ├── faiss/
│   ├── diskann-ms/
│   ├── candy/
│   └── ...
├── python/              # Python bindings
│   └── sage_anns/
├── tests/               # Unit tests
├── CMakeLists.txt       # Build configuration
└── pyproject.toml       # Package metadata

License

MIT License - see LICENSE for details.

Citation

If you use this package in your research, please cite:

@software{sage_anns,
  title = {SAGE ANNS: Approximate Nearest Neighbor Search},
  author = {IntelliStream Team},
  year = {2026},
  url = {https://github.com/intellistream/sage-anns}
}

Acknowledgements

This package integrates implementations from:

  • FAISS by Meta Research
  • DiskANN by Microsoft Research
  • SPTAG by Microsoft
  • PUCK by Baidu
  • CANDY by IntelliStream Team

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

isage_anns-0.1.3-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.8 MB)
Python 3, manylinux: glibc 2.17+ x86-64

isage_anns-0.1.3-cp310-cp310-manylinux_2_34_x86_64.whl (5.3 MB)
CPython 3.10, manylinux: glibc 2.34+ x86-64

File hashes

Hashes for isage_anns-0.1.3-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

Algorithm      Hash digest
SHA256         a07117927fc9a0bad27591a24f842a900838c1a35d1bb694092dab34fdb0f319
MD5            7398b560dc24fc93a1173fb1f1bac231
BLAKE2b-256    cb5b8282b7bfe2e786e3c9c7290c89a37853e580c9bd22adfd5d161447581ac2

Hashes for isage_anns-0.1.3-cp310-cp310-manylinux_2_34_x86_64.whl

Algorithm      Hash digest
SHA256         83625bbd5dd9713a7183dcc9ca71b7f974c35ebb1079c1a99846baf730b85ce1
MD5            fdb66fc3c05b8211eda8fa7034bf8e1f
BLAKE2b-256    2709df22776478ca817fb2807f681aaba527749066a07cee4680ba14ebd8360f
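A downloaded wheel can be checked against the SHA256 digests listed above with the standard library. The sketch below is one way to do it; `sha256_of` is a hypothetical helper, and the self-check uses a throwaway temporary file so the example runs without a download.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Self-check on a throwaway file so the sketch runs without a download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
matches = sha256_of(tmp.name) == hashlib.sha256(b"example bytes").hexdigest()
os.remove(tmp.name)
print(matches)  # True
```

For a real file, compare `sha256_of("<path to the downloaded .whl>")` with the SHA256 value listed for that exact file name.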

