High-performance exact vector similarity search with Rust backend

NSeekFS

High-Performance Exact Vector Search with Rust Backend

Fast and exact cosine similarity search for Python. Built with Rust for performance, designed for production use.

pip install nseekfs

Quick Start

import nseekfs
import numpy as np

# Create some test vectors
embeddings = np.random.randn(10000, 384).astype(np.float32)
query = np.random.randn(384).astype(np.float32)

# Build index and run a search
index = nseekfs.from_embeddings(embeddings, normalized=True)
results = index.query(query, top_k=10)

print(f"Found {len(results)} results")
print(f"Best match: idx={results[0]['idx']} score={results[0]['score']:.3f}")

Core Features

Exact Search

# Basic query
results = index.query(query, top_k=10)

# Access results
for item in results:
    print(f"Vector {item['idx']}: {item['score']:.6f}")
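For reference, exact cosine search is equivalent to scoring every stored vector against the query and keeping the best top_k. The NumPy-only sketch below shows that computation. `brute_force_top_k` is a hypothetical helper, not part of NSeekFS; it only mirrors the `idx`/`score` result shape used above and may be handy as a sanity check on small datasets.

```python
import numpy as np

def brute_force_top_k(embeddings, query, top_k=10):
    """Exact cosine top-k by scoring every row (hypothetical reference helper)."""
    # Normalize rows and query so the dot product equals cosine similarity.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = emb @ q
    # Sort all scores in descending order and keep the first top_k.
    order = np.argsort(-scores)[:top_k]
    return [{"idx": int(i), "score": float(scores[i])} for i in order]

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 64)).astype(np.float32)
query = embeddings[42].copy()  # a stored vector should be its own best match
reference = brute_force_top_k(embeddings, query, top_k=5)
print(reference[0])
```

A full sort is the simplest correct approach; it costs more than a partial selection but leaves no doubt about the ordering when comparing against another implementation.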

Batch Queries

queries = np.random.randn(50, 384).astype(np.float32)
batch_results = index.query_batch(queries, top_k=5)
print(f"Processed {len(batch_results)} queries")

Query Options

# Simple query (alias for query with format="simple")
results = index.query_simple(query, top_k=10)

# Detailed query with timing and diagnostics
result = index.query_detailed(query, top_k=10)
print(f"Query took {result.query_time_ms:.2f} ms, top1 idx={result.results[0]['idx']}")
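To cross-check the reported timing from the Python side, a call can be wrapped with `time.perf_counter`. This is a minimal sketch: `time_call_ms` is a hypothetical helper, and the number it reports includes Python call overhead, so it will not necessarily match `query_time_ms`.

```python
import time

def time_call_ms(fn, *args, **kwargs):
    """Return (result, elapsed milliseconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# Stand-in workload; with NSeekFS this could be
# time_call_ms(index.query, query, top_k=10).
result, elapsed_ms = time_call_ms(sorted, [3, 1, 2])
print(f"Call took {elapsed_ms:.3f} ms")
```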

Index Persistence

# Load an existing index
index = nseekfs.from_bin("my_vectors.bin")
print(f"Loaded index: {index.rows} vectors x {index.dims} dims")

Performance Metrics

metrics = index.get_performance_metrics()
print(f"Total queries: {metrics['total_queries']}")
print(f"Average time: {metrics['avg_query_time_ms']:.2f} ms")

Built-in Benchmark

nseekfs.benchmark(vectors=1000, dims=384, queries=100, verbose=True)

API Reference

Index

  • from_embeddings(embeddings, normalized=True, verbose=False)
  • from_bin(path)

Queries

  • query(query_vector, top_k=10)
  • query_simple(query_vector, top_k=10)
  • query_detailed(query_vector, top_k=10)
  • query_batch(queries, top_k=10)

Properties

  • index.rows
  • index.dims
  • index.config

Utilities

  • index.get_performance_metrics()
  • nseekfs.benchmark(vectors=..., dims=..., queries=...)

Architecture Highlights

SIMD Optimizations

  • AVX2 support: 256-bit registers process 8 float32 values per instruction on compatible CPUs
  • Automatic fallback to scalar operations on older hardware
  • Runtime detection of CPU capabilities
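The access pattern behind these bullets can be illustrated in plain Python: a dot product accumulated in 8 independent partial sums (one per float32 lane of a 256-bit register), followed by a scalar loop for the leftover elements. This is only an illustration of the idea; `dot_lanes` is a hypothetical function and does not reflect the actual Rust kernel, which is not shown here.

```python
LANES = 8  # one 256-bit AVX2 register holds 8 float32 values

def dot_lanes(a, b, lanes=LANES):
    """Dot product accumulated in `lanes` independent partial sums."""
    assert len(a) == len(b)
    acc = [0.0] * lanes
    full = len(a) - len(a) % lanes
    # Main loop: each step handles one full group of `lanes` elements,
    # which a SIMD unit would process with a single multiply-add.
    for start in range(0, full, lanes):
        for lane in range(lanes):
            acc[lane] += a[start + lane] * b[start + lane]
    total = sum(acc)
    # Scalar tail: leftover elements that do not fill a whole group.
    for i in range(full, len(a)):
        total += a[i] * b[i]
    return total

a = [float(i) for i in range(19)]
b = [2.0] * 19
print(dot_lanes(a, b))
```

Calling it with `lanes=1` degenerates to the plain scalar loop, which corresponds to the fallback path on hardware without wide registers.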

Memory Management

  • Memory mapping for efficient data access
  • Thread-local buffers for zero-allocation queries
  • Cache-aligned data structures for optimal performance
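Memory mapping can be tried out from Python with `numpy.memmap`. The sketch below assumes a flat file of raw float32 values; that layout is chosen for the demo only and is not the NSeekFS `.bin` format.

```python
import os
import tempfile
import numpy as np

rows, dims = 1000, 64
vectors = np.random.default_rng(0).standard_normal((rows, dims)).astype(np.float32)

# Write raw float32 data to disk. This flat layout is an assumption for the
# demo and is not the NSeekFS .bin format.
path = os.path.join(tempfile.mkdtemp(), "vectors.f32")
vectors.tofile(path)

# Map the file read-only; pages are loaded lazily as rows are accessed.
mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(rows, dims))
print(mapped.shape, float(mapped[10] @ vectors[10]))
```

Because the operating system pages data in on demand, opening the mapping is cheap even for files larger than available RAM; only the rows actually touched are read.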

Batch Processing

  • Intelligent batching strategies based on query size
  • SIMD vectorization across multiple queries
  • Optimized memory access patterns
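One common way to score many queries at once is a single matrix product followed by a partial selection per row. The NumPy sketch below shows that general technique; `batch_top_k` is a hypothetical helper that assumes unit-length rows, and it is not a description of how `query_batch` is implemented internally.

```python
import numpy as np

def batch_top_k(embeddings, queries, top_k=5):
    """Top-k row indices per query; assumes rows of both inputs are unit length."""
    # One matrix product scores every query against every stored vector.
    scores = queries @ embeddings.T  # shape: (n_queries, n_vectors)
    # argpartition finds the top_k candidates per row without a full sort.
    part = np.argpartition(-scores, top_k - 1, axis=1)[:, :top_k]
    # Sort only those top_k candidates by descending score.
    part_scores = np.take_along_axis(scores, part, axis=1)
    order = np.argsort(-part_scores, axis=1)
    return np.take_along_axis(part, order, axis=1)

rng = np.random.default_rng(1)
embeddings = rng.standard_normal((500, 32)).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)
queries = embeddings[[3, 7, 11]]  # stored vectors used as queries
top = batch_top_k(embeddings, queries, top_k=5)
print(top[:, 0])
```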

Installation

# From PyPI
pip install nseekfs

# Verify installation
python -c "import nseekfs; print('NSeekFS installed successfully')"
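The installed version can also be checked from Python with the standard library's `importlib.metadata` (available on Python 3.8+). `installed_version` is a hypothetical helper name used for this sketch.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("nseekfs")
print(f"nseekfs version: {version}" if version else "nseekfs is not installed")
```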

Technical Details

  • Precision: Float32 optimized for standard ML embeddings
  • Memory: Efficient memory usage with optimized data structures
  • Performance: Rust backend with SIMD optimizations where available
  • Compatibility: Python 3.8+ on Windows, macOS, and Linux
  • Thread Safety: Safe concurrent access from multiple threads
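Given the thread-safety statement above, queries can be dispatched from a thread pool. The sketch below uses a stand-in `search` function so that it runs without an index; in real use it would wrap a call such as `index.query(query, top_k=10)`. Whether threads yield a speedup depends on whether the backend releases the GIL during a query, which is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor

def search(query):
    """Stand-in for a thread-safe search call such as index.query(query, top_k=10)."""
    return {"query": query, "best": max(query)}

queries = [[1, 5, 2], [9, 0, 3], [4, 4, 8]]

# map() preserves input order, so results[i] belongs to queries[i].
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(search, queries))

print([r["best"] for r in results])
```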

Performance Tips

# Pre-normalize vectors if using cosine similarity
embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
index = nseekfs.from_embeddings(embeddings, normalized=False)

# Use appropriate data types
embeddings = embeddings.astype(np.float32)

# Request only as many results as you need: a small top_k is cheaper than a large one
results = index.query(query, top_k=10)  # rather than top_k=1000

# Use batch processing for multiple queries
batch_results = index.query_batch(queries, top_k=10)
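The pre-normalization tip divides each row by its L2 norm, which fails on all-zero rows. The sketch below adds a small guard and checks that, for unit-length vectors, the dot product equals the cosine similarity. `normalize_rows` and its `eps` parameter are hypothetical names for this example.

```python
import numpy as np

def normalize_rows(x, eps=1e-12):
    """Scale each row to unit L2 norm; all-zero rows are left as zeros."""
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

raw = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]], dtype=np.float32)
unit = normalize_rows(raw)

# For unit-length vectors the dot product equals the cosine similarity.
cosine = float(raw[0] @ raw[2] / (np.linalg.norm(raw[0]) * np.linalg.norm(raw[2])))
print(unit[0], float(unit[0] @ unit[2]), cosine)
```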

License

MIT License - see LICENSE file for details.


Fast, exact cosine similarity search for Python.

Built with Rust for performance, designed for Python developers.
