TurboQuant (ICLR 2026) — SIMD-accelerated 4/8-bit quantization Space for ANN

turboquant-space

This library was inspired by the Google Research blog post https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/. It is optimized for a compact in-memory layout of the 3+1 and 7+1 bit quantization schemes.

SIMD-accelerated 4/8-bit vector quantization for approximate nearest neighbor search, based on TurboQuant (ICLR 2026). Standalone C++17 library with Python bindings.

from turboquant import TurboQuantSpace
import numpy as np

space = TurboQuantSpace(dim=128, bits_per_coord=8, num_threads=4)
X = np.random.randn(100_000, 128).astype(np.float32)
q = np.random.randn(128).astype(np.float32)

codes = space.encode_batch(X)              # (100_000, code_size) uint8
dists = space.distance_1_to_n(q, codes)    # (100_000,) float32

That is the whole mental model: encode once, then distance_* against the codes. No index to build, no state to persist beyond codes.


What it does, briefly

TurboQuant encodes each float32 vector into a compact code of bits_per_coord bits per coordinate. A randomized Walsh–Hadamard rotation is followed by Lloyd–Max scalar quantization, and one bit per coordinate is a QJL sign bit that provides an unbiased residual correction (hence the 3+1 and 7+1 bit schemes). Distances between a raw query and a packed code (asymmetric) or between two packed codes (symmetric) are computed directly on the quantized representation with hand-written NEON / SSE / AVX kernels.
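To make the first stage concrete, here is a minimal pure-Python sketch of a randomized Walsh–Hadamard rotation: seeded random sign flips followed by an orthonormal fast Walsh–Hadamard transform. It illustrates the general technique only and is not the library's kernel; the names fwht, randomized_rotation and l2_norm, and the use of Python's random module for the signs, are assumptions made for this example.

```python
import math
import random


def fwht(v):
    """Orthonormal fast Walsh-Hadamard transform of a power-of-two-length list."""
    n = len(v)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(v)
    h = 1
    while h < n:
        # Butterfly step: combine entries that are h apart.
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # scaling that makes the transform norm-preserving
    return [x * scale for x in out]


def randomized_rotation(v, seed=42):
    """Flip signs with a seeded RNG, then apply the orthonormal transform."""
    rng = random.Random(seed)
    signs = [rng.choice((-1.0, 1.0)) for _ in v]
    return fwht([s * x for s, x in zip(signs, v)])


def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(8)]
y = randomized_rotation(x)
print(l2_norm(x), l2_norm(y))  # the two norms agree up to rounding
```

Because the transform is orthonormal, L2 distances between rotated vectors equal those between the originals; the rotation only spreads energy across coordinates before the per-coordinate quantizer is applied.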

Concretely you get:

| bits_per_coord | layout | bytes / vec (dim=128) | compression vs fp32 |
|---|---|---|---|
| 4 | nibble-packed | 76 | 6.7× |
| 8 | one byte per coord | 140 | 3.7× |

(Byte counts include 12 bytes of metadata per code: norm, γ, σ.)
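The byte counts in the table can be reproduced with a few lines of arithmetic. The sketch below assumes the layout implied by the table: a packed payload of bits_per_coord bits per coordinate plus 12 bytes of metadata. The helper names (code_size_bytes, compression_ratio, pack_nibbles) and the low-nibble-first packing order are assumptions for illustration, not the library's actual layout.

```python
METADATA_BYTES = 12  # norm, gamma, sigma


def code_size_bytes(padded_dim, bits_per_coord):
    """Bytes per code for the two layouts listed in the table."""
    if bits_per_coord == 4:
        payload = padded_dim // 2  # two 4-bit values per byte
    elif bits_per_coord == 8:
        payload = padded_dim       # one byte per coordinate
    else:
        raise ValueError("this sketch only covers 4 and 8 bits")
    return payload + METADATA_BYTES


def compression_ratio(padded_dim, bits_per_coord):
    """Size of the fp32 vector divided by the size of its code."""
    return (4 * padded_dim) / code_size_bytes(padded_dim, bits_per_coord)


def pack_nibbles(values):
    """Pack an even-length sequence of 4-bit ints into bytes, low nibble first."""
    if len(values) % 2:
        raise ValueError("need an even number of values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("values must fit in 4 bits")
    return bytes(values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2))


for bits in (4, 8):
    print(bits, code_size_bytes(128, bits), round(compression_ratio(128, bits), 1))
```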

What it is not: not a graph index, not an IVF, not a drop-in replacement for FAISS. It is the distance layer. Plug it into your own index, or use distance_1_to_n as brute-force search on batches up to a few million.
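Brute-force search on top of a distance array reduces to a top-k selection. The sketch below is plain Python and works on any sequence of distances, such as the output of distance_1_to_n; the helper name top_k_indices is hypothetical and not part of the library.

```python
import heapq


def top_k_indices(dists, k):
    """Return the indices of the k smallest distances, nearest first."""
    if k <= 0:
        return []
    # nsmallest avoids sorting the full sequence when k is small.
    best = heapq.nsmallest(k, enumerate(dists), key=lambda pair: pair[1])
    return [index for index, _ in best]


dists = [0.9, 0.1, 0.5, 0.3]  # stand-in for the output of distance_1_to_n
print(top_k_indices(dists, 2))
```

With NumPy arrays and k smaller than the number of codes, np.argpartition(dists, k)[:k] yields the same set of indices (in no particular order) without a full sort.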


Install

pip install turboquant-space

Prebuilt wheels are published for CPython 3.11–3.13 on Linux (x86_64, aarch64), macOS (x86_64, arm64), and Windows (AMD64). They target a conservative CPU baseline, x86-64-v3 (AVX2 + FMA + BMI2) on x64 and armv8-a (NEON) on arm64, so a single wheel runs on most machines produced in the last ~8 years. A C++ compiler is not required for this path.
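If you are unsure whether an older x86-64 Linux machine meets that baseline, a rough check is to look for the three features named above in /proc/cpuinfo. This is a sketch under stated assumptions: it is Linux-only, the helper names are hypothetical, and it checks only avx2, fma and bmi2, whereas the full x86-64-v3 level includes further features.

```python
REQUIRED_X86_FLAGS = {"avx2", "fma", "bmi2"}  # only the features named above


def cpu_flags_from_cpuinfo(text):
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def meets_wheel_baseline(text):
    """True if every required flag appears in the cpuinfo text."""
    return REQUIRED_X86_FLAGS <= cpu_flags_from_cpuinfo(text)


# Sample text for illustration; on Linux, read the real file instead:
#   with open("/proc/cpuinfo") as f: text = f.read()
sample = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 fma bmi1 bmi2\n"
print(meets_wheel_baseline(sample))
```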

Build from source for maximum performance

The prebuilt wheels trade a few percent for portability. If you have a C++ compiler and want the binary tuned to your CPU (AVX-512 on Zen4 / Ice Lake, SVE on Graviton, etc.), force pip to skip the wheel and compile from sdist:

pip install turboquant-space --no-binary turboquant-space

This invokes CMake with -march=native, so every available instruction set on the build machine is enabled. Requires CMake ≥ 3.18 and a C++17 compiler; on macOS also brew install libomp for multi-threaded batch ops.

From a git checkout

git clone https://github.com/ilyajob05/turboquant-space
cd turboquant-space
uv sync                       # or: pip install -e .

Same story: local builds use -march=native by default. Pass -DTURBOQUANT_PORTABLE=ON to CMake if you need a portable baseline instead.


API

Everything lives on a single class, TurboQuantSpace. All numpy arrays are float32, C-contiguous; all codes are uint8.

TurboQuantSpace(
    dim: int,                    # input dimensionality (any positive integer)
    bits_per_coord: int = 4,     # 2..9 — nibble-packed for bits<=4
    rot_seed: int = 42,          # Hadamard rotation seed
    qjl_seed: int = 137,         # QJL sign seed
    num_threads: int = 0,        # 0 = use OMP_NUM_THREADS / all cores
)

| method | shape in | shape out |
|---|---|---|
| encode(x) | (dim,) | (code_size_bytes,) uint8 |
| encode_batch(X) | (n, dim) | (n, code_size_bytes) uint8 |
| encode_into(x, out) / encode_batch_into | in-place into caller buffer | |
| distance(query, code) | (dim,), (code_size,) | float |
| distance_symmetric(code_a, code_b) | (code_size,) ×2 | float |
| distance_1_to_n(q, codes) | (dim,), (n, code_size) | (n,) float32 |
| distance_m_to_n(Q, codes) | (m, dim), (n, code_size) | (m, n) float32 |
| distance_m_to_n_symmetric(codes_a, b) | (m, cs), (n, cs) | (m, n) float32 |

Accessors: dim(), padded_dim(), padded(), num_threads(), code_size_bytes(), bits_per_coord().

Dimensionality padding

Internally every operation works in a power-of-two dimension (a requirement of the Walsh–Hadamard transform). If you pass dim=100, the space rounds up to 128 and zero-pads on the fly; a one-time warning is printed, and space.padded_dim() reports the internal size. Correctness is preserved — zero-padding in ℝᵈ does not change L2 distances — but encode/query cost is determined by padded_dim(), not dim().
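The padding rule and the claim that zero-padding leaves L2 distances unchanged can be checked with a short pure-Python sketch. The helper names below are hypothetical and only mirror the behaviour described above; they are not the library's functions.

```python
import math


def next_power_of_two(n):
    """Smallest power of two that is >= n (n must be positive)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def zero_pad(v, padded_dim):
    """Append zeros so the vector has padded_dim entries."""
    return list(v) + [0.0] * (padded_dim - len(v))


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [float(i) for i in range(100)]
b = [float(i % 7) for i in range(100)]
padded_dim = next_power_of_two(len(a))  # 100 rounds up to 128

before = l2_distance(a, b)
after = l2_distance(zero_pad(a, padded_dim), zero_pad(b, padded_dim))
print(padded_dim, before, after)  # the padded coordinates contribute 0 to the sum
```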

Threading

All batch methods (encode_batch, distance_1_to_n, distance_m_to_n, distance_m_to_n_symmetric) parallelize the outer loop with OpenMP, schedule(static), so each thread owns a contiguous range of codes — prefetcher-friendly, no false sharing on output rows. Set num_threads in the constructor, or leave it 0 to respect OMP_NUM_THREADS. For small batches (≤ 64) execution stays single-threaded to avoid fork/join overhead.
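As an illustration of what a static schedule means for the output rows, the sketch below splits a batch into contiguous, near-equal ranges, one per thread, and keeps small batches on a single thread. The function names are hypothetical, and the exact chunk sizes chosen by a real OpenMP runtime may differ; only the 64-item threshold is taken from the text above.

```python
def static_ranges(n_items, n_threads):
    """Split range(n_items) into at most n_threads contiguous, near-equal ranges."""
    if n_threads <= 0:
        raise ValueError("n_threads must be positive")
    base, extra = divmod(n_items, n_threads)
    ranges, start = [], 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)  # first `extra` threads get one more
        if size == 0:
            break
        ranges.append((start, start + size))
        start += size
    return ranges


def plan(n_items, n_threads, min_parallel=64):
    """Stay single-threaded for batches of at most min_parallel items."""
    if n_items <= min_parallel:
        return [(0, n_items)] if n_items else []
    return static_ranges(n_items, n_threads)


print(static_ranges(10, 4))
print(plan(64, 8))
print(plan(50_000, 4))
```

Each thread writes only to its own half-open range of rows, which is why neighbouring threads do not contend for the same output rows.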

Observed scaling on Apple M-series, dim=512, 50k codes × 128 queries, bits=8: 1→2 = 1.94×, 1→4 = 3.49×, 1→8 = 4.50× — see python/benchmarks/ for the full reproduction.
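The speedups quoted above translate into parallel efficiency (speedup divided by thread count) as follows. This is plain arithmetic on the numbers in the text; the variable and function names are chosen for the example.

```python
# threads -> speedup over one thread, as quoted above
speedups = {2: 1.94, 4: 3.49, 8: 4.50}


def parallel_efficiency(speedup, threads):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / threads


efficiency = {t: parallel_efficiency(s, t) for t, s in speedups.items()}
for threads, value in efficiency.items():
    print(threads, f"{value:.0%}")
```

Efficiency stays near 97% at 2 threads and about 87% at 4, then falls to roughly 56% at 8.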


Benchmarks

uv run python python/benchmarks/run_benchmark.py

On first run this downloads SIFT1M (~170 MB) to ~/.cache/turboquant/sift/; subsequent runs reuse the cache. The script sweeps bits_per_coord × num_threads on SIFT1M (with recall@{1,10,100} against the shipped ground truth) and on synthetic Gaussian data across several dimensions, writes python/benchmarks/results/results_<timestamp>.csv, and produces seaborn plots under results/plots/:

  • threading_scaling.png — M-to-N throughput vs num_threads, faceted by dim.
  • sift_recall.png — recall@{1,10,100} vs bits on SIFT1M.
  • synthetic_throughput.png — encode / 1-to-N / M-to-N vs dim.

Useful flags: --skip-sift, --skip-synthetic, --threads 1,4,8, --bits 4,8, --no-show (for headless CI).
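For readers unfamiliar with the metric, here is a minimal sketch of one common definition of recall@k: the fraction of the true k nearest neighbours that appear among the k retrieved ids, averaged over queries. The benchmark script may use a different definition (for example, whether the single true nearest neighbour appears in the top k), and the function name and toy data below are assumptions for illustration.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Mean over queries of |top-k retrieved ∩ top-k true| / k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not retrieved or len(retrieved) != len(ground_truth):
        raise ValueError("need the same non-zero number of queries in both lists")
    total = 0.0
    for got, true in zip(retrieved, ground_truth):
        total += len(set(got[:k]) & set(true[:k])) / k
    return total / len(retrieved)


# Toy ids for two queries, nearest first.
retrieved = [[1, 2, 3], [4, 5, 6]]
ground_truth = [[1, 3, 9], [7, 8, 4]]
print(recall_at_k(retrieved, ground_truth, 1))
print(recall_at_k(retrieved, ground_truth, 3))
```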

Measured numbers from real hardware (Apple M3 and more as they come in) live in docs/benchmarks.md. Headline from M3, dim=128, batch=10000, bits=8: ~88M symmetric M-to-N ops/sec and ~2.8M encode/sec on a single laptop.


Layout and build

include/turboquant/
  turbo_quant.h          # Hadamard, Lloyd–Max, TurboQuantCode
  space_turbo_quant.h    # TurboQuantSpace + SIMD distance kernels
python/turboquant/
  bindings.cpp           # pybind11 bindings
  __init__.py
python/tests/            # pytest suite
python/benchmarks/       # run_benchmark.py (CSV + seaborn plots)
CMakeLists.txt           # scikit-build-core entry point
pyproject.toml

The library is header-only in spirit — all algorithmic code is in include/turboquant/. Only the Python module (bindings.cpp) is compiled as a shared object. A C++ consumer can depend on the headers alone and call the same API directly.

Build flags worth knowing:

  • -DTURBOQUANT_HAVE_OPENMP — set by CMake when OpenMP is detected; enables all #pragma omp blocks. Absent → sequential fallback, same API.
  • Release build uses -O3 -ffast-math -fno-finite-math-only. The -fno-finite-math-only flag is intentional: it keeps inf/nan handling sane while preserving vectorization.

Tests

uv run pytest python/tests/ -v

Covers asymmetric/symmetric distances across bits ∈ {4, 8} and dim ∈ {32..4096}, batch variants, zero-copy torch interop, and padding correctness.


Roadmap

The immediate priorities, in order:

  1. Publish wheels to PyPI (cibuildwheel workflow in place; awaiting first tagged release)

Contributions welcome. The codebase is small (two headers, one bindings file, ~2k lines) and deliberately kept that way — if a change makes it harder to read, that is a reason to push back on it.


License

MIT. See LICENSE.

Citation

If you use this library in academic work, please cite the original TurboQuant paper (ICLR 2026) in addition to this repository.
