
FastIVF

Efficient implementation of IVF and IVFPQ indices with numpy and numba

Installation

  • Install numpy and numba from conda to use the Intel MKL libraries for linear algebra operations
  • To install the package from source, run pip install .
  • To install from PyPI, run pip install fast-ivf
  • You may need to install tensorflow>=2.13; see CompressedFastIVF for details
  • Code tested with python==3.11
  • See the test-index notebook for index usage examples
  • See the test-kmeans notebook for k-means usage examples

Features / limitations

  • This is experimental code that relies heavily on numba and numpy and may contain bugs
  • IVF centroids are estimated with a custom mini-batch k-means implementation
    • MiniBatchKMeans is used to estimate the centroids of the standard inverted index
    • SubvectorsMiniBatchKMeans is used to estimate the centroids of the product quantization index
  • The k-means implementations support only L2 or cosine distance
  • All indices currently support only cosine distance

Results on custom benchmark data

  • Resources were restricted with OMP_NUM_THREADS=MKL_NUM_THREADS=OPENBLAS_NUM_THREADS=12, which in our case resulted in 100% CPU utilization for both fast-ivf and faiss
  • Train vectors: ~900k internal vectors of dim=1024, normalized to unit length
  • Test vectors: 40k vectors from the same source
  • Hyperparams: nprobe=10, ratio_threshold=0.5, no re-scoring for the approximated indices (for mini-batch k-means we use the repository defaults); for CompressedFastIVF we use compression_ndim=128, which gives an 8x compression ratio
  • We measure recall@10: for each test query we check whether the exact nearest neighbor exact_i is in top_indices[:10], then average over all test vectors (see the sketch after the results table)
  • For faiss we used matching parameters for nlist, m, nbits, etc.
  • Reported time is the average of 5 runs divided by 40k, i.e. the time per single query
  • As we use numba internally, each fast-ivf index is initialized with a warmup call to compile the code
  • Note: CompressedFastIVF requires training a small neural network to compress embeddings to a lower dimensionality, which increases the index build time
  • For both libraries, each search() call processed all 40k test vectors at once to fully utilize vectorization
| Index             | Recall@10 | Query Time (ms) | Params                                                                                 |
| ----------------- | --------- | --------------- | -------------------------------------------------------------------------------------- |
| FastIVF           | 0.964     | 0.100           | nlist=1024, nprobe=10, ratio_threshold=0.5                                              |
| Faiss IVF         | 0.968     | 1.000           | nlist=1024, nprobe=10                                                                   |
| FastIVFPQ         | 0.802     | 0.100           | nlist=1024, nprobe=10, ratio_threshold=0.5, pq_num_subvectors=32, pq_num_centroids=128  |
| Faiss IVFPQ       | 0.864     | 0.220           | nlist=1024, nprobe=10, m=32, nbits=7                                                    |
| CompressedFastIVF | 0.933     | 0.050           | nlist=1024, nprobe=10, ratio_threshold=0.5, compression_ndim=128                        |
| CompressedFastIVF | 0.889     | 0.040           | nlist=1024, nprobe=10, ratio_threshold=0.5, compression_ndim=64                         |
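
For reference, a minimal sketch of the recall@10 computation described above (the function and argument names are illustrative):

import numpy as np

def recall_at_10(exact: np.ndarray, top_indices: np.ndarray) -> float:
    # exact: (num_queries,) id of the exact nearest neighbor per query
    # top_indices: (num_queries, k) approximate search results with k >= 10
    hits = [exact_i in top_i[:10] for exact_i, top_i in zip(exact, top_indices)]
    return float(np.mean(hits))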

Custom mini batch k-means implementation

An efficient mini-batch k-means implementation with numba and numpy:

from fast_ivf.kmeans import MiniBatchKMeans
import numpy as np

kmeans = MiniBatchKMeans(num_centroids=16, batch_size=32, metric="l2")
data = np.random.rand(5000, 64)
kmeans.train(data)
kmeans.add(data)
labels = kmeans.predict(data)

An efficient mini-batch k-means implementation for training product quantization centroids:

from fast_ivf.kmeans import SubvectorsMiniBatchKMeans
import numpy as np

kmeans = SubvectorsMiniBatchKMeans(num_centroids=16, num_subvectors=8, batch_size=32, metric="l2")
data = np.random.rand(5000, 64)
kmeans.train(data)
kmeans.add(data)
labels = kmeans.predict(data)

FastIVF

Similar to faiss.IndexIVFFlat(faiss.IndexFlatIP(d), d, nlist, faiss.METRIC_INNER_PRODUCT)

from fast_ivf import FastIVF
from fast_ivf.core import normalize
import numpy as np

nlist = 1024
train_embeddings = normalize(np.random.rand(10000, 512).astype(np.float32))
index = FastIVF(512, nlist=nlist)
index.train(train_embeddings)

index.nprobe = 10
# greedily skip Voronoi cells whose score is below 0.5 of the largest cell score
# higher values lead to faster but less accurate search
index.ratio_threshold = 0.5

test_embeddings = normalize(np.random.rand(100, 512).astype(np.float32))
distances, indices = index.search(test_embeddings, k=100)
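
For intuition, here is a minimal sketch of this pruning rule, assuming cosine scores of a unit-norm query against unit-norm cell centroids (illustrative only, not the library's internal code):

import numpy as np

def select_cells(query: np.ndarray, centroids: np.ndarray,
                 nprobe: int = 10, ratio_threshold: float = 0.5) -> np.ndarray:
    # cosine scores of a unit-norm query against unit-norm cell centroids
    scores = centroids @ query
    # keep the nprobe best-scoring cells
    top = np.argsort(-scores)[:nprobe]
    # greedily drop cells scoring below ratio_threshold * best score
    return top[scores[top] >= ratio_threshold * scores[top[0]]]

Higher ratio_threshold values prune more cells, trading recall for speed.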

FastIVFPQ

Similar to faiss.IndexIVFPQ(faiss.IndexFlatIP(d), d, nlist, m, n_bits)

from fast_ivf import FastIVFPQ

nlist = 1024
# pq_num_centroids = 2 ** n_bits
# pq_num_subvectors = m
index = FastIVFPQ(512, nlist=nlist, pq_num_centroids=64, pq_num_subvectors=32)
index.train(train_embeddings)
index.nprobe = 10
index.ratio_threshold = 0.5
distances, indices = index.search(test_embeddings, k=100)

# compute exact scores for top 100 results, this is slower but more accurate
distances, indices = index.search(test_embeddings, k=100, rescore=True)

# calibrate scores by fitting a linear regression model to N=20 exactly computed scores; if set to -1, all scores are computed exactly
index.rescore_num_samples = 20
distances, indices = index.search(test_embeddings, k=100, rescore=True)
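
To make the parameter mapping concrete, here is a minimal sketch of product quantization encoding (pq_encode and codebooks are illustrative names): each vector is split into pq_num_subvectors subvectors, and each subvector is replaced by the id of its nearest codebook centroid, so e.g. a 512-dim float vector is stored as just 32 small codes.

import numpy as np

def pq_encode(x: np.ndarray, codebooks: np.ndarray) -> np.ndarray:
    # x: (d,) vector; codebooks: (m, num_centroids, d // m) per-subspace centroids
    m, _, sub_d = codebooks.shape
    subvectors = x.reshape(m, sub_d)
    # for each subspace, pick the nearest centroid by L2 distance
    dists = np.linalg.norm(codebooks - subvectors[:, None, :], axis=-1)
    return np.argmin(dists, axis=1)  # (m,) array of centroid ids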

CompressedFastIVF

Trains a keras autoencoder to compress embeddings to a lower dimensionality:

from fast_ivf import CompressedFastIVF

nlist = 1024
index = CompressedFastIVF(512, nlist=nlist, compression_ndim=128)
index.train(train_embeddings)
index.nprobe = 10
index.ratio_threshold = 0.5
distances, indices = index.search(test_embeddings, k=100)

# compute exact scores for top 100 results, this is slower but more accurate
distances, indices = index.search(test_embeddings, k=100, rescore=True)
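
For reference, a minimal sketch of the kind of compression autoencoder described above (layer sizes and training settings are hypothetical; the actual architecture used by CompressedFastIVF may differ):

import tensorflow as tf

d, compression_ndim = 512, 128
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(d,)),
    tf.keras.layers.Dense(compression_ndim),
])
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(compression_ndim,)),
    tf.keras.layers.Dense(d),
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
# train to reconstruct the embeddings, then index encoder(x) instead of x
# autoencoder.fit(train_embeddings, train_embeddings, epochs=10, batch_size=256)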
