Pure Python implementation of product quantization for nearest neighbor search
Nano Product Quantization (nanopq): a vanilla implementation of Product Quantization (PQ) and Optimized Product Quantization (OPQ), written in pure Python with no third-party dependencies other than NumPy and SciPy.
You can install the package via pip. This library works with Python 3.5+ on Linux.
pip install nanopq
import nanopq
import numpy as np

N, D = 10000, 128
X = np.random.random((N, D)).astype(np.float32)  # 10,000 128-dim vectors
query = np.random.random((D,)).astype(np.float32)  # a 128-dim vector

# Instantiate with M=8 sub-spaces
pq = nanopq.PQ(M=8)

# Train with the top 1000 vectors
pq.fit(X[:1000])

# Encode to PQ-codes
X_code = pq.encode(X)  # (10000, 8) with dtype=np.uint8

# Results: create a distance table online, and compute Asymmetric Distance to each PQ-code
dists = pq.dtable(query).adist(X_code)
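To make the last step less opaque, here is a minimal NumPy sketch of what a distance table and an asymmetric distance computation look like conceptually. It is not nanopq's actual code: the helper names (`encode`, `distance_table`, `asymmetric_distances`) are hypothetical, and the codebooks are random rather than trained with k-means, which is enough to illustrate the lookups.

```python
import numpy as np

rng = np.random.default_rng(0)
M, Ks, D = 4, 16, 32   # sub-spaces, codewords per sub-space, vector dimension
Ds = D // M            # dimension of each sub-vector

# Toy codebooks of shape (M, Ks, Ds). A real PQ would learn these with k-means;
# random codewords are an assumption made only to keep the sketch short.
codewords = rng.random((M, Ks, Ds), dtype=np.float32)

def encode(X):
    """Map each row of X to M codeword indices, one per sub-space."""
    codes = np.empty((X.shape[0], M), dtype=np.uint8)
    for m in range(M):
        sub = X[:, m * Ds:(m + 1) * Ds]                                  # (N, Ds)
        d = ((sub[:, None, :] - codewords[m][None]) ** 2).sum(axis=2)    # (N, Ks)
        codes[:, m] = d.argmin(axis=1)
    return codes

def distance_table(query):
    """Squared distances from each query sub-vector to every codeword: (M, Ks)."""
    q = query.reshape(M, 1, Ds)
    return ((q - codewords) ** 2).sum(axis=2)

def asymmetric_distances(table, codes):
    """For each code, sum the M table entries it selects: (N,)."""
    return table[np.arange(M), codes].sum(axis=1)

X = rng.random((1000, D), dtype=np.float32)
query = rng.random(D, dtype=np.float32)

codes = encode(X)                                    # (1000, M) uint8 codes
dists = asymmetric_distances(distance_table(query), codes)
nearest = int(np.argmin(dists))                      # index of the approximate nearest neighbor
```

The table is built once per query, so each database vector costs only M lookups and additions instead of a full D-dimensional distance. The resulting value equals the squared Euclidean distance between the raw query and the vector reconstructed from the code, which is why the distance is called asymmetric.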
- H. Jegou, M. Douze, and C. Schmid, "Product Quantization for Nearest Neighbor Search", IEEE TPAMI 2011 (the original paper of PQ)
- T. Ge, K. He, Q. Ke, and J. Sun, "Optimized Product Quantization", IEEE TPAMI 2014 (the original paper of OPQ)
- Y. Matsui, Y. Uchida, H. Jegou, and S. Satoh, "A Survey of Product Quantization", ITE MTA 2018 (a survey paper of PQ)
- PQ in Faiss (Faiss contains an optimized implementation of PQ, which differs from ours)
- Rayuela.jl (Julia implementation of several encoding algorithms including PQ and OPQ)
- PQk-means (clustering on PQ-codes. The implementation of nanopq is compatible with that of PQk-means)