Pure Python implementation of product quantization for nearest neighbor search
nanopq
Nano Product Quantization (nanopq): a vanilla implementation of Product Quantization (PQ) and Optimized Product Quantization (OPQ), written in pure Python on top of NumPy with no compiled extensions of its own.
Installing
You can install the package via pip. This library works with Python 3.5+ on Linux.
pip install nanopq
Documentation
Example
import nanopq
import numpy as np
N, Nt, D = 10000, 2000, 128
X = np.random.random((N, D)).astype(np.float32) # 10,000 128-dim vectors to be indexed
Xt = np.random.random((Nt, D)).astype(np.float32) # 2,000 128-dim vectors for training
query = np.random.random((D,)).astype(np.float32) # a 128-dim query vector
# Instantiate with M=8 sub-spaces
pq = nanopq.PQ(M=8)
# Train codewords
pq.fit(Xt)
# Encode to PQ-codes
X_code = pq.encode(X) # (10000, 8) with dtype=np.uint8
# Results: create a distance table online, and compute Asymmetric Distance to each PQ-code
dists = pq.dtable(query).adist(X_code) # (10000, )
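To make the last step more concrete, here is a toy sketch of asymmetric distance computation using only the standard library. It is not nanopq's implementation: the codewords are hand-picked rather than trained, the sizes are tiny (M=2 sub-spaces of 2 dimensions each, 2 codewords per sub-space), and the helper names `encode`, `distance_table`, and `adist` are hypothetical stand-ins chosen to mirror the example above.

```python
# Toy asymmetric distance computation (ADC), standard library only.
# Assumption: codewords are hand-picked for illustration, not trained.

M = 2   # number of sub-spaces
Ds = 2  # dimensions per sub-space, so D = M * Ds = 4

# codewords[m][k] is the k-th codeword of sub-space m
codewords = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec):
    """Cut a D-dim vector into M consecutive sub-vectors of Ds dims."""
    return [vec[m * Ds:(m + 1) * Ds] for m in range(M)]


def encode(vec):
    """Replace each sub-vector by the index of its nearest codeword."""
    return [
        min(range(len(codewords[m])),
            key=lambda k: sq_dist(sub, codewords[m][k]))
        for m, sub in enumerate(split(vec))
    ]


def distance_table(query):
    """table[m][k]: squared distance from the query's m-th sub-vector
    to the k-th codeword of sub-space m. Built once per query."""
    return [
        [sq_dist(sub, cw) for cw in codewords[m]]
        for m, sub in enumerate(split(query))
    ]


def adist(table, codes):
    """Approximate squared distance to each code: M table lookups, summed."""
    return [sum(table[m][code[m]] for m in range(M)) for code in codes]


database = [[0.1, 0.0, 0.9, 0.1], [0.9, 1.0, 0.1, 0.8]]
codes = [encode(v) for v in database]
query = [0.0, 0.0, 1.0, 0.0]
dists = adist(distance_table(query), codes)
print(codes)  # [[0, 1], [1, 0]]
print(dists)  # [0.0, 4.0]
```

The query is never quantized, only the database vectors are, which is why the distance is called asymmetric. Once the table is built, scoring each code costs M lookups and additions regardless of the original dimensionality.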
Author
Contributors
- @Hiroshiba fixed an importlib bug (#3)
- @calvinmccarter implemented parametric initialization for OPQ (#14)
- @de9uch1 extended the interface to Faiss so that OPQ can be handled (#19)
- @mpskex implemented (1) initialization of clustering and (2) dot-product for computation (#24)
- @lsb fixed a typo (#26)
Reference
- H. Jegou, M. Douze, and C. Schmid, "Product Quantization for Nearest Neighbor Search", IEEE TPAMI 2011 (the original paper of PQ)
- T. Ge, K. He, Q. Ke, and J. Sun, "Optimized Product Quantization", IEEE TPAMI 2014 (the original paper of OPQ)
- Y. Matsui, Y. Uchida, H. Jegou, and S. Satoh, "A Survey of Product Quantization", ITE MTA 2018 (a survey paper of PQ)
- PQ in Faiss (Faiss contains an optimized implementation of PQ that differs from ours)
- Rayuela.jl (Julia implementation of several encoding algorithms including PQ and OPQ)
- PQk-means (clustering on PQ-codes. The implementation of nanopq is compatible with that of PQk-means)
- Rii (IVFPQ-based ANN algorithm using nanopq)
- Product quantization in Faiss and from scratch (Related tutorial)