Fast and memory-efficient ANN with a subset-search functionality
Project description
Rii
Reconfigurable Inverted Index (Rii): fast and memory efficient approximate nearest neighbor search method with a subset-search functionality.
Reference:
- Y. Matsui, R. Hinami, and S. Satoh, "Reconfigurable Inverted Index", ACM Multimedia 2018 (oral). [paper] [supplementary] [project]
Summary of features
- Fast and memory efficient ANN. Can handle billion-scale data on memory at once. The search is less than 10 ms.
- Can run the search over a subset of the whole database
- Remain fast even after a large number of vectors are newly added (i.e., the data structure can be reconfigured)
Installing
You can install the package via pip. This library works with Python 3.5+ on linux.
pip install rii
Documentation
Example
Basic ANN
import rii
import numpy as np
N, D = 10000, 128
X = np.random.random((N, D)).astype(np.float32) # 10,000 128-dim vectors
q = np.random.random((D,)).astype(np.float32) # a 128-dim vector
# Instantiate with M=32 sub-spaces
e = rii.Rii(M=32)
# Train with the top 1000 vectors
e.fit(vecs=X[:1000])
# Add vectors
e.add_reconfigure(vecs=X)
# Search
ids, dists = e.query(q=q, topk=5)
print(ids, dists) # e.g., [1965 951 1079] [11.77537251 13.04397392 13.06065941]
Subset search
# The search can be conducted over a subset of the database
target_ids = np.array([85, 132, 236, 551, 694, 728, 992, 1234]) # Specified by IDs
ids, dists = e.query(q=q, topk=3, target_ids=target_ids)
print(ids, dists) # e.g., [728 85 132] [14.80522156 15.92787838 16.28690338]
Reconfigure
# Add new vectors
X2 = np.random.random((1000, D)).astype(np.float32)
e.add(vecs=X2) # Now N is 11000
e.query(q=q) # Ok. (0.12 msec / query)
# However, if you add quite a lot of vectors, the search might become slower
# because the data structure has been optimized for the initial item size (N=10000)
X3 = np.random.random((1000000, D)).astype(np.float32)
e.add(vecs=X3) # A lot. Now N is 1011000
e.query(q=q) # Slower (0.96 msec/query)
# In such case, run the reconfigure function. That updates the data structure
e.reconfigure()
e.query(q=q) # Ok. (0.21 msec / query)
Author
Citation
@inproceedings{rii,
author = {Yusuke Matsui and Ryota Hinami and Shin'ichi Satoh},
title = {Reconfigurable Inverted Index},
booktitle = {ACM International Conference on Multimedia (ACMMM)},
year = {2018},
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
rii-0.1.0.tar.gz
(25.7 kB
view hashes)