Fast and memory-efficient clustering
Project description
PQk-means [Matsui, Ogaki, Yamasaki, and Aizawa, ACMMM 17] is a Python library for efficient clustering of large-scale data. By first compressing input vectors into short product-quantized (PQ) codes, PQk-means achieves fast and memory-efficient clustering, even for high-dimensional vectors. Like k-means, PQk-means alternates between an assignment step and an update step, both of which can be performed in the PQ-code domain. For comparison, the library also provides ITQ encoding for converting vectors to binary codes and Binary k-means [Gong+, CVPR 15] for clustering those binary codes. The main algorithm is written in C++, with wrappers for Python. All encoding and clustering code is compatible with scikit-learn.
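To make the idea of working "in the PQ-code domain" concrete, here is a minimal pure-Python sketch of product-quantization encoding followed by a k-means-style assignment step that compares codes rather than raw vectors. It does not use the pqkmeans API and is not the library's implementation. The function names (`pq_encode`, `pq_code_distance`, `assign`) are hypothetical, and the toy codebooks and center codes are hand-written assumptions rather than values learned from data. The update step is omitted.

```python
from typing import List, Sequence

Codebook = List[List[float]]  # codewords for one subspace


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector: Sequence[float], codebooks: List[Codebook]) -> List[int]:
    """Split the vector into len(codebooks) sub-vectors and return, for each
    one, the index of the nearest codeword in that subspace's codebook."""
    sub_dim = len(vector) // len(codebooks)
    code = []
    for m, codebook in enumerate(codebooks):
        sub_vector = vector[m * sub_dim:(m + 1) * sub_dim]
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(sub_vector, codebook[k]),
        )
        code.append(nearest)
    return code


def pq_code_distance(code_a: Sequence[int], code_b: Sequence[int],
                     codebooks: List[Codebook]) -> float:
    """Approximate squared distance between two PQ codes: the sum, over
    subspaces, of the squared distance between the referenced codewords."""
    return sum(
        squared_distance(codebook[i], codebook[j])
        for codebook, i, j in zip(codebooks, code_a, code_b)
    )


def assign(codes: List[List[int]], center_codes: List[List[int]],
           codebooks: List[Codebook]) -> List[int]:
    """Assignment step: label each code with its nearest center, where the
    centers are themselves PQ codes."""
    return [
        min(
            range(len(center_codes)),
            key=lambda c: pq_code_distance(code, center_codes[c], codebooks),
        )
        for code in codes
    ]


# Toy setup (assumed, not trained): 4-D vectors, 2 subspaces, 2 codewords each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
vectors = [
    [0.1, 0.0, 0.9, 1.0],
    [0.9, 1.1, 0.1, 0.0],
    [0.0, 0.2, 1.0, 0.8],
]
center_codes = [[0, 1], [1, 0]]  # hypothetical cluster centers as PQ codes

codes = [pq_encode(v, codebooks) for v in vectors]
labels = assign(codes, center_codes, codebooks)
print(codes)   # [[0, 1], [1, 0], [0, 1]]
print(labels)  # [0, 1, 0]
```

Each 4-D vector is stored as just two small integers, which is where the memory savings come from. Because a code-to-code distance depends only on pairs of codeword indices, those per-subspace distances could be precomputed once and looked up, instead of being recomputed as this sketch does.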
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file pqkmeans-1.0.6.tar.gz.
File metadata
- Download URL: pqkmeans-1.0.6.tar.gz
- Upload date:
- Size: 224.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | 17bd0ba0b03b01d37df67dc55fa44272a85a135e0326f15c932a8ffd20f8ca90
MD5 | c6254b03fc16efdd0793e0fed475b3e9
BLAKE2b-256 | d17f50e7590db705f7cc3612229fa5c403dd116225355b8ef1c8c2c30f8ba3a5