Fast and memory-efficient clustering

Project description

PQk-means [Matsui, Ogaki, Yamasaki, and Aizawa, ACMMM 17] is a Python library for efficient clustering of large-scale data. By first compressing input vectors into short product-quantized (PQ) codes, PQk-means achieves fast and memory-efficient clustering, even for high-dimensional vectors. Like k-means, PQk-means alternates between an assignment step and an update step, both of which can be performed in the PQ-code domain. For comparison, the library also provides ITQ encoding for binary conversion and Binary k-means [Gong+, CVPR 15] for clustering binary codes. The main algorithm is written in C++, with wrappers for Python. All encoding and clustering code is compatible with scikit-learn.
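To make the idea of working "in the PQ-code domain" concrete, here is a minimal pure-Python sketch of product quantization and a lookup-table assignment step. It is an illustration under stated assumptions, not the library's implementation: the helper names (`encode`, `build_tables`, `symmetric_dist`, `assign`) are hypothetical, the codebooks are hand-written toy values rather than trained ones, and the update step is omitted.

```python
# Toy sketch of product quantization (PQ) and assignment in the PQ-code domain.
# All names and values here are illustrative assumptions, not the pqkmeans API.

def split(vec, num_sub):
    """Split a vector into num_sub equal-length subvectors."""
    d = len(vec) // num_sub
    return [vec[m * d:(m + 1) * d] for m in range(num_sub)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vec, codebooks):
    """Replace each subvector by the index of its nearest codeword."""
    subs = split(vec, len(codebooks))
    return tuple(
        min(range(len(cb)), key=lambda k: sq_dist(sub, cb[k]))
        for sub, cb in zip(subs, codebooks)
    )

def build_tables(codebooks):
    """tables[m][i][j] = squared distance between codewords i and j in subspace m."""
    return [[[sq_dist(ci, cj) for cj in cb] for ci in cb] for cb in codebooks]

def symmetric_dist(code_a, code_b, tables):
    """Approximate distance between two PQ codes using only table lookups."""
    return sum(tables[m][i][j] for m, (i, j) in enumerate(zip(code_a, code_b)))

def assign(codes, center_codes, tables):
    """Assignment step: index of the nearest center code for every input code."""
    return [
        min(range(len(center_codes)),
            key=lambda c: symmetric_dist(code, center_codes[c], tables))
        for code in codes
    ]

# Toy setup: 4-dimensional vectors, 2 subspaces, 2 codewords per subspace.
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],  # subspace 0
    [(0.0, 0.0), (1.0, 1.0)],  # subspace 1
]
vectors = [[0.1, 0.0, 0.9, 1.0], [0.9, 1.1, 0.1, 0.0]]

codes = [encode(v, codebooks) for v in vectors]
tables = build_tables(codebooks)
center_codes = [(0, 1), (1, 0)]  # cluster centers are themselves PQ codes
labels = assign(codes, center_codes, tables)
```

Each vector is stored as a short tuple of codeword indices, and distances between codes reduce to summing precomputed table entries, which is where the speed and memory savings come from.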

Download files

Source Distribution

pqkmeans-1.0.6.tar.gz (224.3 kB)

File details

Details for the file pqkmeans-1.0.6.tar.gz.

File metadata

  • Download URL: pqkmeans-1.0.6.tar.gz
  • Upload date:
  • Size: 224.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for pqkmeans-1.0.6.tar.gz
Algorithm    Hash digest
SHA256       17bd0ba0b03b01d37df67dc55fa44272a85a135e0326f15c932a8ffd20f8ca90
MD5          c6254b03fc16efdd0793e0fed475b3e9
BLAKE2b-256  d17f50e7590db705f7cc3612229fa5c403dd116225355b8ef1c8c2c30f8ba3a5
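
One way to check a downloaded archive against the SHA256 digest listed above is sketched below, using only Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed on this page for pqkmeans-1.0.6.tar.gz.
EXPECTED_SHA256 = "17bd0ba0b03b01d37df67dc55fa44272a85a135e0326f15c932a8ffd20f8ca90"

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("pqkmeans-1.0.6.tar.gz")` should return `True` only if the local file is byte-for-byte identical to the published archive.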
