Python library for the HyperLogLog algorithm
python-hll
A Python implementation of HyperLogLog whose goal is to be storage compatible with java-hll, js-hll and postgresql-hll.
NOTE: This is a fairly literal port of java-hll to Python. Internally, bytes are represented as Java-style signed bytes (-128 to 127) rather than Python-style bytes (0 to 255). This implementation is also quite slow: HLLSerializationTest takes 12 seconds to run in Java, while test_hll_serialization takes 1.5 hours to run in Python (about 400x slower).
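To make the byte convention above concrete, here is a minimal sketch of converting between the two representations. The helper names `to_java_byte` and `to_python_byte` are hypothetical and chosen for illustration; they are not part of python-hll's API, and the library's internal conversion may be written differently.

```python
def to_java_byte(value):
    """Map a Python-style byte (0..255) to a Java-style signed byte (-128..127).

    Hypothetical helper for illustration; not a python-hll function.
    """
    if not 0 <= value <= 255:
        raise ValueError("expected a value in 0..255, got %r" % (value,))
    # Values above 127 wrap around into the negative range (two's complement).
    return value - 256 if value > 127 else value


def to_python_byte(value):
    """Map a Java-style signed byte (-128..127) back to a Python-style byte (0..255).

    Hypothetical helper for illustration; not a python-hll function.
    """
    if not -128 <= value <= 127:
        raise ValueError("expected a value in -128..127, got %r" % (value,))
    # Masking with 0xFF keeps the low 8 bits, which undoes the sign wrap.
    return value & 0xFF


# Iterating over a bytearray yields ints in 0..255 on both Python 2.7 and 3.
raw = bytearray(b"\x00\x7f\x80\xff")
java_style = [to_java_byte(b) for b in raw]
round_trip = bytearray(to_python_byte(b) for b in java_style)
```

Under these assumptions, `java_style` holds the signed view of the same four bytes and `round_trip` equals the original `raw` buffer, so the two representations carry the same information.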
Runs on: Python 2.7 and 3
Free software: MIT license
Documentation: https://python-hll.readthedocs.io.
Getting started
$ mkvirtualenv python_hll
$ python setup.py develop
$ pip install -r requirements_dev.txt
Run tests:
$ make lint
$ make test-fast
To run one test file or one test:
$ py.test --capture=no tests/test_sparse_hll.py
$ py.test --capture=no tests/test_sparse_hll.py::test_add
To run slow tests:
$ make test
History
0.0.0 (2019-06-14)
Submitted to AdRoll HackWeek.
0.1.0 (2019-09-12)
First release on PyPI.
Hashes for python_hll-0.1.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 12247836519ccc5b58cb155df582aeb3e6209e0bb97ed530f0c2f377d163d900
MD5 | 3ece7ccea18be19de36c205ef33de90c
BLAKE2b-256 | 4974707071fcfc7c6b09e15d01327def2796921005d8c1266372ac479ebcd056