Python library for the HyperLogLog algorithm

Project description

python-hll

A Python implementation of HyperLogLog whose goal is to be storage-compatible with java-hll, js-hll and postgresql-hll.

NOTE: This is a fairly literal port of java-hll to Python. Internally, bytes are represented as Java-style signed bytes (-128 to 127) rather than Python-style unsigned bytes (0 to 255). This implementation is also quite slow: for example, HLLSerializationTest takes 12 seconds to run in Java, while the equivalent test_hll_serialization takes 1.5 hours to run in Python (about 450x slower).
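
To illustrate the difference between the two byte ranges mentioned in the note, here is a minimal sketch of converting between them. The helper names to_java_byte and to_python_byte are hypothetical and are not part of python-hll's API; the library's own internal conversion may differ.

```python
def to_java_byte(value):
    """Map an unsigned byte (0..255) to its Java-style signed form (-128..127)."""
    if not 0 <= value <= 255:
        raise ValueError("expected 0..255, got %r" % (value,))
    # Values above 127 wrap around into the negative range.
    return value - 256 if value > 127 else value


def to_python_byte(value):
    """Map a Java-style signed byte (-128..127) back to unsigned (0..255)."""
    if not -128 <= value <= 127:
        raise ValueError("expected -128..127, got %r" % (value,))
    # Masking with 0xFF keeps the low 8 bits of the two's-complement value.
    return value & 0xFF


# Round trip: unsigned bytes -> signed ints -> unsigned bytes.
raw = b"\x00\x7f\x80\xff"
signed = [to_java_byte(b) for b in bytearray(raw)]
unsigned = bytes(bytearray(to_python_byte(b) for b in signed))
```

Both helpers operate on plain integers, so the same sketch should behave identically on Python 2 and Python 3.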

Getting started

$ mkvirtualenv python_hll
$ python setup.py develop
$ pip install -r requirements_dev.txt

Run tests:

$ make lint
$ make test-fast

To run one test file or one test:

$ py.test --capture=no tests/test_sparse_hll.py
$ py.test --capture=no tests/test_sparse_hll.py::test_add

To run slow tests:

$ make test

History

0.0.0 (2019-06-14)

  • Submitted to AdRoll HackWeek.

0.1.0 (2019-09-12)

  • First release on PyPI.

