Python bindings for UltraLogLog, a space-efficient alternative to HyperLogLog for approximate distinct counting
UltraLogLog
A Rust implementation of the UltraLogLog algorithm. UltraLogLog is more space-efficient than the widely used HyperLogLog, but can be slower. Either the FGRA estimator or the maximum-likelihood (MLE) estimator can be used.
Usage
use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};
let mut ull = UltraLogLog::new(6).unwrap();
ull.add_value("apple")
    .add_value("banana")
    .add_value("cherry")
    .add_value("033");
let est = ull.get_distinct_count_estimate();
Enable the serde feature to save a sketch to disk and load it back later.
use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};
use std::fs::{remove_file, File};
use std::io::{BufReader, BufWriter};
let file_path = "test_ultraloglog.bin";
// Create UltraLogLog and add data
let mut ull = UltraLogLog::new(5).expect("Failed to create ULL");
ull.add(123456789);
ull.add(987654321);
let original_estimate = ull.get_distinct_count_estimate();
// Save to file using writer
let file = File::create(file_path).expect("Failed to create file");
let writer = BufWriter::new(file);
ull.save(writer).expect("Failed to save UltraLogLog");
// Load from file using reader
let file = File::open(file_path).expect("Failed to open file");
let reader = BufReader::new(file);
let loaded_ull = UltraLogLog::load(reader).expect("Failed to load UltraLogLog");
let loaded_estimate = loaded_ull.get_distinct_count_estimate();
Python Bindings
This crate also provides Python bindings for the UltraLogLog algorithm using PyO3. See example.py for usage.
import ultraloglog
# Create a new UltraLogLog sketch
ull = ultraloglog.PyUltraLogLog(12) # precision parameter
# Add values
ull.add_str("hello")
ull.add_int(42)
ull.add_float(3.14)
# Get estimated count
print(f"Estimated distinct count: {ull.count()}")
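The precision parameter passed to the constructor controls how large the sketch is. As a rough guide, and assuming the layout described in the paper (2^p registers of one byte each), the state size can be estimated as below. The helper name `ull_state_bytes` is hypothetical and not part of the package, and the binding's actual in-memory overhead may be somewhat larger.

```python
def ull_state_bytes(precision: int) -> int:
    """Approximate state size in bytes, assuming 2**precision one-byte registers."""
    if precision < 0:
        raise ValueError("precision must be non-negative")
    return 1 << precision


# Precisions used in the examples above.
for p in (5, 6, 12):
    print(f"p={p}: about {ull_state_bytes(p)} bytes of register state")
```

A larger precision costs more memory and generally gives a more accurate estimate.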
Installation
Using pip
This package is available on PyPI as ultraloglog. You can install it using:
pip install ultraloglog
From Source
uv is recommended to manage virtual environments.
- Install Rust and maturin:
pip install maturin
- Build and install:
maturin develop --release
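After either installation route, a quick way to confirm that the module is importable from the active environment is to look it up without importing it. This is a minimal sketch using only the standard library; the helper name `is_installed` is illustrative.

```python
import importlib.util


def is_installed(module_name: str = "ultraloglog") -> bool:
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


print("ultraloglog importable:", is_installed())
```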
64-bit hash function
As mentioned in the paper, a high-quality 64-bit hash function is key to the UltraLogLog algorithm. We tested several modern 64-bit hash libraries and found that xxhash-rust (the default) and wyhash-rs worked well. However, users can easily replace the default xxhash-rust with polymurhash, komihash, ahash, t1ha, and others. See the testing section for details.
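To illustrate why the full 64 bits matter, the sketch below shows the general HyperLogLog-family idea of splitting a 64-bit hash into a register index (the top p bits) and a leading-zero count over the remaining bits. It is an illustration under stated assumptions, not the crate's code: the exact bit layout and register update rule in UltraLogLog may differ, BLAKE2b is used only because it ships with Python (the crate defaults to xxhash-rust), and the names `hash64` and `split_hash` are hypothetical.

```python
import hashlib


def hash64(data: bytes) -> int:
    """Return a 64-bit hash of data (BLAKE2b truncated to 8 bytes, as a stand-in)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def split_hash(h: int, precision: int) -> tuple[int, int]:
    """Split a 64-bit hash into (register index, leading zeros of the remaining bits)."""
    q = 64 - precision
    index = h >> q                 # top `precision` bits select a register
    rest = h & ((1 << q) - 1)      # lower q bits
    nlz = q - rest.bit_length()    # leading zeros within the q-bit remainder
    return index, nlz


h = hash64(b"apple")
index, nlz = split_hash(h, 12)
print(f"hash={h:#018x} index={index} leading_zeros={nlz}")
```

A hash whose bits are poorly distributed skews both the register selection and the leading-zero statistics, which is why the choice of hash function affects the quality of the estimate.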
Reference
Ertl, O., 2024. UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting. Proceedings of the VLDB Endowment, 17(7), pp.1655-1668.