
Python bindings for UltraLogLog, a space-efficient alternative to HyperLogLog for approximate distinct counting


UltraLogLog


A Rust implementation of the UltraLogLog algorithm. UltraLogLog is more space-efficient than the widely used HyperLogLog, but it can be slower. Either the FGRA estimator or the MLE (maximum likelihood) estimator can be used.
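To give a feel for the space/accuracy trade-off, the sketch below tabulates the state size and the expected relative standard error for a given precision. It assumes that a sketch with precision p holds 2^p one-byte registers, and that the error constants are roughly 0.782 (FGRA) and 0.761 (MLE) as reported in the referenced paper. These figures and the helper names are assumptions for illustration, not values read from this crate.

```python
import math

# Assumed constants (from the UltraLogLog paper, not from this crate):
# relative standard error ~ c / sqrt(m), with m = 2**p one-byte registers.
ERROR_CONSTANTS = {"fgra": 0.782, "mle": 0.761}


def sketch_size_bytes(p: int) -> int:
    """Assumed state size: one byte per register, 2**p registers."""
    return 1 << p


def expected_relative_error(p: int, estimator: str = "fgra") -> float:
    """Approximate relative standard error for precision p."""
    return ERROR_CONSTANTS[estimator] / math.sqrt(1 << p)


if __name__ == "__main__":
    for p in (6, 10, 12, 16):
        size = sketch_size_bytes(p)
        err = expected_relative_error(p)
        print(f"p={p:2d}  size={size:6d} B  ~error={err:.2%}")
```

Under these assumptions, each increase of p by 2 quadruples the memory and halves the expected error.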

Usage

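use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};

// Create a sketch with precision parameter 6.
let mut ull = UltraLogLog::new(6).unwrap();

// Add values; calls can be chained.
ull.add_value("apple")
    .add_value("banana")
    .add_value("cherry")
    .add_value("033");
// Estimate the number of distinct values added so far.
let est = ull.get_distinct_count_estimate();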

The serde feature can be enabled so that a sketch can be saved to disk and loaded again later.

use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};
use std::fs::{remove_file, File};
use std::io::{BufReader, BufWriter};

let file_path = "test_ultraloglog.bin";

// Create UltraLogLog and add data
let mut ull = UltraLogLog::new(5).expect("Failed to create ULL");
ull.add(123456789);
ull.add(987654321);
let original_estimate = ull.get_distinct_count_estimate();

// Save to file using writer
let file = File::create(file_path).expect("Failed to create file");
let writer = BufWriter::new(file);
ull.save(writer).expect("Failed to save UltraLogLog");

// Load from file using reader
let file = File::open(file_path).expect("Failed to open file");
let reader = BufReader::new(file);
let loaded_ull = UltraLogLog::load(reader).expect("Failed to load UltraLogLog");
let loaded_estimate = loaded_ull.get_distinct_count_estimate();

Python Bindings

This crate also provides Python bindings for the UltraLogLog algorithm using PyO3. See example.py for usage.

import ultraloglog

# Create a new UltraLogLog sketch
ull = ultraloglog.PyUltraLogLog(12)  # precision parameter

# Add values
ull.add_str("hello")
ull.add_int(42)
ull.add_float(3.14)

# Get estimated count
print(f"Estimated distinct count: {ull.count()}")
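When choosing a precision, it can help to compare the estimate against an exact count on sample data. The following is a minimal sketch of such a check. It assumes only the add_str() and count() methods shown above; ExactCounter is a hypothetical stand-in so that the example runs without the package installed.

```python
from typing import Iterable, Protocol


class DistinctCounter(Protocol):
    """Minimal interface assumed here: add_str() and count()."""

    def add_str(self, value: str) -> None: ...
    def count(self) -> float: ...


class ExactCounter:
    """Hypothetical stand-in that counts exactly using a set."""

    def __init__(self) -> None:
        self._seen = set()

    def add_str(self, value: str) -> None:
        self._seen.add(value)

    def count(self) -> float:
        return float(len(self._seen))


def relative_error(counter: DistinctCounter, values: Iterable[str]) -> float:
    """Feed values into the counter and compare its estimate with the exact count."""
    exact = set()
    for v in values:
        counter.add_str(v)
        exact.add(v)
    if not exact:
        return 0.0
    return abs(counter.count() - len(exact)) / len(exact)


if __name__ == "__main__":
    data = [f"user-{i % 1000}" for i in range(5000)]
    # Swap ExactCounter() for ultraloglog.PyUltraLogLog(12) to measure a real sketch.
    print(f"relative error: {relative_error(ExactCounter(), data):.4f}")
```

Passing a real sketch instead of ExactCounter() should give a small but non-zero error that shrinks as the precision parameter grows.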

Installation

Using pip

This package is available as ultraloglog on PyPI. You can install it using:

pip install ultraloglog

From Source

uv is recommended to manage virtual environments.

  1. Install Rust, then install maturin: pip install maturin
  2. Build and install: maturin develop --release

64-bit hash function

As mentioned in the paper, a high-quality 64-bit hash function is key to the UltraLogLog algorithm. We tested several modern 64-bit hash libraries and found that xxhash-rust (the default) and wyhash-rs work well. Users can also replace the default xxhash-rust with polymurhash, komihash, ahash, t1ha, and others. See the testing section for details.
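To illustrate why hash quality matters, the sketch below shows how a 64-bit hash is typically consumed by a HyperLogLog-style sketch: some bits select a register and the rest feed a leading-zero count, so both parts need to be uniformly distributed. This is a generic illustration under stated assumptions, not this crate's internals: BLAKE2b from the Python standard library stands in for xxhash, and the exact bit layout used by UltraLogLog may differ. The function names are hypothetical.

```python
import hashlib
from typing import Tuple


def hash64(value: bytes) -> int:
    """Map bytes to a 64-bit integer using BLAKE2b truncated to 8 bytes."""
    digest = hashlib.blake2b(value, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def split_hash(h: int, p: int) -> Tuple[int, int]:
    """Split a 64-bit hash into a register index (top p bits) and the
    number of leading zeros among the remaining 64 - p bits."""
    index = h >> (64 - p)
    rest = h & ((1 << (64 - p)) - 1)
    leading_zeros = (64 - p) - rest.bit_length()
    return index, leading_zeros


if __name__ == "__main__":
    for word in ("apple", "banana", "cherry"):
        h = hash64(word.encode("utf-8"))
        index, nlz = split_hash(h, 12)
        print(f"{word}: hash={h:#018x} register={index} leading_zeros={nlz}")
```

A weak hash that clusters in either the index bits or the remaining bits would bias which registers are updated or how large their values get, which in turn skews the estimate.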

Reference

Ertl, O., 2024. UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting. Proceedings of the VLDB Endowment, 17(7), pp.1655-1668.
