Python bindings for UltraLogLog, a space-efficient alternative to HyperLogLog for approximate distinct counting

UltraLogLog

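Rust implementation of the UltraLogLog algorithm. UltraLogLog is more space-efficient than the widely used HyperLogLog, though it can be slower. Two estimators are available: the FGRA estimator (OptimalFGRAEstimator) and the maximum-likelihood estimator (MaximumLikelihoodEstimator).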

Usage

use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};

let mut ull = UltraLogLog::new(6).unwrap();

ull.add_value("apple")
    .add_value("banana")
    .add_value("cherry")
    .add_value("033");
let est = ull.get_distinct_count_estimate();

Enable the serde feature to save a sketch to disk and load it back later.

use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};
use std::fs::{remove_file, File};
use std::io::{BufReader, BufWriter};

let file_path = "test_ultraloglog.bin";

// Create UltraLogLog and add data
let mut ull = UltraLogLog::new(5).expect("Failed to create ULL");
ull.add(123456789);
ull.add(987654321);
let original_estimate = ull.get_distinct_count_estimate();

// Save to file using writer
let file = File::create(file_path).expect("Failed to create file");
let writer = BufWriter::new(file);
ull.save(writer).expect("Failed to save UltraLogLog");

// Load from file using reader
let file = File::open(file_path).expect("Failed to open file");
let reader = BufReader::new(file);
let loaded_ull = UltraLogLog::load(reader).expect("Failed to load UltraLogLog");
let loaded_estimate = loaded_ull.get_distinct_count_estimate();

// Remove the temporary file
remove_file(file_path).expect("Failed to remove file");

Python Bindings

This crate also provides Python bindings for the UltraLogLog algorithm using PyO3. See example.py for usage.

import ultraloglog

# Create a new UltraLogLog sketch
ull = ultraloglog.PyUltraLogLog(12)  # precision parameter

# Add values
ull.add_str("hello")
ull.add_int(42)
ull.add_float(3.14)

# Get estimated count
print(f"Estimated distinct count: {ull.count()}")
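To get a feel for how close the estimate is, one option is to compare it against an exact count on a small input. The sketch below is a minimal helper that relies only on the add_str and count methods shown above; the names relative_error and ExactCounter are hypothetical and are not part of the package. ExactCounter is a stand-in with the same two methods so that the snippet runs even where ultraloglog is not installed.

```python
def relative_error(sketch, values):
    """Feed string values into a sketch and compare its estimate to the exact count."""
    for v in values:
        sketch.add_str(v)
    exact = len(set(values))
    estimate = sketch.count()
    return abs(estimate - exact) / exact


class ExactCounter:
    """Hypothetical stand-in exposing the same add_str/count methods as the sketch."""

    def __init__(self):
        self._seen = set()

    def add_str(self, value):
        self._seen.add(value)

    def count(self):
        return float(len(self._seen))


words = ["apple", "banana", "cherry", "apple"]
print(f"Relative error: {relative_error(ExactCounter(), words)}")
```

With the package installed, passing ultraloglog.PyUltraLogLog(12) in place of ExactCounter() should report the relative error of the real sketch on the same input.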

Installation

Using pip

This package is available on PyPI as ultraloglog. You can install it using:

pip install ultraloglog

From Source

uv is recommended for managing virtual environments.

  1. Install Rust and maturin: pip install maturin
  2. Build and install: maturin develop --release

64-bit hash function

As noted in the paper, a high-quality 64-bit hash function is key to the UltraLogLog algorithm. We tested several modern 64-bit hash libraries and found that xxhash-rust (the default) and wyhash-rs work well. Users can also easily replace the default xxhash-rust with polymurhash, komihash, ahash, t1ha, and others. See the testing section for details.
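The following Python sketch illustrates why hash quality matters. It assumes the common HyperLogLog-style convention of choosing a register from the top p bits of the 64-bit hash; the crate's internal register layout may differ. BLAKE2b truncated to 8 bytes stands in for xxhash here, and weak_hash64 is a deliberately poor hash invented for the comparison.

```python
import hashlib


def hash64(value: bytes) -> int:
    """Well-mixed 64-bit hash: BLAKE2b truncated to 8 bytes (a stand-in for xxhash)."""
    return int.from_bytes(hashlib.blake2b(value, digest_size=8).digest(), "big")


def weak_hash64(value: bytes) -> int:
    """Deliberately poor hash: reads the last 8 bytes as an integer, with no mixing."""
    return int.from_bytes(value[-8:], "big")


def register_index(h: int, p: int) -> int:
    """Select one of 2**p registers from the top p bits of a 64-bit hash."""
    return h >> (64 - p)


def registers_touched(hash_fn, values, p: int) -> int:
    """Count how many distinct registers a set of values lands in."""
    return len({register_index(hash_fn(v), p) for v in values})


P = 6  # 2**6 = 64 registers
VALUES = [str(i).encode() for i in range(1000)]
GOOD = registers_touched(hash64, VALUES, P)
WEAK = registers_touched(weak_hash64, VALUES, P)
print(f"Registers touched with BLAKE2b:   {GOOD} of {2 ** P}")
print(f"Registers touched with weak hash: {WEAK} of {2 ** P}")
```

The weak hash leaves the upper bits of every short input at zero, so all values fall into the same register and the sketch has almost nothing to estimate from. A well-mixed hash spreads the same inputs across the registers.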

Reference

Ertl, O., 2024. UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting. Proceedings of the VLDB Endowment, 17(7), pp.1655-1668.

Built Distributions

Release 0.1.6 is published as prebuilt wheels only; no source distribution is available.

ultraloglog-0.1.6-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (291.8 kB): CPython 3.14, manylinux (glibc 2.17+), x86-64
ultraloglog-0.1.6-cp314-cp314-macosx_11_0_arm64.whl (258.1 kB): CPython 3.14, macOS 11.0+, ARM64
ultraloglog-0.1.6-cp314-cp314-macosx_10_12_x86_64.whl (264.2 kB): CPython 3.14, macOS 10.12+, x86-64
ultraloglog-0.1.6-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (292.4 kB): CPython 3.13, manylinux (glibc 2.17+), x86-64
ultraloglog-0.1.6-cp313-cp313-macosx_11_0_arm64.whl (258.9 kB): CPython 3.13, macOS 11.0+, ARM64
ultraloglog-0.1.6-cp313-cp313-macosx_10_12_x86_64.whl (265.7 kB): CPython 3.13, macOS 10.12+, x86-64
ultraloglog-0.1.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (292.3 kB): CPython 3.12, manylinux (glibc 2.17+), x86-64
ultraloglog-0.1.6-cp312-cp312-macosx_11_0_arm64.whl (259.1 kB): CPython 3.12, macOS 11.0+, ARM64
ultraloglog-0.1.6-cp312-cp312-macosx_10_12_x86_64.whl (265.6 kB): CPython 3.12, macOS 10.12+, x86-64
ultraloglog-0.1.6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (293.4 kB): CPython 3.11, manylinux (glibc 2.17+), x86-64
ultraloglog-0.1.6-cp311-cp311-macosx_11_0_arm64.whl (259.0 kB): CPython 3.11, macOS 11.0+, ARM64
ultraloglog-0.1.6-cp311-cp311-macosx_10_12_x86_64.whl (265.2 kB): CPython 3.11, macOS 10.12+, x86-64
ultraloglog-0.1.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (293.7 kB): CPython 3.10, manylinux (glibc 2.17+), x86-64
ultraloglog-0.1.6-cp310-cp310-macosx_11_0_arm64.whl (259.2 kB): CPython 3.10, macOS 11.0+, ARM64
ultraloglog-0.1.6-cp310-cp310-macosx_10_12_x86_64.whl (265.4 kB): CPython 3.10, macOS 10.12+, x86-64
