Python bindings for the Rust blake3 crate

blake3-py

Python bindings for the Rust blake3 crate, based on PyO3. This is a proof of concept, not yet fully featured or production-ready. See also the Soundness concerns below.

Example

How to try out this repo on the command line:

# You have to build the shared library first.
$ ./build.py

# Try out example.py.
$ echo hello world | ./example.py
dc5a4edb8240b018124052c330270696f96771a63b45250a5c17d3000e823355

# Run a few tests.
$ ./test.py

What it looks like to use blake3 in Python code:

import blake3

hash1 = blake3.blake3(b"foobarbaz").digest()

hasher = blake3.blake3()
hasher.update(b"foo")
hasher.update(b"bar")
hasher.update(b"baz")
hash2 = hasher.digest()

assert hash1 == hash2

print("The hash of 'hello world' is:",
      blake3.blake3(b"hello world").hexdigest())
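
The incremental update() interface shown above has the same shape as the hashers in hashlib, so a chunked file reader can be written against either. The following is a minimal sketch under that assumption; hash_file and chunk_size are hypothetical names, not part of this package, and hashlib.sha256 stands in for blake3.blake3() so that the snippet runs without the shared library.

```python
import hashlib
import os
import tempfile


def hash_file(path, hasher, chunk_size=64 * 1024):
    """Feed a file to `hasher` in fixed-size chunks and return the hex digest.

    `hasher` is any object with update() and hexdigest() methods, for
    example hashlib.sha256() or (assumed drop-in) blake3.blake3().
    """
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


# Write some sample data to a temporary file, then hash it in chunks.
data = b"foobarbaz" * 100000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
try:
    streamed = hash_file(tmp.name, hashlib.sha256())
finally:
    os.remove(tmp.name)

# Hashing in chunks must give the same result as hashing in one call.
one_shot = hashlib.sha256(data).hexdigest()
assert streamed == one_shot
```

Reading in chunks keeps memory use bounded for large files; the chunk size here is an arbitrary choice, not a tuned value.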

Building

The build.py script runs cargo build --release and then copies the resulting shared library to a platform-appropriate name (blake3.so on Linux/macOS, and blake3.pyd on Windows) in the repo root directory. Python scripts in that directory will then load the shared library when they import blake3.
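
As a rough illustration of the steps described above, here is a sketch of what such a build script might look like. It is not the actual build.py. The names library_names and build are hypothetical, and it assumes that the crate's library is named blake3 and that cargo writes its output to the default target/release directory.

```python
import os
import shutil
import subprocess
import sys


def library_names(platform=sys.platform, crate="blake3"):
    """Return (cargo_output_name, importable_name) for a platform string.

    The crate name "blake3" is an assumption made for this sketch.
    """
    if platform.startswith("win"):
        # Windows: cargo emits <crate>.dll; Python imports <crate>.pyd.
        return crate + ".dll", crate + ".pyd"
    if platform == "darwin":
        # macOS: cargo emits lib<crate>.dylib; Python imports <crate>.so.
        return "lib" + crate + ".dylib", crate + ".so"
    # Assume Linux-style naming for everything else.
    return "lib" + crate + ".so", crate + ".so"


def build(repo_root="."):
    """Run cargo and copy the shared library into the repo root."""
    subprocess.run(["cargo", "build", "--release"], cwd=repo_root, check=True)
    src_name, dst_name = library_names()
    src = os.path.join(repo_root, "target", "release", src_name)
    shutil.copy2(src, os.path.join(repo_root, dst_name))


if __name__ == "__main__":
    build()
```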

This project is not yet packaged in a way that's convenient to pip install. I need to learn more about Python packaging to understand the right way to do this. (Binary wheels?) Any help on this front from folks with more experience would be greatly appreciated.

Soundness

There are some fundamental questions about whether these bindings can be sound. Like the Python standard library's hash implementations, in order to avoid blocking other threads during a potentially expensive call to update(), we release the GIL. But that opens up the possibility that another thread might mutate, say, the bytearray we're hashing, while the Rust code is treating it as a &[u8]. That violates Rust's aliasing guarantees and is technically unsound. However, no Python hashing implementation that I'm aware of holds the GIL while it calls into native code. I need to get some expert opinions on this.
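
Until that question is settled, a caller who shares a bytearray between threads can sidestep the issue by hashing an immutable copy, at the cost of the copy. The sketch below shows the idea. It uses hashlib.sha256 as a stand-in hasher, and hash_snapshot is a hypothetical helper, not something these bindings provide. It does not address the soundness of the bindings themselves.

```python
import hashlib


def hash_snapshot(buf):
    """Hash an immutable copy of `buf` rather than the mutable buffer itself."""
    # bytes() copies the data; Python code cannot mutate the resulting
    # object, so the hasher only ever reads a buffer that cannot change.
    snapshot = bytes(buf)
    return hashlib.sha256(snapshot).hexdigest()


shared = bytearray(b"foobarbaz")
before = hash_snapshot(shared)

# A later mutation of the bytearray cannot affect the digest already taken.
shared[0:3] = b"FOO"
after = hash_snapshot(shared)

assert before == hashlib.sha256(b"foobarbaz").hexdigest()
assert after == hashlib.sha256(b"FOObarbaz").hexdigest()
assert before != after
```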

Features

Currently only basic hashing is supported, with the default 32-byte output size. Missing BLAKE3 features should be easy to add, though I'm not sure exactly what the API should look like. Missing features include:

  • variable-length output
  • an incremental output reader
  • the keyed hashing mode
  • the key derivation mode
  • optional multi-threading

Download files

Source Distribution

  • blake3-0.1.0.tar.gz (5.0 kB), source

Built Distributions

  • blake3-0.1.0-cp38-none-win_amd64.whl (119.8 kB), CPython 3.8, Windows x86-64
  • blake3-0.1.0-cp38-cp38-manylinux1_x86_64.whl (148.6 kB), CPython 3.8
  • blake3-0.1.0-cp37-cp37m-manylinux1_x86_64.whl (148.6 kB), CPython 3.7m
  • blake3-0.1.0-cp36-cp36m-manylinux1_x86_64.whl (149.3 kB), CPython 3.6m
  • blake3-0.1.0-cp35-cp35m-manylinux1_x86_64.whl (149.4 kB), CPython 3.5m
