Python binding for xxHash

xxhash is a Python binding for the xxHash library by Yann Collet.

Installation

$ pip install xxhash

Installing From Source

$ pip install --no-binary xxhash xxhash

Prerequisites

On Debian/Ubuntu:

$ apt-get install python-dev gcc

On CentOS/Fedora:

$ yum install python-devel gcc redhat-rpm-config

Linking to libxxhash.so

By default, python-xxhash uses the bundled xxHash. To link against the system libxxhash.so instead, set the environment variable XXHASH_LINK_SO:

$ XXHASH_LINK_SO=1 pip install --no-binary xxhash xxhash

Usage

The module version and the version of its backend xxHash library are available as the module attributes VERSION and XXHASH_VERSION respectively.

>>> import xxhash
>>> xxhash.VERSION
'2.0.0'
>>> xxhash.XXHASH_VERSION
'0.8.0'

This module is hashlib-compliant, which means you can use it in the same way as hashlib.md5.

update() – update the current digest with an additional string
digest() – return the current digest value
hexdigest() – return the current digest as a string of hexadecimal digits
intdigest() – return the current digest as an integer
copy() – return a copy of the current xxhash object
reset() – reset state

The md5 digest() method returns bytes, but the original xxh32 and xxh64 C APIs return integers. Although this module is hashlib-compliant, intdigest() is also provided to return the digest as an integer.

Constructors for hash algorithms provided by this module are xxh32() and xxh64().

For example, to obtain the digest of the byte string b'Nobody inspects the spammish repetition':

>>> import xxhash
>>> x = xxhash.xxh32()
>>> x.update(b'Nobody inspects')
>>> x.update(b' the spammish repetition')
>>> x.digest()
b'\xe2);/'
>>> x.digest_size
4
>>> x.block_size
16

More condensed:

>>> xxhash.xxh32(b'Nobody inspects the spammish repetition').hexdigest()
'e2293b2f'
>>> xxhash.xxh32(b'Nobody inspects the spammish repetition').digest() == x.digest()
True
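
Because the objects follow the hashlib interface, a helper written against hashlib should accept an xxhash constructor unchanged. The following is a minimal sketch of chunked hashing. The names hash_stream, factory, and chunk_size are illustrative and not part of this module, and hashlib.md5 stands in as the default so the snippet runs without xxhash installed.

```python
import hashlib
import io


def hash_stream(stream, factory=hashlib.md5, chunk_size=64 * 1024):
    """Feed a binary stream to a hashlib-style object in fixed-size chunks."""
    h = factory()
    # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


data = b"Nobody inspects the spammish repetition"
# Hashing in 8-byte chunks gives the same digest as hashing in one call.
streamed = hash_stream(io.BytesIO(data), chunk_size=8)
assert streamed == hashlib.md5(data).hexdigest()
```

Assuming xxhash is installed, hash_stream(f, factory=xxhash.xxh64) should return the same value as xxhash.xxh64(data).hexdigest() for the same bytes.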

An optional seed (default is 0) can be used to alter the result predictably:

>>> import xxhash
>>> xxhash.xxh64('xxhash').hexdigest()
'32dd38952c4bc720'
>>> xxhash.xxh64('xxhash', seed=20141025).hexdigest()
'b559b98d844e0635'
>>> x = xxhash.xxh64(seed=20141025)
>>> x.update('xxhash')
>>> x.hexdigest()
'b559b98d844e0635'
>>> x.intdigest()
13067679811253438005

Note that xxh32 takes an unsigned 32-bit integer as its seed, while xxh64 takes an unsigned 64-bit integer. Unsigned integer overflow is defined behavior, so an out-of-range seed wraps around instead of raising an error, but it is better to avoid it:

>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=0).hexdigest()
'f7a35af8'
>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=2**32).hexdigest()
'f7a35af8'
>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=1).hexdigest()
'd8d4b4ba'
>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=2**32+1).hexdigest()
'd8d4b4ba'
>>>
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=0).hexdigest()
'd4cb0a70a2b8c7c1'
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=2**64).hexdigest()
'd4cb0a70a2b8c7c1'
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=1).hexdigest()
'ce5087f12470d961'
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=2**64+1).hexdigest()
'ce5087f12470d961'
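
One way to avoid silent wrap-around is to validate the seed before passing it in. This is a sketch only: check_seed is a hypothetical helper, not something the module provides.

```python
def check_seed(seed, bits):
    """Return seed unchanged if it fits in an unsigned `bits`-bit integer."""
    if not 0 <= seed < 2 ** bits:
        raise ValueError(f"seed {seed} does not fit in {bits} unsigned bits")
    return seed


# The results above are consistent with the seed wrapping modulo 2**32
# (xxh32) or 2**64 (xxh64): 2**32 behaves like 0 and 2**32 + 1 like 1.
assert (2 ** 32) % 2 ** 32 == 0
assert (2 ** 32 + 1) % 2 ** 32 == 1
assert (2 ** 64 + 1) % 2 ** 64 == 1
```

A call such as xxhash.xxh32(data, seed=check_seed(seed, 32)) would then fail loudly on an out-of-range seed instead of hashing with a different one.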

digest() returns bytes of the big-endian representation of the integer digest:

>>> import xxhash
>>> h = xxhash.xxh64()
>>> h.digest()
b'\xefF\xdb7Q\xd8\xe9\x99'
>>> h.intdigest().to_bytes(8, 'big')
b'\xefF\xdb7Q\xd8\xe9\x99'
>>> h.hexdigest()
'ef46db3751d8e999'
>>> format(h.intdigest(), '016x')
'ef46db3751d8e999'
>>> h.intdigest()
17241709254077376921
>>> int(h.hexdigest(), 16)
17241709254077376921

Besides the xxh32 and xxh64 classes mentioned above, oneshot functions are also provided, which avoid allocating XXH32/64 state on the heap:

xxh32_digest(bytes, seed=0)
xxh32_intdigest(bytes, seed=0)
xxh32_hexdigest(bytes, seed=0)
xxh64_digest(bytes, seed=0)
xxh64_intdigest(bytes, seed=0)
xxh64_hexdigest(bytes, seed=0)

>>> import xxhash
>>> xxhash.xxh64('a').digest() == xxhash.xxh64_digest('a')
True
>>> xxhash.xxh64('a').intdigest() == xxhash.xxh64_intdigest('a')
True
>>> xxhash.xxh64('a').hexdigest() == xxhash.xxh64_hexdigest('a')
True
>>> xxhash.xxh64_hexdigest('xxhash', seed=20141025)
'b559b98d844e0635'
>>> xxhash.xxh64_intdigest('xxhash', seed=20141025)
13067679811253438005
>>> xxhash.xxh64_digest('xxhash', seed=20141025)
b'\xb5Y\xb9\x8d\x84N\x065'

A timing comparison in IPython:

In [1]: import xxhash

In [2]: %timeit xxhash.xxh64_hexdigest('xxhash')
268 ns ± 24.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [3]: %timeit xxhash.xxh64('xxhash').hexdigest()
416 ns ± 17.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
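
Outside IPython, a similar measurement can be made with the standard timeit module. This is a rough sketch: time_call is an illustrative helper, and hashlib.md5 is used as a stand-in so the snippet runs without xxhash installed. Absolute numbers depend on the machine.

```python
import hashlib
import timeit


def time_call(fn, number=10_000):
    """Return the mean wall-clock seconds per call over `number` runs."""
    return timeit.timeit(fn, number=number) / number


# Stand-in workload; swap in the xxhash calls to compare them.
per_call = time_call(lambda: hashlib.md5(b"xxhash").hexdigest())
print(f"{per_call * 1e9:.0f} ns per call")
```

To compare the two styles shown above, pass lambda: xxhash.xxh64_hexdigest('xxhash') and lambda: xxhash.xxh64('xxhash').hexdigest() in turn.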

XXH3 hashes are available since v2.0.0 (xxHash v0.8.0):

Streaming classes:

xxh3_64
xxh3_128

Oneshot functions:

xxh3_64_digest(bytes, seed=0)
xxh3_64_intdigest(bytes, seed=0)
xxh3_64_hexdigest(bytes, seed=0)
xxh3_128_digest(bytes, seed=0)
xxh3_128_intdigest(bytes, seed=0)
xxh3_128_hexdigest(bytes, seed=0)

And aliases:

xxh128 = xxh3_128
xxh128_digest = xxh3_128_digest
xxh128_intdigest = xxh3_128_intdigest
xxh128_hexdigest = xxh3_128_hexdigest

Caveats

SEED OVERFLOW

xxh32 takes an unsigned 32-bit integer as seed, and xxh64 takes an unsigned 64-bit integer as seed. Make sure that the seed is greater than or equal to 0 and fits in the corresponding width.

ENDIANNESS

As of python-xxhash 0.3.0, digest() returns bytes of the big-endian representation of the integer digest. It used to be little-endian.
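
If digests were stored with a version older than 0.3.0, reversing the bytes converts them to the current layout. This is a sketch under that assumption; legacy_to_current is a hypothetical helper, and the sample value is the xxh64 hexdigest of empty input shown earlier.

```python
def legacy_to_current(digest):
    """Convert a little-endian digest (pre-0.3.0 layout) to big-endian."""
    return digest[::-1]


current = bytes.fromhex("ef46db3751d8e999")  # big-endian, as digest() returns now
value = int.from_bytes(current, "big")       # same integer intdigest() would give
legacy = value.to_bytes(8, "little")         # layout used before 0.3.0

assert value == 0xEF46DB3751D8E999
assert legacy_to_current(legacy) == current
```

The integer digest is the same in both layouts. Only the byte order of digest() changed.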

DON'T USE XXHASH IN HMAC

Although you can use xxhash as an HMAC hash function, it is highly recommended not to.

xxhash is NOT a cryptographic hash function. It is a non-cryptographic hash algorithm aimed at speed and quality. Do not use xxhash anywhere a cryptographic hash function is required.
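
Where message authentication is needed, a cryptographic hash from the standard library is the safer choice. The sketch below uses hmac with hashlib.sha256. The helper names sign and verify and the key are placeholders for illustration.

```python
import hashlib
import hmac


def sign(key, message):
    """Return an HMAC-SHA256 tag for message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"example-key"  # placeholder; use a randomly generated secret in practice
tag = sign(key, b"payload")
assert verify(key, b"payload", tag)
assert not verify(key, b"tampered", tag)
```

xxhash remains a good fit for checksums, hash tables, and deduplication, where speed matters and no adversary chooses the input.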

CHANGELOG

v2.0.0 2020-08-03

  • Require xxHash version >= v0.8.0

  • Upgrade xxHash to v0.8.0

  • XXH3 hashes: xxh3_64, xxh3_128, and their oneshot functions

v1.4.4 2020-06-20

  • Upgrade xxHash to v0.7.3

  • Stop using PEP393 deprecated APIs

  • Use XXH(32|64)_canonicalFromHash to replace u2bytes and ull2bytes

v1.4.3 2019-11-12

  • Upgrade xxHash to v0.7.2

  • Python 3.8 wheels

v1.4.2 2019-10-13

  • Fixed: setup.py fails when reading README.rst and the default encoding is not UTF-8

v1.4.1 2019-08-27

  • Fixed: xxh3.h is missing from source tarball

v1.4.0 2019-08-25

  • Upgrade xxHash to v0.7.1

v1.3.0 2018-10-21

v1.2.0 2018-07-13

  • Add oneshot functions xxh{32,64}_{,int,hex}digest

v1.1.0 2018-07-05

  • Allow input larger than 2GB

  • Release the GIL on sufficiently large input

  • Drop support for Python 3.2

v1.0.1 2017-03-02

  • Free state actively, instead of delegating it to ffi.gc

v1.0.0 2017-02-10

  • Fixed copy() segfault

  • Added CFFI variant

v0.6.3 2017-02-10

  • Fixed copy() segfault

v0.6.2 2017-02-10

  • Upgrade xxHash to v0.6.2

v0.6.1 2016-06-26

  • Upgrade xxHash to v0.6.1

v0.5.0 2016-03-02

  • Upgrade xxHash to v0.5.0

v0.4.3 2015-08-21

  • Upgrade xxHash to r42

v0.4.1 2015-08-16

  • Upgrade xxHash to r41

v0.4.0 2015-08-05

  • Added method reset

  • Upgrade xxHash to r40

v0.3.2 2015-01-27

  • Fixed some typos in docstrings

v0.3.1 2015-01-24

  • Upgrade xxHash to r39

v0.3.0 2014-11-11

  • Change digest() from little-endian representation to big-endian representation of the integer digest. This change breaks compatibility (digest() results are different).

v0.2.0 2014-10-25

  • Make this package hashlib-compliant

v0.1.3 2014-10-23

  • Update xxHash to r37

v0.1.2 2014-10-19

  • Improve: Check XXHnn_init() return value.

  • Update xxHash to r36

v0.1.1 2014-08-07

  • Improve: Can now be built with Visual C++ Compiler.

v0.1.0 2014-08-05

  • New: XXH32 and XXH64 types, which support partial updates.

  • Fix: build under Python 3.4

v0.0.2 2014-08-03

  • NEW: Support Python 3

v0.0.1 2014-07-30

  • NEW: xxh32 and xxh64
