Python Non-cryptographic Hash Library

Introduction

pyhash is a Python non-cryptographic hash library.

It provides several common hash algorithms, implemented in C/C++ for performance and compatibility.

>>> import pyhash
>>> hasher = pyhash.fnv1_32()

>>> hasher('hello world')
2805756500L

>>> hasher('hello', ' ', 'world')
2805756500L

>>> hasher('world', seed=hasher('hello '))
2805756500L
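The three calls above agree because FNV consumes its input one byte at a time and its running state is the hash value itself, so an earlier result can be fed back in as the seed. The following is a minimal pure-Python sketch of that idea, not pyhash's C/C++ implementation. The name `fnv1_32_sketch` and its default seed (the standard FNV offset basis) are assumptions made for illustration, so its outputs are not guaranteed to match the values pyhash returns.

```python
FNV_32_PRIME = 16777619
FNV_32_OFFSET_BASIS = 2166136261  # standard FNV-1 start value (assumed default)


def fnv1_32_sketch(data, seed=FNV_32_OFFSET_BASIS):
    """Illustrative FNV-1 (multiply, then XOR) over bytes, starting from seed."""
    h = seed
    for byte in bytearray(data):  # bytearray yields ints on Python 2 and 3
        h = (h * FNV_32_PRIME) & 0xFFFFFFFF  # keep the state within 32 bits
        h ^= byte
    return h


whole = fnv1_32_sketch(b'hello world')
chained = fnv1_32_sketch(b'world', seed=fnv1_32_sketch(b'hello '))
print(whole == chained)
```

Hashing `b'hello '` first and passing the result as the seed for `b'world'` walks through exactly the same sequence of states as hashing `b'hello world'` in one call, so the comparison holds for any split point.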

pyhash can also generate fingerprints, which do not take a seed.

>>> import pyhash
>>> fp = pyhash.farm_fingerprint_64()

>>> fp('hello')
13009744463427800296L

>>> fp('hello', 'world')
[13009744463427800296L, 16436542438370751598L]

Notes

hasher('hello', ' ', 'world') is syntactic sugar for hasher('world', seed=hasher(' ', seed=hasher('hello'))). The result may not equal hasher('hello world'), because some hash algorithms use different sizes for the hash value and the seed.

For example, MetroHash always uses a 32-bit seed for its 64-bit and 128-bit hash values.

>>> import pyhash
>>> hasher = pyhash.metro_64()

>>> hasher('hello world')
5622782129197849471L

>>> hasher('hello', ' ', 'world')
16402988188088019159L

>>> hasher('world', seed=hasher(' ', seed=hasher('hello')))
16402988188088019159L
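The mismatch above follows from the seed being narrower than the hash value. The sketch below mimics that situation with a toy 64-bit FNV-1a whose seed can be truncated to 32 bits. It is not MetroHash and not pyhash's code; the names `toy_hash_64`, `chain`, and the `seed_bits` parameter are hypothetical and exist only for this illustration. `chain` spells out the multi-argument form described in this note.

```python
FNV_64_PRIME = 1099511628211
FNV_64_OFFSET_BASIS = 14695981039346656037
MASK_64 = 0xFFFFFFFFFFFFFFFF


def toy_hash_64(data, seed=FNV_64_OFFSET_BASIS, seed_bits=64):
    """Toy 64-bit FNV-1a that keeps only the low seed_bits bits of the seed."""
    h = seed & ((1 << seed_bits) - 1)  # a narrow seed drops the upper bits
    for byte in bytearray(data):
        h ^= byte
        h = (h * FNV_64_PRIME) & MASK_64
    return h


def chain(parts, seed_bits):
    """Hash parts one by one, passing each result as the next seed."""
    result = toy_hash_64(parts[0], seed_bits=seed_bits)
    for part in parts[1:]:
        result = toy_hash_64(part, seed=result, seed_bits=seed_bits)
    return result


parts = [b'hello', b' ', b'world']
for bits in (64, 32):
    whole = toy_hash_64(b''.join(parts), seed_bits=bits)
    print(bits, whole == chain(parts, bits))
```

With a full-width seed the chained result equals the one-shot hash. With a 32-bit seed the upper half of each intermediate 64-bit state is discarded before the next part is hashed, so the chained result generally differs from the one-shot hash.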

Installation

$ pip install pyhash

Notes

If pip install fails with errors similar to the following (see #27):

/usr/lib/gcc/x86_64-linux-gnu/6/include/smmintrin.h:846:1: error: inlining failed in call to always_inline 'long long unsigned int _mm_crc32_u64(long long unsigned int, long long unsigned int)': target specific option mismatch
 _mm_crc32_u64 (unsigned long long __C, unsigned long long __V)
 ^~~~~~~~~~~~~
src/smhasher/metrohash64crc.cpp:52:34: note: called from here
             v[0] ^= _mm_crc32_u64(v[0], read_u64(ptr)); ptr += 8;
                     ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~

upgrade pip and setuptools to the latest version and try again:

$ pip install --upgrade pip setuptools

Notes

If pip install fails on macOS with errors similar to the following (see #28):

   creating build/temp.macosx-10.6-intel-3.6
   ...
   /usr/bin/clang -fno-strict-aliasing -Wsign-compare -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -arch i386 -arch x86_64 -g -c src/smhasher/metrohash64crc.cpp -o build/temp.macosx-10.6-intel-3.6/src/smhasher/metrohash64crc.o -msse4.2 -maes -mavx -mavx2
    src/smhasher/metrohash64crc.cpp:52:21: error: use of undeclared identifier '_mm_crc32_u64'
                v[0] ^= _mm_crc32_u64(v[0], read_u64(ptr)); ptr += 8;
                        ^

try setting a minimum macOS version through CFLAGS:

$ CFLAGS="-mmacosx-version-min=10.13" pip install pyhash

Notes

pyhash only supports PyPy v6.0 or newer; please download and install the latest PyPy.
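To check this requirement before installing, a short script can inspect the running interpreter. This is a minimal sketch using only the standard library; the function name `pypy_is_supported` is hypothetical, and the (6, 0) threshold is taken from the note above.

```python
import platform
import sys


def pypy_is_supported(min_version=(6, 0)):
    """Return True on non-PyPy interpreters, or on PyPy >= min_version."""
    if platform.python_implementation() != 'PyPy':
        return True  # the PyPy requirement does not apply here
    # sys.pypy_version_info only exists on PyPy: (major, minor, micro, ...)
    return tuple(sys.pypy_version_info[:2]) >= tuple(min_version)


print(platform.python_implementation(), pypy_is_supported())
```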

Algorithms

pyhash supports the following hash algorithms:

  • FNV (Fowler-Noll-Vo) hash
    • fnv1_32
    • fnv1a_32
    • fnv1_64
    • fnv1a_64
  • MurmurHash
    • murmur1_32
    • murmur1_aligned_32
    • murmur2_32
    • murmur2a_32
    • murmur2_aligned_32
    • murmur2_neutral_32
    • murmur2_x64_64a
    • murmur2_x86_64b
    • murmur3_32
    • murmur3_x86_128
    • murmur3_x64_128
  • lookup3
    • lookup3
    • lookup3_little
    • lookup3_big
  • SuperFastHash
    • super_fast_hash
  • City Hash
    • city_32
    • city_64
    • city_128
    • city_crc_128
    • city_fingerprint_256
  • Spooky Hash
    • spooky_32
    • spooky_64
    • spooky_128
  • FarmHash
    • farm_32
    • farm_64
    • farm_128
    • farm_fingerprint_32
    • farm_fingerprint_64
    • farm_fingerprint_128
  • MetroHash
    • metro_64
    • metro_128
    • metro_crc_64
    • metro_crc_128
  • MumHash
    • mum_64
  • T1Ha
    • t1ha2 (64-bit little-endian)
    • t1ha2_128 (128-bit little-endian)
    • t1ha1 (64-bit native-endian)
    • t1ha1_le (64-bit little-endian)
    • t1ha1_be (64-bit big-endian)
    • t1ha0 (64-bit, chooses the fastest function at runtime)
    • t1_32
    • t1_32_be
    • t1_64
    • t1_64_be
  • XXHash
    • xx_32
    • xx_64

String and Bytes literals

Python has two types that can represent string literals, and the same text hashes to a different value depending on which type is used.

  • In Python 2.x, string literals are str by default; use the u prefix for unicode.
  • In Python 3.x, string literals are unicode (str) by default; use the b prefix for bytes.

For example,

$ python2
Python 2.7.15 (default, Jun 17 2018, 12:46:58)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
4138058784L
>>> hasher(u'foo')
2085578581L
>>> hasher(b'foo')
4138058784L
$ python3
Python 3.7.0 (default, Jun 29 2018, 20:13:13)
[Clang 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
2085578581
>>> hasher(u'foo')
2085578581
>>> hasher(b'foo')
4138058784

You can also import unicode_literals to make unprefixed string literals unicode in Python 2.x:

from __future__ import unicode_literals

In general, it is more compelling to use unicode_literals when back-porting new or existing Python 3 code to Python 2/3 than when porting existing Python 2 code to 2/3. In the latter case, explicitly marking up all unicode string literals with u'' prefixes would help to avoid unintentionally changing the existing Python 2 API. However, if changing the existing Python 2 API is not a concern, using unicode_literals may speed up the porting process.
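One way to sidestep the difference between the two literal types is to encode text to bytes explicitly before hashing, so that 'foo', u'foo' and b'foo' all reach the hasher as the same byte string. The sketch below is an illustration under assumptions: the helper names `to_bytes` and `hash_text` are hypothetical, UTF-8 is an arbitrary choice of encoding, and `zlib.crc32` stands in for a hasher so that the example runs without pyhash installed.

```python
import zlib


def to_bytes(value, encoding='utf-8'):
    """Return value unchanged if it is already bytes, else encode it."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def hash_text(value, hasher=zlib.crc32):
    """Hash text and bytes identically by normalizing to bytes first."""
    return hasher(to_bytes(value))


print(hash_text('foo') == hash_text(u'foo') == hash_text(b'foo'))
```

Any callable that accepts bytes can be passed as `hasher`; the sessions above show pyhash hashers being called with a b'foo' argument.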

Download files

Source Distribution

  • pyhash-0.9.3.tar.gz (602.3 kB): Source

Built Distributions

  • pyhash-0.9.3-pp370-pypy3_70-macosx_10_14_x86_64.whl (207.6 kB): PyPy, macOS 10.14+ x86-64
  • pyhash-0.9.3-pp270-pypy_41-macosx_10_14_x86_64.whl (436.1 kB): PyPy, macOS 10.14+ x86-64
  • pyhash-0.9.3-cp37-cp37m-macosx_10_14_x86_64.whl (232.1 kB): CPython 3.7m, macOS 10.14+ x86-64
  • pyhash-0.9.3-cp27-cp27m-macosx_10_14_x86_64.whl (234.4 kB): CPython 2.7m, macOS 10.14+ x86-64
