Python Non-cryptographic Hash Library
Introduction
pyhash is a Python non-cryptographic hash library.
It provides several common hash algorithms with C/C++ implementations for performance and compatibility.
>>> import pyhash
>>> hasher = pyhash.fnv1_32()
>>> hasher('hello world')
2805756500L
>>> hasher('hello', ' ', 'world')
2805756500L
>>> hasher('world', seed=hasher('hello '))
2805756500L
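The three calls above agree because FNV keeps no state beyond the running hash value, so passing a previous result as the seed simply resumes the computation. The following is a minimal pure-Python sketch of FNV-1 (32-bit) written for illustration; it is not pyhash's C implementation, and the function name `fnv1_32` and its `seed` default are assumptions made for this example.

```python
FNV1_32_OFFSET_BASIS = 2166136261  # standard FNV 32-bit offset basis
FNV_32_PRIME = 16777619            # standard FNV 32-bit prime


def fnv1_32(data, seed=FNV1_32_OFFSET_BASIS):
    """Pure-Python FNV-1 (32-bit): multiply by the prime, then XOR each byte."""
    state = seed
    for byte in bytearray(data):  # bytearray yields ints on Python 2 and 3
        state = (state * FNV_32_PRIME) & 0xFFFFFFFF
        state ^= byte
    return state


# Hashing the whole input at once ...
whole = fnv1_32(b'hello world')
# ... matches hashing the tail with the hash of the head as the seed.
piecewise = fnv1_32(b'world', seed=fnv1_32(b'hello '))
print(whole == piecewise)  # True
```

This equivalence depends on the seed having the same width as the hash value, which does not hold for every algorithm.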
It can also be used to generate fingerprints without a seed.
>>> import pyhash
>>> fp = pyhash.farm_fingerprint_64()
>>> fp('hello')
13009744463427800296L
>>> fp('hello', 'world')
[13009744463427800296L, 16436542438370751598L]
Notes
hasher('hello', ' ', 'world') is syntactic sugar for hasher('world', seed=hasher(' ', seed=hasher('hello'))), and may not equal hasher('hello world'), because some hash algorithms use different hash and seed sizes.
For example, metro hash always uses a 32-bit seed for 64/128-bit hash values.
>>> import pyhash
>>> hasher = pyhash.metro_64()
>>> hasher('hello world')
5622782129197849471L
>>> hasher('hello', ' ', 'world')
16402988188088019159L
>>> hasher('world', seed=hasher(' ', seed=hasher('hello')))
16402988188088019159L
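The multi-argument form can be pictured as a left fold over the arguments, where each result becomes the seed of the next call. The sketch below assumes only that a hasher is a callable accepting a `seed` keyword; `chained` and `toy_hash64` are hypothetical names introduced here, and `toy_hash64` is a toy stand-in (an FNV-1a variant that keeps only 32 bits of its seed), not MetroHash or any pyhash function.

```python
from functools import reduce

MASK64 = (1 << 64) - 1
FNV64_OFFSET_BASIS = 0xcbf29ce484222325  # standard FNV 64-bit offset basis
FNV64_PRIME = 0x100000001b3              # standard FNV 64-bit prime


def toy_hash64(data, seed=0):
    """Toy 64-bit hash that keeps only the low 32 bits of its seed."""
    state = FNV64_OFFSET_BASIS ^ (seed & 0xFFFFFFFF)  # upper seed bits are dropped
    for byte in bytearray(data):
        state = ((state ^ byte) * FNV64_PRIME) & MASK64
    return state


def chained(hasher, *parts):
    """Hash parts left to right, feeding each result in as the next seed."""
    first, rest = parts[0], parts[1:]
    return reduce(lambda seed, part: hasher(part, seed=seed), rest, hasher(first))


folded = chained(toy_hash64, b'hello', b' ', b'world')
nested = toy_hash64(b'world', seed=toy_hash64(b' ', seed=toy_hash64(b'hello')))
print(folded == nested)  # True: the fold is exactly the nested seed form

# Because the 64-bit result is squeezed into a 32-bit seed between parts,
# the folded value is not guaranteed to match a single pass over the input.
print(folded, toy_hash64(b'hello world'))
```

The fold always equals the nested call by construction; whether it also equals the single-pass hash depends on how much of the previous result survives as the seed.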
Installation
$ pip install pyhash
Notes
If pip install fails with errors similar to the following (#27):
/usr/lib/gcc/x86_64-linux-gnu/6/include/smmintrin.h:846:1: error: inlining failed in call to always_inline 'long long unsigned int _mm_crc32_u64(long long unsigned int, long long unsigned int)': target specific option mismatch
_mm_crc32_u64 (unsigned long long __C, unsigned long long __V)
^~~~~~~~~~~~~
src/smhasher/metrohash64crc.cpp:52:34: note: called from here
v[0] ^= _mm_crc32_u64(v[0], read_u64(ptr)); ptr += 8;
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
Please upgrade pip and setuptools to the latest version and try again:
$ pip install --upgrade pip setuptools
Notes
If pip install fails on macOS with errors similar to the following (#28):
creating build/temp.macosx-10.6-intel-3.6
...
/usr/bin/clang -fno-strict-aliasing -Wsign-compare -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -arch i386 -arch x86_64 -g -c src/smhasher/metrohash64crc.cpp -o build/temp.macosx-10.6-intel-3.6/src/smhasher/metrohash64crc.o -msse4.2 -maes -mavx -mavx2
src/smhasher/metrohash64crc.cpp:52:21: error: use of undeclared identifier '_mm_crc32_u64'
v[0] ^= _mm_crc32_u64(v[0], read_u64(ptr)); ptr += 8;
^
You may try:
$ CFLAGS="-mmacosx-version-min=10.13" pip install pyhash
Notes
pyhash only supports PyPy v6.0 or newer; please download and install the latest PyPy.
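A quick way to check the interpreter before installing is sketched below. It relies on the standard `platform.python_implementation()` call and on `sys.pypy_version_info`, which is only present on PyPy; the helper name `interpreter_is_supported` is hypothetical and not part of pyhash.

```python
import platform
import sys

MIN_PYPY = (6, 0)  # minimum PyPy version stated in the note above


def interpreter_is_supported(min_pypy=MIN_PYPY):
    """Return False only when running on a PyPy older than min_pypy."""
    if platform.python_implementation() != 'PyPy':
        return True  # the version floor applies to PyPy only
    # sys.pypy_version_info exists only on PyPy: (major, minor, micro, ...)
    return tuple(sys.pypy_version_info[:2]) >= tuple(min_pypy)


print(interpreter_is_supported())
```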
Algorithms
pyhash supports the following hash algorithms:
- FNV (Fowler-Noll-Vo) hash
  - fnv1_32
  - fnv1a_32
  - fnv1_64
  - fnv1a_64
- MurmurHash
  - murmur1_32
  - murmur1_aligned_32
  - murmur2_32
  - murmur2a_32
  - murmur2_aligned_32
  - murmur2_neutral_32
  - murmur2_x64_64a
  - murmur2_x86_64b
  - murmur3_32
  - murmur3_x86_128
  - murmur3_x64_128
- lookup3
  - lookup3
  - lookup3_little
  - lookup3_big
- SuperFastHash
  - super_fast_hash
- City Hash
  - city_32
  - city_64
  - city_128
  - city_crc_128
  - city_fingerprint_256
- Spooky Hash
  - spooky_32
  - spooky_64
  - spooky_128
- FarmHash
  - farm_32
  - farm_64
  - farm_128
  - farm_fingerprint_32
  - farm_fingerprint_64
  - farm_fingerprint_128
- MetroHash
  - metro_64
  - metro_128
  - metro_crc_64
  - metro_crc_128
- MumHash
  - mum_64
- T1Ha
  - t1ha2 (64-bit little-endian)
  - t1ha2_128 (128-bit little-endian)
  - t1ha1 (64-bit native-endian)
  - t1ha1_le (64-bit little-endian)
  - t1ha1_be (64-bit big-endian)
  - t1ha0 (64-bit, chooses the fastest function at runtime)
  - t1_32
  - t1_32_be
  - t1_64
  - t1_64_be
- XXHash
  - xx_32
  - xx_64
String and Bytes literals
Python has two types that can be used to represent string literals, and the hash values of the two types are different.
- For Python 2.x string literals, str is used by default; unicode can be used with the u prefix.
- For Python 3.x string and bytes literals, unicode is used by default; bytes can be used with the b prefix.
For example,
$ python2
Python 2.7.15 (default, Jun 17 2018, 12:46:58)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
4138058784L
>>> hasher(u'foo')
2085578581L
>>> hasher(b'foo')
4138058784L
$ python3
Python 3.7.0 (default, Jun 29 2018, 20:13:13)
[Clang 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
2085578581
>>> hasher(u'foo')
2085578581
>>> hasher(b'foo')
4138058784
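In both sessions above, only the bytes literal produces the same value on Python 2 and Python 3. One way to keep hashes stable across interpreters, offered here as a suggestion rather than something pyhash requires, is to convert text to bytes explicitly before hashing. The helper `to_bytes` and the choice of UTF-8 are assumptions made for this sketch.

```python
def to_bytes(value, encoding='utf-8'):
    """Return value unchanged if it is bytes, otherwise encode the text."""
    if isinstance(value, bytes):  # on Python 2, bytes is an alias of str
        return value
    return value.encode(encoding)


# Both spellings normalize to the same byte string before hashing.
print(to_bytes(u'foo') == to_bytes(b'foo'))  # True

# Hypothetical usage with a pyhash hasher (not executed here):
#   hasher = pyhash.murmur3_32()
#   hasher(to_bytes(u'foo'))  # hashes the same bytes as hasher(b'foo')
```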
You can also import unicode_literals to use unicode literals in Python 2.x:
from __future__ import unicode_literals
In general, it is more compelling to use unicode_literals when back-porting new or existing Python 3 code to Python 2/3 than when porting existing Python 2 code to 2/3. In the latter case, explicitly marking up all unicode string literals with u'' prefixes would help to avoid unintentionally changing the existing Python 2 API. However, if changing the existing Python 2 API is not a concern, using unicode_literals may speed up the porting process.