Python Non-cryptographic Hash Library
Introduction
pyhash is a Python non-cryptographic hash library.
It provides several common hash algorithms with C/C++ implementations for performance and compatibility.
>>> import pyhash
>>> hasher = pyhash.fnv1_32()
>>> hasher('hello world')
2805756500L
>>> hasher('hello', ' ', 'world')
2805756500L
>>> hasher('world', seed=hasher('hello '))
2805756500L
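The last call above works because FNV folds the input into a running state one byte at a time, so the hash of a prefix can serve as the seed for the remainder. A minimal pure-Python sketch of 32-bit FNV-1 illustrates the idea. The function name fnv1_32 and its default seed (the standard FNV offset basis) are assumptions made for this illustration; the values it returns are not guaranteed to match pyhash.fnv1_32, which may use a different default seed.

```python
FNV_32_PRIME = 0x01000193         # standard 32-bit FNV prime
FNV_32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis (assumed default seed)


def fnv1_32(data, seed=FNV_32_OFFSET_BASIS):
    """Illustrative 32-bit FNV-1 over a bytes object (not pyhash's implementation)."""
    state = seed
    for byte in bytearray(data):  # bytearray yields ints on Python 2 and 3
        state = (state * FNV_32_PRIME) & 0xFFFFFFFF  # multiply, keep 32 bits
        state ^= byte                                # then XOR in the next byte
    return state


whole = fnv1_32(b'hello world')
chained = fnv1_32(b'world', seed=fnv1_32(b'hello '))
print(whole == chained)  # True: the prefix hash acts as the seed for the suffix
```

The equality holds for any algorithm whose seed has the same width as its hash value and whose state is exactly the running hash.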
It can also be used to generate fingerprints without a seed.
>>> import pyhash
>>> fp = pyhash.farm_fingerprint_64()
>>> fp('hello')
13009744463427800296L
>>> fp('hello', 'world')
[13009744463427800296L, 16436542438370751598L]
Notes
hasher('hello', ' ', 'world') is syntactic sugar for hasher('world', seed=hasher(' ', seed=hasher('hello'))), and may not be equal to hasher('hello world'), because some hash algorithms use different hash and seed sizes.
For example, metro hash always uses a 32-bit seed for 64/128-bit hash values.
>>> import pyhash
>>> hasher = pyhash.metro_64()
>>> hasher('hello world')
5622782129197849471L
>>> hasher('hello', ' ', 'world')
16402988188088019159L
>>> hasher('world', seed=hasher(' ', seed=hasher('hello')))
16402988188088019159L
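The variadic call can be pictured as a left fold over the arguments, with each result passed on as the next seed. The sketch below is a pure-Python illustration under stated assumptions: hash_parts and toy_hash_64 are hypothetical names, and toy_hash_64 is a 64-bit FNV-1 stand-in, not MetroHash. Its seed_bits parameter mimics an algorithm whose seed is narrower than its hash value.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF
FNV_64_PRIME = 0x100000001B3
FNV_64_OFFSET_BASIS = 0xCBF29CE484222325


def toy_hash_64(data, seed=FNV_64_OFFSET_BASIS, seed_bits=64):
    """64-bit FNV-1 stand-in; only the low `seed_bits` bits of the seed are used."""
    state = seed & ((1 << seed_bits) - 1)
    for byte in bytearray(data):
        state = ((state * FNV_64_PRIME) & MASK_64) ^ byte
    return state


def hash_parts(hash_fn, parts, **kwargs):
    """Hash a non-empty sequence left to right, passing each result as the next seed."""
    parts = list(parts)
    result = hash_fn(parts[0], **kwargs)
    for part in parts[1:]:
        result = hash_fn(part, seed=result, **kwargs)
    return result


parts = [b'hello', b' ', b'world']

# Full-width seed: chaining reproduces the hash of the concatenated input.
print(hash_parts(toy_hash_64, parts) == toy_hash_64(b'hello world'))

# 32-bit seed: the upper half of each intermediate hash is dropped, so the
# chained result will generally differ from the single-call result.
print(hash_parts(toy_hash_64, parts, seed_bits=32)
      == toy_hash_64(b'hello world', seed_bits=32))
```

When the seed is as wide as the hash value, no intermediate state is lost and the two forms agree; once the seed is narrower, the fold carries only part of each intermediate hash forward.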
Installation
$ pip install pyhash
Notes
pyhash only supports pypy v6.0 or newer; please download and install the latest pypy.
Algorithms
pyhash supports the following hash algorithms:
- FNV (Fowler-Noll-Vo) hash
  - fnv1_32
  - fnv1a_32
  - fnv1_64
  - fnv1a_64
- MurmurHash
  - murmur1_32
  - murmur1_aligned_32
  - murmur2_32
  - murmur2a_32
  - murmur2_aligned_32
  - murmur2_neutral_32
  - murmur2_x64_64a
  - murmur2_x86_64b
  - murmur3_32
  - murmur3_x86_128
  - murmur3_x64_128
- lookup3
  - lookup3
  - lookup3_little
  - lookup3_big
- SuperFastHash
  - super_fast_hash
- City Hash
  - city_32
  - city_64
  - city_128
  - city_crc_128
  - city_fingerprint_256
- Spooky Hash
  - spooky_32
  - spooky_64
  - spooky_128
- FarmHash
  - farm_32
  - farm_64
  - farm_128
  - farm_fingerprint_32
  - farm_fingerprint_64
  - farm_fingerprint_128
- MetroHash
  - metro_64
  - metro_128
  - metro_crc_64
  - metro_crc_128
- MumHash
  - mum_64
- T1Ha
  - t1ha2 (64-bit little-endian)
  - t1ha2_128 (128-bit little-endian)
  - t1ha1 (64-bit native-endian)
  - t1ha1_le (64-bit little-endian)
  - t1ha1_be (64-bit big-endian)
  - t1ha0 (64-bit, chooses the fastest function at runtime)
  - t1_32
  - t1_32_be
  - t1_64
  - t1_64_be
- XXHash
  - xx_32
  - xx_64
String and Bytes literals
Python has two types that can be used to represent string literals, and the same text produces different hash values depending on which type is used.
- For Python 2.x string literals, str will be used by default; unicode can be used with the u prefix.
- For Python 3.x string and bytes literals, unicode will be used by default; bytes can be used with the b prefix.
For example,
$ python2
Python 2.7.15 (default, Jun 17 2018, 12:46:58)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
4138058784L
>>> hasher(u'foo')
2085578581L
>>> hasher(b'foo')
4138058784L
$ python3
Python 3.7.0 (default, Jun 29 2018, 20:13:13)
[Clang 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
2085578581
>>> hasher(u'foo')
2085578581
>>> hasher(b'foo')
4138058784
You can also import unicode_literals to use unicode literals in Python 2.x:
from __future__ import unicode_literals
In general, it is more compelling to use unicode_literals when back-porting new or existing Python 3 code to Python 2/3 than when porting existing Python 2 code to 2/3. In the latter case, explicitly marking up all unicode string literals with u'' prefixes would help to avoid unintentionally changing the existing Python 2 API. However, if changing the existing Python 2 API is not a concern, using unicode_literals may speed up the porting process.
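If the same hash value is needed regardless of whether a caller passes text or bytes, one option is to encode text to bytes explicitly before hashing. The following is a minimal sketch assuming Python 3 and UTF-8 encoding. The helpers to_bytes and stable_hash are hypothetical, and zlib.crc32 from the standard library stands in for a pyhash hasher so that the example runs without pyhash installed.

```python
import zlib


def to_bytes(value, encoding='utf-8'):
    """Return `value` as bytes, encoding text with `encoding` if necessary."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def stable_hash(value, hash_fn=zlib.crc32):
    """Hash text and bytes identically by normalizing to bytes first."""
    return hash_fn(to_bytes(value))


print(stable_hash('foo') == stable_hash(b'foo'))  # True
```

Because the hashers shown earlier are called directly on a bytes argument, one such as pyhash.murmur3_32() could presumably be passed as hash_fn in place of zlib.crc32.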