Reversible obfuscated identifier hashes.
BaseHash
========
[![Build Status](https://travis-ci.org/bnlucas/python-basehash.png?branch=master)](https://travis-ci.org/bnlucas/python-basehash)
BaseHash is a small library for creating reversible, obfuscated identifier hashes
of a given base and length. The project is based on the Go library [PseudoCrypt][pc]
by [Kevin Burns][kb]. The library can be extended with custom alphabets and other
bases.
The library uses golden primes, found with the [Baillie-PSW][bp] primality test or
`gmpy2.is_prime` (if available), to hash any number up to `maximum` (`base ** length - 1`).
v3.3.0
------
The primality algorithms received a massive overhaul, including support for
[gmpy2][gmp], which is used if it is available on the system for a further speed increase.
All methods used to check primality in `primes.py` have been optimized and
benchmarked to get the best possible performance when `gmpy2.is_prime`
and `gmpy2.next_prime` are not available.
v3.0.0 vs v2.2.0 without gmpy2
------------------------------
```
--------------------------------------------------------------------------------
basehash 3.0.0 vs basehash 2.2.1 speed comparison. (without gmpy2)
testing against random 128-bit integer with BASE62 and length of 30.
comparing best 100 of 1000 loops.
--------------------------------------------------------------------------------
bh300 @ 0.011989977s
bh220 @ 0.019100001s
--------------------------------------------------------------------------------
```
v3.0.0 vs v2.2.0 with gmpy2
---------------------------
```
--------------------------------------------------------------------------------
basehash 3.0.0 vs basehash 2.2.1 speed comparison. (with gmpy2)
testing against random 128-bit integer with BASE62 and length of 30.
comparing best 100 of 1000 loops.
--------------------------------------------------------------------------------
bh300 @ 0.002969882s
bh220 @ 0.018960006s
--------------------------------------------------------------------------------
```
Install
-------
```
pip install basehash
```
Testing
-------
```
nosetests tests/
```
Encode
------
```python
import basehash
base62 = basehash.base62(8)
encoded = base62.encode(2013)
decoded = base62.decode('WT')
print(encoded, decoded)
```
```
WT 2013
```
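The positional conversion behind `encode` and `decode` can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the `ALPHABET` ordering (digits, then uppercase, then lowercase) and the function names are assumptions, chosen because they reproduce the `WT` result above.

```python
import string

# Assumed base-62 alphabet: digits, then uppercase, then lowercase.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode(num, alphabet=ALPHABET):
    """Convert a non-negative integer to a string in the alphabet's base."""
    if num < 0:
        raise ValueError("num must be non-negative")
    base = len(alphabet)
    if num == 0:
        return alphabet[0]
    digits = []
    while num:
        num, rem = divmod(num, base)
        digits.append(alphabet[rem])
    # Digits were collected least significant first.
    return "".join(reversed(digits))


def decode(text, alphabet=ALPHABET):
    """Convert a string in the alphabet's base back to an integer."""
    base = len(alphabet)
    num = 0
    for char in text:
        num = num * base + alphabet.index(char)
    return num


print(encode(2013), decode("WT"))  # WT 2013
```

Plain encoding like this preserves order and length, so sequential identifiers stay visibly sequential.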
Hash
----
```python
import basehash
base62 = basehash.base62(8)
hashed = base62.hash(2013)
unhashed = base62.unhash('6LhOma5b')
print(hashed, unhashed)
```
```
6LhOma5b 2013
```
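One way to make such a hash reversible, in the spirit of PseudoCrypt, is to multiply by a prime modulo `base ** length` and undo it with the modular inverse. The sketch below illustrates that idea under stated assumptions rather than reproducing BaseHash's code: `PRIME` is an arbitrary stand-in value, `scramble` and `unscramble` are hypothetical names, and the result is left as an integer instead of being encoded into the alphabet. It needs Python 3.8 or later for `pow(x, -1, m)`.

```python
BASE = 62
LENGTH = 8
MODULUS = BASE ** LENGTH            # scrambled values live in [0, MODULUS)
PRIME = 1000003                     # stand-in prime, coprime to MODULUS (assumption)
INVERSE = pow(PRIME, -1, MODULUS)   # modular inverse of PRIME (Python 3.8+)


def scramble(num):
    """Map num to a scattered value in the same range."""
    if not 0 <= num < MODULUS:
        raise ValueError("num out of range for this length")
    return (num * PRIME) % MODULUS


def unscramble(value):
    """Undo scramble() by multiplying with the modular inverse."""
    return (value * INVERSE) % MODULUS


value = scramble(2013)
print(unscramble(value) == 2013)  # True
```

Because `PRIME` and `MODULUS` share no factors, the multiplication is a one-to-one mapping on the range, which is what makes the round trip exact.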
Generating your own primes
--------------------------
The `GENERATOR` variable defaults to the golden ratio, `1.618033988749894848`, and
is used to find the next highest prime above `base ** length * generator`. It can be
overridden when creating a base:
```python
import basehash
base62 = basehash.base62(generator=1.75)
```
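The prime selection described above can be sketched as follows. This is a rough illustration, not the code in `primes.py`: it swaps in plain trial division for Baillie-PSW or `gmpy2.next_prime`, so it only suits small values, and the names `is_prime`, `next_prime`, and `pick_prime` are hypothetical.

```python
GOLDEN_RATIO = 1.618033988749894848


def is_prime(n):
    """Trial division; fine for small n, far too slow for real hash lengths."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    factor = 3
    while factor * factor <= n:
        if n % factor == 0:
            return False
        factor += 2
    return True


def next_prime(n):
    """Return the smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


def pick_prime(base, length, generator=GOLDEN_RATIO):
    """Next prime above base ** length * generator (float math, small inputs only)."""
    return next_prime(int(base ** length * generator))


print(pick_prime(10, 4))                  # default generator
print(pick_prime(10, 4, generator=1.75))  # overridden generator
```

A different generator simply moves the starting point of the search, so each generator value yields a different prime and therefore a different hash sequence.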
Maximum number while hashing
----------------------------
Any given base and length has a maximum number that can be hashed:
`base ** length - 1`. The `maximum` attribute reports it.
```python
import basehash
base36 = basehash.base36(12)
print(base36.maximum)
```
```
4738381338321616895
```
So with the max number for `base36` at length `12` as `4738381338321616895` we
get the following:
```python
import basehash
base36 = basehash.base36(12)
hashed = base36.hash(4738381338321616895)
# 'DR10828P4CZP'
hashed = base36.hash(4738381338321616896)
# ValueError: Number is too large for given length. Maximum is 36^12 - 1.
```
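The limit can be checked by hand. A minimal sketch, assuming only the `base ** length - 1` formula stated above; `maximum_for` and `check_range` are hypothetical helpers, not part of BaseHash.

```python
def maximum_for(base, length):
    """Largest integer representable in `length` digits of `base`."""
    return base ** length - 1


def check_range(num, base, length):
    """Return num unchanged, or raise ValueError when it cannot fit."""
    limit = maximum_for(base, length)
    if not 0 <= num <= limit:
        raise ValueError(
            "Number is out of range for given length. "
            "Maximum is %d^%d - 1." % (base, length)
        )
    return num


print(maximum_for(36, 12))  # 4738381338321616895
```

Each extra character multiplies the usable range by the base, so choosing the length up front fixes how many identifiers can be hashed.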
Extending
---------
Extending is easy: pass a custom alphabet to `basehash.base`. The next highest prime
is determined dynamically, as fast as I have been able to make it so far.
```python
import basehash
custom = basehash.base('24680ACEGIKMOQSUWYbdfhjlnprtvxz', 8)
print(custom.encode(2013)) # 66x
print(custom.decode('66x')) # 2013
print(custom.hash(2013)) # 8AQAQdYd
print(custom.unhash('8AQAQdYd')) # 2013
print(custom.maximum) # 852891037440 (31 ** 8 - 1)
```
[pc]: https://github.com/KevBurnsJr/pseudocrypt
[kb]: https://github.com/KevBurnsJr
[bp]: http://en.wikipedia.org/wiki/Baillie-PSW_primality_test
[gmp]: https://gmpy2.readthedocs.org/
3.3.0 (2015-06-19)
==================
- Massive overhaul of the `primes.py` methods. Each method was benchmarked to
get the best optimization possible.
- Uses [gmpy2](https://gmpy2.readthedocs.org/) if available, otherwise uses the
`baillie_psw` primality check and the `next_prime` methods in `primes.py`.
- `base.hash()` no longer accepts argument `length`. One instance per hash length.
- `base.maximum()`, `base.maximum_value()`, and `base.prime()` have been
removed. To get the maximum hash value, use `base.maximum`. To get the prime used,
use `base.prime`.
- `primes.miller_rabin` was removed as it was replaced by `primes.baillie_psw`
in v1.0.2.
2.2.1 (2015-05-28)
==================
- Implemented [six](https://bitbucket.org/gutworth/six) to allow use of `xrange`
in Python 2 and `range` in Python 3
2.2.0 (2015-05-14)
==================
- Fixed issues with Python 3.4.
2.1.0 (2013-07-24)
==================
- Create custom random alphabets using `basehash.generate_alphabet(alphabet)`.
- BaseHash.base now checks `alphabet` for duplicates using `set`.
2.0.2 (2013-07-10)
==================
- `base` and `baseN` now accept a length parameter, defaulting to `HASH_LENGTH`, so
that the length for `baseN.hash(num, length)` is set globally, not locally.
2.0.0 (2013-07-07)
==================
- Moved everything to an object. Removed baseN.py files, allows for easier
configuration of `GENERATOR` and for extending to a custom alphabet.
1.0.7 (2013-07-06)
==================
- There was an issue with hashes sometimes being returned one to two characters
shorter than `length`, causing `base.base_unhash` to not function properly. To
fix this, the hashes are right-padded with `alphabet[0]`.
- Since `0` raises an error inside `primes.invmul`, `base.base_unhash` is unable
to unhash it. To allow the start of your number sequence to be `0` instead of
`1`, if needed, hashing `base.base_hash(0, length=6)` will return
`''.rjust(length, alphabet[0])`.
1.0.6 (2013-06-29)
==================
- Fixed issues with setup.py. First time using a setup.py within a package,
first time publishing the library outside of GitHub.
1.0.5 (2013-06-28)
==================
- Added nose unittests.
1.0.4 (2013-06-28)
==================
- Added setup.py, LICENSE, HISTORY.rst, and .travis.yaml.
1.0.3 (2013-06-27)
==================
- Added a simple test for `prime < 31` to reduce calculation time.
- Fixed issue of `strong_pseudoprime(n, 3)` giving false results.
1.0.2 (2013-06-27)
==================
- Changed primality test from Miller-Rabin to Baillie-PSW. This algorithm is
significantly faster.
- Changed determination to use `sqrt(n)` or `isqrt(n)` to an improved version of
`isqrt(n)`.
- BaseHash is now PEP compliant.
1.0.1 (2013-06-25)
==================
- Changed primality test from Fermat to Miller-Rabin. This reduces false
results caused by pseudoprimes.
1.0.0 (2013-06-24)
==================
- Released code to GitHub repository python-basehash
https://github.com/bnlucas/python-basehash
0.0.1 (2013-06-23)
==================
- Initialization