Practical Perfect Hashing module

pph

pph generates a minimal order-preserving hash function for a list of keys.
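
To make the terms concrete: a hash function over n keys is minimal when it maps them onto exactly 0..n-1 with no collisions, and order-preserving when the i-th key of the input list maps to i. The Python sketch below checks both properties for any callable. The helper name is_minimal_order_preserving and the dictionary used as a stand-in hash function are illustrative assumptions; neither is part of pph's API.

```python
from typing import Callable, Sequence


def is_minimal_order_preserving(keys: Sequence[str], h: Callable[[str], int]) -> bool:
    """Return True if h maps the i-th key to i for every key in the list."""
    # keys[i] -> i for all i implies no collisions and a range of exactly 0..n-1.
    return all(h(key) == i for i, key in enumerate(keys))


# Stand-in for a generated hash function: a plain lookup table (not pph itself).
keys = ["pear", "apple", "fig"]
table = {key: i for i, key in enumerate(keys)}

print(is_minimal_order_preserving(keys, table.__getitem__))  # True
print(is_minimal_order_preserving(keys, len))  # False
```

A generated hash function should pass this check without storing the keys in a lookup table; the dictionary is only there to keep the sketch self-contained.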

Reference:

G. V. Cormack, R. N. S. Horspool, and M. Kaiserswerth. "Practical perfect hashing." The Computer Journal, 1985.

License

This project is licensed under the Apache License, Version 2.0.

Building

pph depends on the Boost library. Point the compiler and linker at your Boost installation before building:

export LDFLAGS="$LDFLAGS -L/path/to/boost/lib"
export CPPFLAGS="$CPPFLAGS -I/path/to/boost/include"

pph is built with CMake:

mkdir build
cd build
cmake ..
make

Using

The basic command line to generate a hash function from a file containing a list of strings (one per line) is:

pph -i ./file.txt -o ./file.hash

The command line to verify an existing hash function is:

pph --verify ./file.hash

To list the other command-line options, run:

pph --help

The default timeout for creating a hash function is 60000 milliseconds (1 minute).
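
When pph is driven from a script, the commands above can be wrapped with Python's subprocess module. The sketch below uses only the options shown in this section (-i, -o, --verify). The function names are hypothetical, and it assumes that pph is on PATH and reports failure with a non-zero exit status, which this document does not confirm. The optional limit_seconds is an outer limit enforced by Python, separate from pph's own timeout.

```python
import subprocess
from typing import List, Optional


def pph_generate_cmd(input_path: str, output_path: str) -> List[str]:
    """Build the argument list for `pph -i INPUT -o OUTPUT`."""
    return ["pph", "-i", input_path, "-o", output_path]


def pph_verify_cmd(hash_path: str) -> List[str]:
    """Build the argument list for `pph --verify HASHFILE`."""
    return ["pph", "--verify", hash_path]


def run_pph(cmd: List[str], limit_seconds: Optional[float] = None) -> bool:
    """Run a pph command and return True if it exited with status 0.

    Assumption: a non-zero exit status means failure.
    """
    try:
        result = subprocess.run(cmd, timeout=limit_seconds)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # The executable was not found, or the outer limit elapsed.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if run_pph(pph_generate_cmd("./file.txt", "./file.hash")):
        print("verified:", run_pph(pph_verify_cmd("./file.hash")))
    else:
        print("generation failed")
```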

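If pph fails to generate a hash function, try reordering the input file. The commands below write an index file with --index, sort it numerically on its second column, and keep only the first column: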

pph -i file.txt --index > file_index.txt
sort --numeric-sort --key=2 file_index.txt > file_sorted_index.txt
awk -F' ' '{print $1}'  file_sorted_index.txt > file_sorted.txt
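
The sort and awk steps can also be done in Python. This sketch assumes, based only on the commands above, that each line of the index file holds a key followed by a numeric value, separated by whitespace. The function name keys_sorted_by_index is illustrative.

```python
from typing import Iterable, List


def keys_sorted_by_index(index_lines: Iterable[str]) -> List[str]:
    """Return the first-column keys ordered by the numeric value in column two."""
    rows = []
    for line in index_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        rows.append((float(fields[1]), fields[0]))
    # list.sort() is stable, so rows with equal values keep their input order.
    rows.sort(key=lambda row: row[0])
    return [key for _, key in rows]


if __name__ == "__main__":
    with open("file_index.txt") as src, open("file_sorted.txt", "w") as dst:
        for key in keys_sorted_by_index(src):
            dst.write(key + "\n")
```

As with the awk command, keys that contain whitespace would be truncated at the first space. Ties may also be ordered differently than by sort, which falls back to comparing whole lines.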

Python

The Python module also depends on the Boost library. Install Boost and set LDFLAGS and CPPFLAGS before installing the module:

export LDFLAGS="$LDFLAGS -L/path/to/boost/lib"
export CPPFLAGS="$CPPFLAGS -I/path/to/boost/include"

Install the module.

pip3 install pph

Import the module.

from pph import PphHashTable, PphRandomNumber, PphKeyFunctions

See the tests for how to generate a hash function using the Python interface.
