Pure-Python library for computing fuzzy hashes (ssdeep)
This is a pure-Python library for computing context triggered piecewise hashes (CTPH), also called fuzzy hashes and often referred to as ssdeep, after a popular tool of that name. At a very high level, fuzzy hashing is a way to determine whether two inputs are similar, rather than identical. Fuzzy hashes are widely used in digital forensics and malware detection.
This implementation is based on SpamSum by Dr. Andrew Tridgell.
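To illustrate why a conventional hash cannot answer the "are these similar?" question, here is a minimal sketch that uses Python's standard `hashlib` module (not part of ppdeep) on two strings that differ only in letter case. The variable names are illustrative.

```python
import hashlib

# Two inputs that differ only in the case of the last two letters
a = "The equivalence of mass and energy translates into the well-known E = mc2"
b = "The equivalence of mass and energy translates into the well-known E = MC2"

digest_a = hashlib.sha256(a.encode("utf-8")).hexdigest()
digest_b = hashlib.sha256(b.encode("utf-8")).hexdigest()

# Count the hex positions at which the two digests happen to agree
matching = sum(x == y for x, y in zip(digest_a, digest_b))

print(digest_a)
print(digest_b)
print(f"{matching} of {len(digest_a)} hex digits match")
```

The two digests are unrelated even though the inputs are nearly identical, so a cryptographic hash can only confirm exact equality. A fuzzy hash is designed so that similar inputs tend to produce similar signatures.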
Usage
To compute a fuzzy hash, use the hash() function:
```
>>> import ppdeep
>>> h1 = ppdeep.hash('The equivalence of mass and energy translates into the well-known E = mc²')
>>> h1
'3:RC0qYX4LBFA0dxEq4z2LRK+oCKI9VnXn:RvqpLB60dx8ilK+owX'
>>> h2 = ppdeep.hash('The equivalence of mass and energy translates into the well-known E = MC2')
>>> h2
'3:RC0qYX4LBFA0dxEq4z2LRK+oCKI99:RvqpLB60dx8ilK+oA'
```
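The hash strings above follow the conventional ssdeep layout of three colon-separated fields: a block size, a signature computed at that block size, and a signature computed at twice that block size. The sketch below splits a hash into those parts using only the standard library. `FuzzyHash`, `parse_fuzzy_hash` and `block_sizes_compatible` are hypothetical helper names, not part of the ppdeep API, and the compatibility rule reflects how ssdeep-style comparison is generally described rather than a documented guarantee of this library.

```python
from typing import NamedTuple


class FuzzyHash(NamedTuple):
    """Hypothetical container for the three fields of an ssdeep-style hash."""
    block_size: int
    chunk: str          # signature computed at block_size
    double_chunk: str   # signature computed at 2 * block_size


def parse_fuzzy_hash(value: str) -> FuzzyHash:
    """Split 'block_size:chunk:double_chunk' into its parts."""
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 3 colon-separated fields, got {len(parts)}")
    return FuzzyHash(int(parts[0]), parts[1], parts[2])


def block_sizes_compatible(a: FuzzyHash, b: FuzzyHash) -> bool:
    """Assumption: two hashes are only comparable when their block sizes
    are equal or differ by exactly a factor of two."""
    return (a.block_size in (b.block_size, 2 * b.block_size)
            or b.block_size == 2 * a.block_size)


h1 = parse_fuzzy_hash("3:RC0qYX4LBFA0dxEq4z2LRK+oCKI9VnXn:RvqpLB60dx8ilK+owX")
h2 = parse_fuzzy_hash("3:RC0qYX4LBFA0dxEq4z2LRK+oCKI99:RvqpLB60dx8ilK+oA")

print(h1.block_size)                    # 3
print(block_sizes_compatible(h1, h2))   # True
```

Checking block sizes first can be a cheap way to skip pairs of hashes that cannot produce a meaningful score.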
To calculate the level of similarity, use the compare() function, which returns an integer from 0 to 100 (full match):
```
>>> ppdeep.compare(h1, h2)
29
```
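A common use of the score is to look up the closest entry in a set of known hashes. The following is a minimal sketch under stated assumptions: `best_match` is a hypothetical helper, not part of ppdeep, and it receives the scoring function as a parameter so that the block runs even where ppdeep is not installed. The default threshold of 1 is an arbitrary illustrative choice.

```python
from typing import Callable, Mapping, Optional, Tuple


def best_match(
    target: str,
    known: Mapping[str, str],
    compare: Callable[[str, str], int],
    threshold: int = 1,
) -> Optional[Tuple[str, int]]:
    """Return (label, score) for the known hash most similar to target.

    `compare` is expected to behave like ppdeep.compare: it takes two
    hash strings and returns an integer from 0 to 100.
    Returns None when no candidate reaches `threshold`.
    """
    best: Optional[Tuple[str, int]] = None
    for label, candidate in known.items():
        score = compare(target, candidate)
        if score >= threshold and (best is None or score > best[1]):
            best = (label, score)
    return best
```

With ppdeep installed, a call might look like `best_match(ppdeep.hash(data), known_hashes, ppdeep.compare)`, where `known_hashes` is your own mapping from labels to previously computed hashes.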
The hash_from_file() function accepts a filename as its argument and calculates the hash of the file's contents:
```
>>> ppdeep.hash_from_file('.bash_history')
'1536:EXM36dG36x3KW732vOAcg3EP1qKlKozcK0z5G+lEPTssl/7eO7HOBF:tKlKozcWT0'
```
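When many files need hashing, the per-file call can be wrapped in a directory walk. The sketch below is one possible approach: `hash_tree` is a hypothetical helper, not part of ppdeep, and the hashing function is passed in as a parameter so the block does not depend on ppdeep being installed. Skipping unreadable files is a design choice made here for illustration.

```python
import os
from typing import Callable, Dict


def hash_tree(root: str, hash_file: Callable[[str], str]) -> Dict[str, str]:
    """Map every file path under `root` to the value returned by `hash_file`.

    `hash_file` is expected to behave like ppdeep.hash_from_file:
    it takes a filename and returns a hash string.
    """
    results: Dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                results[path] = hash_file(path)
            except OSError:
                # Unreadable file (permissions, broken symlink): skip it
                continue
    return results
```

With ppdeep installed, `hash_tree('some_directory', ppdeep.hash_from_file)` would return a dictionary of fuzzy hashes keyed by path, ready to be compared pairwise with compare().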
Installation
```
$ pip install ppdeep
```
If you want to use the latest version of the code, you can install it from Git:
```
$ git clone https://github.com/elceef/ppdeep.git
$ cd ppdeep
$ pip install .
```
Hashes for ppdeep-20200505-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0335063cc96edfb2f90f546a5812c4a47d122dc7bd764b3eef48303d2bf41939
MD5 | ecb2c108be34fc745f826a4386dbd239
BLAKE2b-256 | a188f652889bc15f3faa9fd973cfa652098dfcfdd6fca6de63bbe1a418734ddb