
Module and command-line tool that wraps around hashlib and zlib to facilitate generating checksums / hashes of files and directories.

Project description


Python module to facilitate calculating the checksum or hash of a file. Tested against Python 2.7.x, Python 3.6.x, Python 3.7.x, Python 3.8.x, Python 3.9.x, Python 3.10.x, PyPy 2.7.x and PyPy3 3.7.x. Currently supports Adler-32, BLAKE2b, BLAKE2s, CRC32, MD5, SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512.

(Note: BLAKE2b and BLAKE2s are only supported on Python 3.6.x and later.)

FileHash class

The FileHash class wraps the hashlib module (which provides hashing for BLAKE2b, BLAKE2s, MD5, SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512) and the zlib module (which provides checksums for Adler-32 and CRC32). It contains the following methods:

  • hash_file(filename) - Calculate the file hash for a single file. Returns a string with the hex digest.

  • hash_files(filenames) - Calculate the file hash for multiple files. Returns a list of tuples where each tuple contains the filename and the calculated hash.

  • hash_dir(path, pattern='*') - Calculate the file hashes for an entire directory. Returns a list of tuples where each tuple contains the filename and the calculated hash.

  • cathash_files(filenames) - Calculate a single hash for multiple files. Files are sorted by their individual hash values and then traversed in that order to generate a combined hash value. Returns a string with the hex digest.

  • cathash_dir(path, pattern='*') - Calculate a single hash for an entire directory of files. Files are sorted by their individual hash values and then traversed in that order to generate a combined hash value. Returns a string with the hex digest.

  • verify_sfv(sfv_filename) - Reads the specified SFV (Simple File Verification) file and calculates the CRC32 checksum for the files listed, comparing the calculated CRC32 checksums against the specified expected checksums. Returns a list of tuples where each tuple contains the filename and a boolean value indicating if the calculated CRC32 checksum matches the expected CRC32 checksum. To find out more about SFV files, see the Simple file verification entry in Wikipedia.

  • verify_checksums(checksum_filename) - Reads the specified file and calculates the hashes for the files listed, comparing the calculated hashes against the specified expected hashes. Returns a list of tuples where each tuple contains the filename and a boolean value indicating if the calculated hash matches the expected hash.
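The combined hash described for cathash_files and cathash_dir can be sketched roughly as follows. This is a minimal illustration, not the library's code: the function name combined_hash is hypothetical, and feeding each file's contents into a single hasher in sorted order is an assumption about how the combined value is built. Files are read whole here for brevity.

```python
import hashlib
from pathlib import Path


def combined_hash(paths, algorithm="sha256"):
    """Return one hex digest for several files, independent of input order.

    Hypothetical helper; assumes the combined value is the hash of the
    file contents concatenated in order of their individual digests.
    """
    def digest_of(path):
        # Individual digest of one file (whole file read for brevity).
        return hashlib.new(algorithm, Path(path).read_bytes()).hexdigest()

    # Sort by each file's own digest so the result does not depend on
    # the order or the names of the paths passed in.
    ordered = sorted(paths, key=digest_of)

    hasher = hashlib.new(algorithm)
    for path in ordered:
        hasher.update(Path(path).read_bytes())
    return hasher.hexdigest()
```

Because the sort key is the content hash rather than the filename, passing the same files in a different order gives the same result.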

The checksum file read by verify_checksums is expected to be a plain text file where each line has an entry formatted as follows:

{hash}[SPACE][ASTERISK]{filename}

This is the format used by the sha1sum family of tools when generating checksum files. Here is an example generated by sha1sum:

f7ef3b7afaf1518032da1b832436ef3bbfd4e6f0 *lorem_ipsum.txt
03da86258449317e8834a54cf8c4d5b41e7c7128 *lorem_ipsum.zip
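To make the line format concrete, here is a small sketch of how one such line could be split into its parts. The function name parse_checksum_line is hypothetical and this is not necessarily how filehash parses the file. Tolerating a missing asterisk is an added assumption, not something the description above states.

```python
def parse_checksum_line(line):
    """Split one checksum-file line into (expected_hash, filename).

    Hypothetical helper illustrating the {hash}[SPACE][ASTERISK]{filename}
    layout; not taken from the filehash source.
    """
    # Everything before the first space is the hex digest.
    expected_hash, _, rest = line.rstrip("\r\n").partition(" ")
    # Drop the asterisk if present; otherwise strip any extra separator
    # space (assumption: lines without an asterisk are also accepted).
    filename = rest[1:] if rest.startswith("*") else rest.lstrip(" ")
    return expected_hash.lower(), filename
```

For the first example line above, this returns the SHA-1 digest string and 'lorem_ipsum.txt'.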

The FileHash constructor has two optional arguments:

  • hash_algorithm='sha256' - Specifies the hashing algorithm to use. See filehash.SUPPORTED_ALGORITHMS for the list of supported hash / checksum algorithms. Defaults to SHA256.

  • chunk_size=4096 - Integer specifying the chunk size (in bytes) to use when reading the file. Reading in chunks avoids loading a very large file into memory all at once. The default chunk size is 4096 bytes.
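The idea behind chunk_size can be illustrated with plain hashlib. The sketch below is an assumption about the general technique, not the library's implementation, and hash_stream is a hypothetical name. It shows that feeding a hasher piece by piece produces the same digest as hashing everything at once, so the chunk size affects memory use but not the result.

```python
import hashlib


def hash_stream(stream, algorithm="sha256", chunk_size=4096):
    """Hash a binary file-like object, reading chunk_size bytes at a time."""
    hasher = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        hasher.update(chunk)
    return hasher.hexdigest()
```

A file would be passed in opened in binary mode, for example `with open(path, "rb") as f: hash_stream(f)`.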

Example usage

The library can be used as follows:

>>> import os
>>> from filehash import FileHash
>>> md5hasher = FileHash('md5')
>>> md5hasher.hash_file("./testdata/lorem_ipsum.txt")
'72f5d9e3a5fa2f2e591487ae02489388'
>>> sha1hasher = FileHash('sha1')
>>> sha1hasher.hash_dir("./testdata", "*.zip")
[FileHashResult(filename='lorem_ipsum.zip', hash='03da86258449317e8834a54cf8c4d5b41e7c7128')]
>>> sha512hasher = FileHash('sha512')
>>> os.chdir("./testdata")
>>> sha512hasher.verify_checksums("./hashes.sha512")
[VerifyHashResult(filename='lorem_ipsum.txt', hashes_match=True), VerifyHashResult(filename='lorem_ipsum.zip', hashes_match=True)]
>>> crc32hasher = FileHash('crc32')
>>> crc32hasher.verify_sfv("./lorem_ipsum.sfv")
[VerifyHashResult(filename='lorem_ipsum.txt', hashes_match=True), VerifyHashResult(filename='lorem_ipsum.zip', hashes_match=True)]

chkfilehash command line tool

A command-line tool called chkfilehash is also included with the filehash package. Here is an example of how the tool can be used:

$ chkfilehash -a sha512 -c hashes.sha512
lorem_ipsum.txt: OK
lorem_ipsum.zip: OK
$ chkfilehash -a crc32 lorem_ipsum.zip
7425D3BE *lorem_ipsum.zip
$

Run the tool without any parameters or with the -h / --help switch to get a usage screen.
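For readers curious how a CRC32 line like the one printed above might be produced, here is a minimal sketch using zlib directly. The function name crc32_hex is hypothetical, and the eight-digit uppercase formatting is inferred from the sample output rather than from the chkfilehash source.

```python
import zlib


def crc32_hex(stream, chunk_size=4096):
    """Return the CRC32 of a binary stream as 8 uppercase hex digits."""
    checksum = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # zlib.crc32 accepts the running value, so chunks can be chained.
        checksum = zlib.crc32(chunk, checksum)
    # Mask to an unsigned 32-bit value before formatting.
    return "{:08X}".format(checksum & 0xFFFFFFFF)
```

The same chaining pattern works for Adler-32 through zlib.adler32, which should be started from a running value of 1 rather than 0.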

License

This is released under an MIT license. See the LICENSE file in this repository for more information.
