A library for identifying files from their magic byte signature.

pyfsig

A Python library for identifying files by their headers (magic bytes). You may notice that on MS Windows systems a file will not open properly if you change its file extension; this is because Windows chooses how to open a file from its extension rather than from its contents. Magic bytes (file headers) serve as a way to discover the type of a file without relying on the file extension. This library was written to help Python programmers identify what type of file something is from its magic bytes. It was originally written by the author to reconstruct BLOBs from a legacy database which did not store the original file extensions.
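To make the idea concrete, here is a minimal sketch in plain Python that does not use pyfsig at all: it checks whether a byte string starts with the well-known 8-byte PNG signature. The helper name `looks_like_png` is hypothetical and only serves the illustration.

```python
# The PNG signature is the fixed 8-byte sequence 89 50 4E 47 0D 0A 1A 0A.
PNG_MAGIC = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def looks_like_png(header: bytes) -> bool:
    """Return True if the given bytes begin with the PNG signature."""
    return header.startswith(PNG_MAGIC)


# A blob with no file name can still be classified by its first bytes.
blob = PNG_MAGIC + b"\x00\x00\x00\rIHDR"
print(looks_like_png(blob))           # True
print(looks_like_png(b"%PDF-1.7\n"))  # False
```

A signature library generalises this check to many file types at once, which is what makes it useful for data such as database BLOBs that carry no extension.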

Based on information from 'Wikipedia - List of file signatures'.

Usage

The library's use should be fairly self-explanatory; here are some examples:

Find matches for a file when you have the file path

from pyfsig import find_matches_for_file_path  # assumed top-level export, like SIGNATURES

file_path = "some/file/path.abc"
matches = find_matches_for_file_path(file_path=file_path)

Find matches for a file header you have read yourself

Important: the "rb" mode here is essential; you must read the file as bytes!

with open(file_path, "rb") as f:
    file_header = f.read(32)
matches = find_matches_for_file_header(header=file_header)
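The following standard-library sketch illustrates why binary mode matters. It writes a temporary file that begins with the PNG signature, then reads it back in both modes. Nothing here calls pyfsig; the variable names are illustrative only.

```python
import os
import tempfile

# Write a small binary file whose first byte (0x89) is not valid UTF-8.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    tmp_path = tmp.name

try:
    # Binary mode returns the bytes exactly as stored on disk.
    with open(tmp_path, "rb") as f:
        header = f.read(32)

    # Text mode tries to decode the bytes and fails on 0x89 here.
    try:
        with open(tmp_path, "r", encoding="utf-8") as f:
            f.read(32)
        text_mode_ok = True
    except UnicodeDecodeError:
        text_mode_ok = False
finally:
    os.remove(tmp_path)

print(type(header).__name__, header == raw)  # bytes True
print(text_mode_ok)                          # False
```

Even when text mode does not raise, it returns a `str` rather than `bytes`, so the values no longer correspond one-to-one with the bytes on disk.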

Find matches against your own list of file signatures

You will need to create a FileSignatureDict with at least the following keys:

  • file_extension
  • hex
  • offset
custom_signatures = [
    {
        "file_extension": "abc",
        "hex": [1, 2, 3, None, 5],
        "offset": 0,
    }
]

file_path = "some/file/path.abc"
matches = find_matches_for_file_path(file_path=file_path, signatures=custom_signatures)
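As a rough mental model of how such a signature could be compared with a header, here is a hypothetical matcher written in plain Python. It is not pyfsig's implementation. It assumes that `None` in the `hex` list means "any byte is accepted at this position" and that `offset` is the index in the header where the pattern starts; neither assumption is confirmed by the text above.

```python
from typing import List, Optional


def header_matches(header: bytes, hex_pattern: List[Optional[int]], offset: int = 0) -> bool:
    """Hypothetical matcher: compare a header against one signature pattern.

    Assumption: None in the pattern means "any byte" at that position.
    """
    window = header[offset:offset + len(hex_pattern)]
    if len(window) < len(hex_pattern):
        return False  # header too short to contain the full pattern
    return all(
        expected is None or actual == expected
        for actual, expected in zip(window, hex_pattern)
    )


signature = {"file_extension": "abc", "hex": [1, 2, 3, None, 5], "offset": 0}

# The fourth byte is a wildcard under the assumption above, so 99 is accepted.
print(header_matches(bytes([1, 2, 3, 99, 5, 7]), signature["hex"], signature["offset"]))  # True
# The third byte differs from the pattern, so this header is rejected.
print(header_matches(bytes([1, 2, 4, 99, 5, 7]), signature["hex"], signature["offset"]))  # False
```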

Alternatively, you can extend the built-in file signatures with your own custom file signatures.

from pyfsig import SIGNATURES

extended_signatures = [
    *SIGNATURES,
    {
        "file_extension": "abc",
        "hex": [1, 2, 3, None, 5],
        "offset": 0,
    }
]

file_path = "some/file/path.abc"
matches = find_matches_for_file_path(file_path=file_path, signatures=extended_signatures)
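Before passing hand-written signatures to the library, it may help to check that each one carries the three keys listed earlier. The helper below is a hypothetical convenience in plain Python, not part of pyfsig.

```python
REQUIRED_KEYS = {"file_extension", "hex", "offset"}


def missing_keys(signature: dict) -> set:
    """Return the required keys that are absent from a signature dict."""
    return REQUIRED_KEYS - set(signature)


custom_signatures = [
    {"file_extension": "abc", "hex": [1, 2, 3, None, 5], "offset": 0},
    {"file_extension": "xyz", "hex": [9, 9]},  # deliberately missing "offset"
]

for sig in custom_signatures:
    problems = missing_keys(sig)
    if problems:
        print(sig["file_extension"], "is missing", sorted(problems))
# prints: xyz is missing ['offset']
```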

Contributors

Author: Patty C (schlerp)

Thanks to all contributors!!

Download files

Source Distribution: pyfsig-1.0.0.tar.gz (9.6 kB)

Built Distribution: pyfsig-1.0.0-py3-none-any.whl (9.6 kB, Python 3)
