A library for identifying files from their magic byte signature.
Project description
pyfsig
A Python library for identifying files by their headers (magic bytes). You may notice that on MS Windows systems a file may stop opening properly if you change its file extension; this is because Windows chooses how to open a file from its extension rather than from the file header. Magic bytes (file headers) serve as a way to discover the type of a file without relying on the file extension. This library was written to help Python programmers identify what type of file something is from its magic bytes. It was originally written by the author to reconstruct BLOBs from a legacy database that did not store the original file extensions.
Based on information from: 'Wikipedia - List of file signatures'
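To make the idea concrete, here is a minimal sketch that uses only the standard library and is independent of pyfsig. The `KNOWN_SIGNATURES` table and the `guess_type` helper are hypothetical names chosen for illustration; the three signatures are the well-known leading bytes of PNG, PDF, and ZIP files.

```python
# Illustrative only: a tiny hand-rolled lookup, not pyfsig's implementation.
KNOWN_SIGNATURES = {
    "png": bytes.fromhex("89504e470d0a1a0a"),
    "pdf": b"%PDF-",
    "zip": b"PK\x03\x04",
}


def guess_type(header: bytes) -> list[str]:
    """Return the extensions whose signature the header starts with."""
    return [ext for ext, magic in KNOWN_SIGNATURES.items() if header.startswith(magic)]


# The file name plays no part; only the leading bytes are inspected.
print(guess_type(b"%PDF-1.7\n"))   # ['pdf']
print(guess_type(b"plain text"))   # []
```

A real signature list is much longer and also needs offsets and wildcard bytes, which is what the library's signature dictionaries provide.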
Usage
The library's use should be fairly self-explanatory; here are some examples:
Find matches for a file when you have the file path
file_path = "some/file/path.abc"
matches = find_matches_for_file_path(file_path=file_path)
Find matches for a file header you have read yourself
Important: the "rb" mode here matters. You must read the file as bytes, not as text!
with open(file_path, "rb") as f:
    file_header = f.read(32)
matches = find_matches_for_file_header(header=file_header)
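As a sketch of why binary mode matters, the hypothetical helper below (`read_header`, not part of pyfsig) reads the first bytes of a file and returns a `bytes` object. Opening the same file in text mode would instead return a decoded `str`, which may raise a decoding error or translate line endings.

```python
import os
import tempfile


def read_header(path: str, size: int = 32) -> bytes:
    """Read up to `size` leading bytes of a file in binary mode."""
    with open(path, "rb") as f:
        return f.read(size)


# Demonstrate on a throwaway file that starts with the PNG signature.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(bytes.fromhex("89504e470d0a1a0a") + b"\x00" * 100)

header = read_header(tmp.name)
os.remove(tmp.name)
print(type(header).__name__, len(header))  # bytes 32
```

The resulting `header` value is the kind of object that can be passed as the `header` argument shown above.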
Find matches against your own list of file signatures
You will need to create a FileSignatureDict with at least the following keys:
file_extension
hex
offset
custom_signatures = [
    {
        "file_extension": "abc",
        "hex": [1, 2, 3, None, 5],
        "offset": 0,
    }
]
file_path = "some/file/path.abc"
matches = find_matches_for_file_path(file_path=file_path, signatures=custom_signatures)
Alternatively, you can extend the built-in file signatures with your own custom file signatures.
from pyfsig import SIGNATURES
extended_signatures = [
    *SIGNATURES,
    {
        "file_extension": "abc",
        "hex": [1, 2, 3, None, 5],
        "offset": 0,
    }
]
file_path = "some/file/path.abc"
matches = find_matches_for_file_path(file_path=file_path, signatures=extended_signatures)
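The `hex` and `offset` fields in the custom signatures above can be read as a byte pattern. The snippet below sketches one way such a pattern could be checked against a header. It is not pyfsig's implementation: it assumes that `hex` lists one integer per byte, that `None` matches any byte, and that `offset` is the position in the file where the signature starts. `signature_matches` is a hypothetical helper.

```python
from typing import Optional, Sequence


def signature_matches(
    header: bytes, hex_values: Sequence[Optional[int]], offset: int = 0
) -> bool:
    """Check `header` against a byte pattern, treating None as a wildcard (assumption)."""
    window = header[offset:offset + len(hex_values)]
    if len(window) < len(hex_values):
        return False  # header too short to contain the whole signature
    return all(
        expected is None or actual == expected
        for actual, expected in zip(window, hex_values)
    )


pattern = [1, 2, 3, None, 5]
print(signature_matches(bytes([1, 2, 3, 99, 5, 6]), pattern))       # True
print(signature_matches(bytes([1, 2, 4, 99, 5, 6]), pattern))       # False
print(signature_matches(bytes([0, 0, 1, 2, 3, 7, 5]), pattern, 2))  # True
```

Under these assumptions, the fourth byte of the example signature may take any value, while the other four must match exactly.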
Contributors
Author: Patty C (schlerp)
Thanks to all contributors!!
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pyfsig-1.0.0.tar.gz.
File metadata
- Download URL: pyfsig-1.0.0.tar.gz
- Upload date:
- Size: 9.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.12.0 Darwin/23.3.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | aa04b7850d06cb8339977a30b9aacf87e57e6d3706a7ce0f25f0184a5e27846e
MD5 | 4516ccc039e83ba2c9ef19ce16d24469
BLAKE2b-256 | 27278ae929f70d09b90730518b70b2ee5f4c8c6c1dc35440f28c265e9f8e9664
File details
Details for the file pyfsig-1.0.0-py3-none-any.whl.
File metadata
- Download URL: pyfsig-1.0.0-py3-none-any.whl
- Upload date:
- Size: 9.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.12.0 Darwin/23.3.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 601a7af36dde703aca89d5357c974bd639bb2fa5976a4bc3a13481a40d6744a6
MD5 | bf8f03306d2d2172e1efc6e6a697cfff
BLAKE2b-256 | 9681e1a98125d54dea215ebbf8e9d84aa2f52afd535bdbb22ab122f92e5ff89c