Automatic File Identifier

A simple program that identifies a file by its extension or by its file signature.
If neither method succeeds, the file is probably a text file with no signature or extension.

It is not intended to be a complete solution that finds everything about a file.
Its purpose is to provide MIME types for popular, identifiable formats.
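
To illustrate the idea described above, here is a minimal sketch of extension-based and signature-based identification with a plain-text fallback. It is not afi's actual code: the function names, the tiny `SIGNATURES` table, and the order in which the two guesses are tried are all assumptions made for illustration, and the extension lookup relies on Python's standard `mimetypes` module rather than on afi's own data.

```python
import mimetypes
from pathlib import Path

# Hypothetical, deliberately tiny table of magic bytes -> MIME type.
# This is NOT the table afi ships; it only illustrates the technique.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"%PDF-": "application/pdf",
    b"PK\x03\x04": "application/zip",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def guess_by_extension(path):
    """Return the MIME type implied by the file name, or None."""
    mime, _encoding = mimetypes.guess_type(Path(path).name)
    return mime


def guess_by_signature(path):
    """Return the MIME type implied by the file's leading bytes, or None."""
    longest = max(len(signature) for signature in SIGNATURES)
    with open(path, "rb") as handle:
        head = handle.read(longest)
    for signature, mime in SIGNATURES.items():
        if head.startswith(signature):
            return mime
    return None


def sketch_identify(path):
    """Try the extension, then the signature; otherwise assume plain text."""
    # The order of the two checks is an assumption of this sketch.
    return guess_by_extension(path) or guess_by_signature(path) or "text/plain"
```

A file named `blob` that starts with the PNG magic bytes has no usable extension, so the sketch falls through to the signature check. A file with neither a known extension nor a known signature is reported as `text/plain`, mirroring the fallback described above.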

It is meant to be integrated with another project, Advanced Pattern Finder.

Installing

pip install auto-files-identifier

Or clone the repository at https://github.com/supratikchatterjee16/afi and run:

pip install .

Using

There is a signature downloader, tailored to download and store the information found on File Signatures.
It creates a Python file named signatures.py, which needs to be in the same folder as the program.
The program imports this file and uses it.

import afi

# Placeholder path; it can point to a file or a folder
path_to_resource = "/path/to/file/or/folder"
list_mime_tuples = afi.identify(path_to_resource)

# The above gives you a list of tuples containing the MIME types for each identified file.
# Identification uses both the extension and the file signature.

There is also a second way to use it, from the command line:

afi /path/to/file/or/folder
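
Both the Python call and the command accept either a file or a folder. As a rough sketch of how such an argument could be expanded into individual files, the helper below uses only `pathlib`. The name `iter_files` is hypothetical, and recursing into subfolders is an assumption; afi's own traversal may differ.

```python
from pathlib import Path


def iter_files(resource):
    """Yield the file itself, or every regular file below a folder."""
    root = Path(resource)
    if root.is_file():
        yield root
    elif root.is_dir():
        # Recursing into subfolders is an assumption of this sketch.
        for child in sorted(root.rglob("*")):
            if child.is_file():
                yield child
    else:
        raise FileNotFoundError(f"No such file or folder: {resource}")
```

Each yielded path could then be handed to an identification routine one at a time.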

Author

Supratik Chatterjee

Issues

Some popular file extensions and signatures (quite a lot, I suspect) are missing.
I believe there are far more than 650 extensions.

If you find something missing, please let me know.

Downloading

mkdir afi
cd ./afi
git init
git remote add origin https://github.com/supratikchatterjee16/afi.git
git pull origin master
pip install .
