
Speedy library to detect file type from initial part of file content


A fast Python library for determining the MIME type of a file.


Defity (Detect file type) is a library for Python applications that guesses a file's type reliably, based not on the filename extension (*.png, *.pdf) but on the actual file content. It does what the file command and the libmagic library do, but with a different strategy.

📕 Documentation: defity.readthedocs.io

Install

$ pip install defity

Usage

>>> import defity
>>> defity.from_file('path/to/landscape.png')
'image/png'
>>> with open('path/to/landscape.png', 'rb') as f:
...     defity.from_file(f)
...
'image/png'

>>> defity.from_bytes(b'some-binary-content')
'image/png'
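
One common use of content-based detection is checking that a file's extension agrees with what the file actually contains. The following is a minimal sketch, not part of Defity: the EXPECTED_BY_EXTENSION table and the extension_matches_content helper are hypothetical names introduced here, and the detector is passed in as an argument so the snippet runs without Defity installed.

```python
import os
from typing import Callable, Dict

# Hypothetical table of extensions and the MIME type each should carry.
EXPECTED_BY_EXTENSION: Dict[str, str] = {
    ".png": "image/png",
    ".pdf": "application/pdf",
    ".zip": "application/zip",
}


def extension_matches_content(path: str, detect: Callable[[str], str]) -> bool:
    """Return True only if the detected MIME type matches the extension.

    `detect` is any callable that maps a path to a MIME type string.
    """
    extension = os.path.splitext(path)[1].lower()
    expected = EXPECTED_BY_EXTENSION.get(extension)
    if expected is None:
        # Unknown extension: nothing to compare against, so do not vouch for it.
        return False
    return detect(path) == expected
```

With Defity installed, defity.from_file can be passed as the detect argument, for example extension_matches_content('path/to/landscape.png', defity.from_file).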

How does it differ from libmagic-based libraries?

Many Python libraries do the same thing, and most of them are based on the well-known libmagic. Defity is based on the Rust library tree_magic_mini, which in turn is a fork of tree_magic, another Rust library. The tree_magic documentation describes how it differs from libmagic:

Unlike the typical approach that libmagic and file(1) uses, this loads all the file types in a tree based on subclasses. (EX: application/vnd.openxmlformats-officedocument.wordprocessingml.document (MS Office 2007) subclasses application/zip which subclasses application/octet-stream) Then, instead of checking the file against every file type, it can traverse down the tree and only check the file types that make sense to check. (After all, the fastest check is the check that never gets run.)

This library also provides the ability to check if a file is a certain type without going through the process of checking it against every file type.

And this is what tree_magic_mini improves over tree_magic:

Reduced copying and memory allocation, for a slight increase in speed and decrease in memory use.

So Defity should perform better than libmagic-based libraries that serve the same purpose.
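
The tree-based strategy quoted above can be illustrated with a toy sketch. This is not how tree_magic_mini or Defity is implemented: the three-type tree, the CHECKS table, and the detect function are assumptions made up for illustration, and the DOCX check in particular is a crude stand-in. The point is only that a check for a subclass runs when its parent type has already matched.

```python
from typing import Callable, Dict, List, Optional

ROOT = "application/octet-stream"
ZIP = "application/zip"
PNG = "image/png"
DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

# Each type lists its direct subclasses (its children in the tree).
SUBCLASSES: Dict[str, List[str]] = {
    ROOT: [ZIP, PNG],
    ZIP: [DOCX],
}

# One content check per type. The ZIP and PNG checks use the real file
# signatures; the DOCX check is a hypothetical simplification.
CHECKS: Dict[str, Callable[[bytes], bool]] = {
    ZIP: lambda data: data.startswith(b"PK\x03\x04"),
    PNG: lambda data: data.startswith(b"\x89PNG\r\n\x1a\n"),
    DOCX: lambda data: b"word/" in data,
}


def detect(data: bytes, node: str = ROOT, ran: Optional[List[str]] = None) -> str:
    """Walk down the tree, testing only the children of the matched type."""
    for child in SUBCLASSES.get(node, []):
        if ran is not None:
            ran.append(child)  # record which checks were actually executed
        if CHECKS[child](data):
            return detect(data, child, ran)
    return node  # no child matched, so this is the most specific type


checked: List[str] = []
print(detect(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8, ran=checked))  # image/png
print(checked)  # ['application/zip', 'image/png']
```

For the PNG input, the DOCX check never runs because its parent, application/zip, did not match.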

Another advantage is that Defity is statically linked to the underlying Rust library and does not depend on a separate libmagic.so. This makes it easier to deploy to cloud function platforms, where you have no control over which system libraries are present.

License

In general, Defity is licensed under Apache-2.0 if it is built without the embedded tree_magic_mini MIME database, and under GPL-3.0 otherwise. Concretely:

  • On Linux, it is licensed under Apache-2.0.

  • On Windows and macOS, it is licensed under GPL-3.0.

This is because Linux systems already come with FreeDesktop’s MIME database, which Defity simply uses. Windows and macOS do not have this database, so Defity has to embed it.
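
For projects that track the licenses of their dependencies, the platform rule above can be written down as a small lookup. This is only a restatement of the rule in this section, not an API that Defity provides; the function name is hypothetical, and platforms the section does not mention are rejected rather than guessed.

```python
import sys


def defity_license_for(platform: str = sys.platform) -> str:
    """Return the license this section states for the given sys.platform value."""
    if platform.startswith("linux"):
        # Linux builds use the system's FreeDesktop MIME database.
        return "Apache-2.0"
    if platform in ("win32", "darwin"):
        # Windows and macOS builds embed the MIME database.
        return "GPL-3.0"
    raise ValueError(f"license not stated for platform {platform!r}")
```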
