File type identification using libmagic

python-magic

python-magic is a Python interface to the libmagic file type identification library. libmagic identifies file types by checking their headers according to a predefined list of file types. This functionality is exposed to the command line by the Unix command file.

Usage

>>> import magic
>>> magic.from_file("testdata/test.pdf")
'PDF document, version 1.2'
# recommend using at least the first 2048 bytes, as less can produce incorrect identification
>>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
'PDF document, version 1.2'
>>> magic.from_file("testdata/test.pdf", mime=True)
'application/pdf'
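
The 2048-byte recommendation above can be wrapped in a small helper. The following is a minimal sketch, not part of python-magic: the names read_head and mime_from_head are hypothetical, and mime_from_head assumes that python-magic and libmagic are installed. The file is opened in binary mode so that from_buffer receives raw bytes.

```python
HEAD_SIZE = 2048  # the usage notes suggest at least the first 2048 bytes


def read_head(path, size=HEAD_SIZE):
    """Return up to `size` raw bytes from the start of a file (hypothetical helper)."""
    with open(path, "rb") as fh:
        return fh.read(size)


def mime_from_head(path, size=HEAD_SIZE):
    """Identify a file's MIME type from its first `size` bytes (hypothetical helper)."""
    # Imported here so that read_head stays usable without libmagic installed.
    import magic

    return magic.from_buffer(read_head(path, size), mime=True)
```

Calling mime_from_head("testdata/test.pdf") should then behave like the from_file call with mime=True shown above, while reading only the head of the file.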

There is also a Magic class that provides more direct control, including overriding the magic database file and turning on character encoding detection. This is not recommended for general use. In particular, a Magic instance is not safe to share across multiple threads and will raise an exception if this is attempted.

>>> f = magic.Magic(uncompress=True)
>>> f.from_file('testdata/test.gz')
'ASCII text (gzip compressed data, was "test", last modified: Sat Jun 28
21:32:52 2008, from Unix)'

You can also combine the flag options:

>>> f = magic.Magic(mime=True, uncompress=True)
>>> f.from_file('testdata/test.gz')
'text/plain'
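
Because a Magic instance must not be shared between threads, one possible approach is to give each thread its own instance. The following is a minimal sketch using threading.local from the standard library; get_detector and default_factory are hypothetical names, not part of python-magic, and default_factory assumes python-magic and libmagic are installed.

```python
import threading

# Each thread sees its own attributes on this object.
_thread_state = threading.local()


def default_factory():
    """Build a Magic instance; assumes python-magic and libmagic are installed."""
    import magic  # imported lazily so this module loads without libmagic

    return magic.Magic(mime=True)


def get_detector(factory=default_factory):
    """Return the calling thread's detector, creating it on first use."""
    detector = getattr(_thread_state, "detector", None)
    if detector is None:
        detector = factory()
        _thread_state.detector = detector
    return detector
```

Each worker thread would then call get_detector().from_file(path) instead of using a module-level Magic object.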

Installation

The current stable version of python-magic is available on PyPI and can be installed by running pip install python-magic.

This module is a simple wrapper around the libmagic C library, which must be installed as well:

Debian/Ubuntu

$ sudo apt-get install libmagic1

Windows

You'll need DLLs for libmagic. @julian-r has uploaded a version of this project that includes binaries to PyPI: https://pypi.python.org/pypi/python-magic-bin/0.4.14

In the past, the libraries have also been available from File for Windows. You will need to copy the file magic out of [binary-zip]\share\misc, and pass its location to Magic(magic_file=...).

If you are using a 64-bit build of Python, you'll need 64-bit libmagic binaries, which can be found here: https://github.com/pidydx/libmagicwin64. A newer version can be found here: https://github.com/nscaife/file-windows.
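
If it is unclear whether the running interpreter is a 32-bit or 64-bit build, the pointer size reported by the standard struct module gives a quick answer. This is a small sketch; python_bits is a hypothetical helper name.

```python
import struct


def python_bits():
    """Return the pointer width of the running interpreter in bits."""
    # struct.calcsize("P") is the size in bytes of a C pointer for this build.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("This Python build is %d-bit" % python_bits())
```

The libmagic DLL needs to match the reported width.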

OSX

  • When using Homebrew: brew install libmagic
  • When using MacPorts: port install file

Troubleshooting

  • 'MagicException: could not find any magic files!': some installations of libmagic do not correctly point to their magic database file. Try specifying the path to the file explicitly in the constructor: magic.Magic(magic_file="path_to_magic_file").

  • 'WindowsError: [Error 193] %1 is not a valid Win32 application': Attempting to run the 32-bit libmagic DLL in a 64-bit build of Python fails with this error. 64-bit builds of libmagic for Windows are available here: https://github.com/pidydx/libmagicwin64

  • 'WindowsError: exception: access violation writing 0x00000000': This may indicate that you are mixing Windows Python and Cygwin Python. Make sure your libmagic and Python builds are consistent.
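
For the "could not find any magic files" case, one way to pass an explicit path is to probe a few likely locations first. The following is a sketch under stated assumptions: the paths in CANDIDATE_PATHS are guesses that depend on how libmagic was installed, and find_magic_file and make_detector are hypothetical helpers. Only magic.Magic(magic_file=...) comes from python-magic itself.

```python
import os

# Hypothetical locations; adjust these for your own installation.
CANDIDATE_PATHS = [
    "/usr/share/misc/magic.mgc",
    "/usr/share/file/magic.mgc",
    "/usr/local/share/misc/magic.mgc",
    "/opt/homebrew/share/misc/magic.mgc",
]


def find_magic_file(candidates=CANDIDATE_PATHS):
    """Return the first candidate path that exists as a file, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def make_detector(candidates=CANDIDATE_PATHS):
    """Build a Magic instance with an explicit database path."""
    import magic  # assumes python-magic and libmagic are installed

    path = find_magic_file(candidates)
    if path is None:
        raise FileNotFoundError("no magic database found in the candidate paths")
    return magic.Magic(magic_file=path)
```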

Bug Reports

python-magic is a thin layer over the libmagic C library. Historically, most bugs reported against python-magic have actually been bugs in libmagic; libmagic bugs can be reported on their tracker here: https://bugs.astron.com/my_view_page.php. If you're not sure where the bug lies, feel free to file an issue on GitHub and I can triage it.

Running the tests

To run the tests across 3 recent Ubuntu LTS releases (depends on Docker):

$ ./test_docker.sh

To run tests locally across all available python versions:

$ ./test/run.py

To run against a specific python version:

$ LC_ALL=en_US.UTF-8 python3 test/test.py

Versioning

Minor version bumps should be backwards compatible. Major bumps are not.

Name Conflict

There are, sadly, two libraries which use the module name magic. Both have been around for quite a while. If you are using this module and get an error using a method like open, your code is expecting the other one. Hopefully one day these will be reconciled.
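
To find out which of the two libraries an import of magic actually resolved to, one option is to check for the module-level functions used in the Usage section. This is a heuristic sketch rather than an official check; looks_like_python_magic and require_python_magic are hypothetical names.

```python
def looks_like_python_magic(module):
    """Heuristic: python-magic exposes from_file and from_buffer at module level."""
    names = ("from_file", "from_buffer")
    return all(callable(getattr(module, name, None)) for name in names)


def require_python_magic():
    """Import magic and raise ImportError if it looks like the other library."""
    import magic  # may resolve to either library, depending on what is installed

    if not looks_like_python_magic(magic):
        raise ImportError(
            "the installed 'magic' module does not look like python-magic: "
            + repr(getattr(magic, "__file__", None))
        )
    return magic
```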

Author

Written by Adam Hupp in 2001 for a project that never got off the ground. It originally used SWIG for the C library bindings, but switched to ctypes once that became part of the Python standard library.

You can contact me via my website or GitHub.

Contributors

Thanks to these folks on GitHub who submitted features and bug fixes.

License

python-magic is distributed under the MIT license. See the included LICENSE file for details.

I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook).
