Skip to main content

GuessIt - a library for guessing information from video files.

Project description


.. image::
:alt: Latest Version

.. image::
:alt: License

.. image::
:alt: Build Status

.. image::
:alt: Coveralls

GuessIt is a python library that extracts as much information as
possible from a video file.

It has a very powerful filename matcher that allows to guess a lot of
metadata from a video using its filename only. This matcher works with
both movies and tv shows episodes.

For example, GuessIt can do the following::

$ guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi"
For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
GuessIt found: {
[1.00] "mimetype": "video/x-msvideo",
[0.80] "episodeNumber": 3,
[0.80] "videoCodec": "XviD",
[1.00] "container": "avi",
[1.00] "format": "HDTV",
[0.70] "series": "Treme",
[0.50] "title": "Right Place, Wrong Time",
[0.80] "releaseGroup": "NoTV",
[0.80] "season": 1,
[1.00] "type": "episode"


Installing GuessIt is simple with `pip <>`_::

$ pip install guessit

or, with `easy_install <>`_::

$ easy_install guessit

But, you really `shouldn't do that <>`_.

You can now launch a demo::

$ guessit -d

and guess your own filename::

$ guessit "Breaking.Bad.S05E08.720p.MP4.BDRip.[KoTuWa].mkv"
For: Breaking.Bad.S05E08.720p.MP4.BDRip.[KoTuWa].mkv
GuessIt found: {
[1.00] "mimetype": "video/x-matroska",
[1.00] "episodeNumber": 8,
[0.30] "container": "mkv",
[1.00] "format": "BluRay",
[0.70] "series": "Breaking Bad",
[1.00] "releaseGroup": "KoTuWa",
[1.00] "screenSize": "720p",
[1.00] "season": 5,
[1.00] "type": "episode"

Filename matcher

The filename matcher is based on regular expressions and tree splitting to guess values from input filename.

It is able to find many properties, like ``title``, ``year``, ``series``, ``episodeNumber``, ``seasonNumber``,
``videoCodec``, ``screenSize``, ``language``. Guessed values are cleaned up and given in a readable format
which may not match the raw filename.

``DVDSCR`` will be guessed as ``format`` = ``DVD`` + ``other`` = ``Screener``.

``1920x1080`` will be guessed as ``screenSize`` = ``1080p``.

``DD5.1`` will be guessed as ``audioCodec`` = ``DolbyDigital`` + ``audioChannel`` = ``5.1``.

Here's the exhaustive list of properties that guessit can find:

Main properties

- **type**

Type of the file.

- ``unknown``, ``movie``, ``episode``, ``moviesubtitle``, ``episodesubtitle``

- **title**

Title of movie or episode.

- **container**

Container of the file.

- ``3g2``, ``wmv``, ``webm``, ``mp4``, ``avi``, ``mp4a``, ``mpeg``, ``sub``, ``mka``, ``m4v``, ``ts``, ``mkv``, ``ra``, ``rm``, ``wma``, ``ass``, ``mpg``, ``ram``, ``3gp``, ``ogv``, ``mov``, ``ogm``, ``asf``, ``divx``, ``ogg``, ``ssa``, ``qt``, ``idx``, ``nfo``, ``wav``, ``flv``, ``3gp2``, ``iso``, ``mk2``, ``srt``

- **date**

Date found in filename.

- **year**

Year of movie (or episode).

- **releaseGroup**

Name of (non)scene group that released the file.

- **website**

Name of website contained in the filename.

Episode properties

- **series**

Name of series.

- **season**

Season number.

- **episodeNumber**

Episode number.

- **episodeList**

List of episode numbers if several were found.

- note: If several are found, ``episodeNumber`` is the first item of this list.

- **seasonList**

List of season numbers if several were found.

- note: If several are found, ``seasonNumber`` is the first item of this list.

- **episodeCount**

Total number of episodes.

- **seasonCount**

Total number of seasons.

- **episodeDetails**

Some details about the episode.

- ``Bonus`` ``Oav`` ``Ova`` ``Omake`` ``Extras`` ``Unaired`` ``Special`` ``Pilot``

- **episodeFormat**

Episode format of the series.

- ``Minisode``

- **part**

Part number of the episode.

- **version**

Version of the episode.

- In anime fansub scene, new versions are released with tag ``<episode>v[0-9]``.

Video properties

- **format**

Format of the initial source

- ``HDTV`` ``WEB-DL`` ``TV`` ``VOD`` ``BluRay`` ``DVD`` ``WEBRip`` ``Workprint`` ``Telecine`` ``VHS`` ``DVB`` ``Telesync`` ``HD-DVD`` ``PPV`` ``Cam``

- **screenSize**

Resolution of video.
- ``720p`` ``1080p`` ``1080i`` ``<width>x<height>`` ``4K`` ``360p`` ``368p`` ``480p`` ``576p`` ``900p``

- **videoCodec**
Codec used for video.

- ``h264`` ``h265`` ``DivX`` ``XviD`` ``Real`` ``Mpeg2``

- **videoProfile**
Codec profile used for video.

- ``8bit`` ``10bit`` ``HP`` ``BP`` ``MP`` ``XP`` ``Hi422P`` ``Hi444PP``

- **videoApi**
API used for the video.

- ``DXVA``

Audio properties

- **audioChannels**

Number of channels for audio.

- ``1.0`` ``2.0`` ``5.1`` ``7.1``

- **audioCodec**
Codec used for audio.

- ``DTS`` ``TrueHD`` ``DolbyDigital`` ``AAC`` ``AC3`` ``MP3`` ``Flac``

- **audioProfile**
The codec profile used for audio.

- ``LC`` ``HQ`` ``HD`` ``HE`` ``HDMA``

Localization properties

- **Country**

Country(ies) of content. Often found in series, ``Shameless (US)`` for instance.

- ``[<babelfish.Country>]`` (This class equals name and iso code)

- **Language**

Language(s) of the audio soundtrack.

- ``[<babelfish.Language>]`` (This class equals name and iso code)

- **subtitleLanguage**

Language(s) of the subtitles.

- ``[<babelfish.Language>]`` (This class equals name and iso code)

Other properties

- **bonusNumber**

Bonus number.

- **bonusTitle**

Bonus title.

- **cdNumber**

CD number.

- **cdNumberTotal**

Total number of CD.

- **crc32**

CRC32 of the file.

- **idNumber**

Volume identifier (UUID).

- **edition**

Edition of the movie.

- ``Special Edition``, ``Collector Edition``, ``Director's cut``, ``Criterion Edition``, ``Deluxe Edition``

- **filmNumber**

Film number of this movie.

- **filmSeries**

Film series of this movie.

- **other**

Various other information.

- ``Fansub``, ``HR``, ``HQ``, ``Netflix``, ``Screener``, ``Preair``, ``Unrated``, ``HD``, ``mHD``, ``HDLight``, ``3D``,
``SyncFix``, ``Bonus``, ``WideScreen``, ``Fastsub``, ``R5``, ``AudioFix``, ``DDC``, ``Trailer``, ``Complete``,
``Limited``, ``Classic``, ``Proper``, ``DualAudio``, ``LiNE``, ``Remux``, ``PAL``, ``SECAM``, ``NTSC``

Other features

GuessIt also allows you to compute a whole lof of hashes from a file,
namely all the ones you can find in the hashlib python module (md5,
sha1, ...), but also the Media Player Classic hash that is used (amongst
others) by OpenSubtitles and SMPlayer, as well as the ed2k hash.

If you have the 'guess-language' python package installed, GuessIt can also
analyze a subtitle file's contents and detect which language it is written in.

If you have the 'enzyme' python package installed, GuessIt can also detect the
properties from the actual video file metadata.


guessit can be use from command line::

$ guessit
usage: guessit [-h] [-t TYPE] [-n] [-c] [-X DISABLED_TRANSFORMERS] [-v]
[-P SHOW_PROPERTY] [-u] [-a] [-y] [-f INPUT_FILE] [-d] [-p]
[-V] [-s] [--version] [-b] [-i INFO] [-S EXPECTED_SERIES]
[filename [filename ...]]

positional arguments:
filename Filename or release name to guess

optional arguments:
-h, --help show this help message and exit

-t TYPE, --type TYPE The suggested file type: movie, episode. If undefined,
type will be guessed.
-n, --name-only Parse files as name only. Disable folder parsing,
extension parsing, and file content analysis.
-c, --split-camel Split camel case part of filename.
Transformer to disable (can be used multiple time)
Expected series to parse (can be used multiple times)
Expected title (can be used multiple times)
-Y, --date-year-first
If short date is found, consider the first digits as
the year.
-D, --date-day-first If short date is found, consider the second digits as
the day.
Allowed language (can be used multiple times)
-E, --episode-prefer-number
Guess "serie.213.avi" as the episodeNumber 213.
Without this option, it will be guessed as season 2,
episodeNumber 13
Allowed country (can be used multiple times)
Expected release group (can be used multiple times)

-v, --verbose Display debug output
Display the value of a single property (title, series,
videoCodec, year, type ...)
-u, --unidentified Display the unidentified parts.
-a, --advanced Display advanced information for filename guesses, as
json output
-y, --yaml Display information for filename guesses as yaml
output (like unit-test)
-f INPUT_FILE, --input-file INPUT_FILE
Read filenames from an input file.
-d, --demo Run a few builtin tests instead of analyzing a file

-p, --properties Display properties that can be guessed.
-V, --values Display property values that can be guessed.
-s, --transformers Display transformers that can be used.
--version Display the guessit version.
-b, --bug Submit a wrong detection to the service

Other features:
-i INFO, --info INFO The desired information type: filename, video,
hash_mpc or a hash from python's hashlib module, such
as hash_md5, hash_sha1, ...; or a list of any of them,

It can also be used as a python module::

>>> from guessit import guess_file_info
>>> guess_file_info('Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi')
{u'mimetype': 'video/x-msvideo', u'episodeNumber': 3, u'videoCodec': u'XviD', u'container': u'avi', u'format': u'HDTV', u'series': u'Treme', u'title': u'Right Place, Wrong Time', u'releaseGroup': u'NoTV', u'season': 1, u'type': u'episode'}


The project website for GuessIt is hosted at `ReadTheDocs <>`_.
There you will also find the User guide and Developer documentation.

This project is hosted on GitHub: `<>`_

Please report issues and/or feature requests via the `bug tracker <>`_.

You can also report issues using the command-line tool::

$ guessit --bug "filename.that.fails.avi"


GuessIt is under active development, and contributions are more than welcome!

#. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.
There is a Contributor Friendly tag for issues that should be ideal for people who are not very
familiar with the codebase yet.
#. Fork `the repository`_ on Github to start making your changes to the **master**
branch (or branch off of it).
#. Write a test which shows that the bug was fixed or that the feature works as expected.
#. Send a pull request and bug the maintainer until it gets merged and published. :)

.. _the repository:


GuessIt is licensed under the `LGPLv3 license <>`_.


0.10.3 (2015-04-04)
* Fix issues related to unicode encoding/decoding
* Fix possible crashes in guess_video_rexps
* Fix invalid guess result when crc32 contains 6 digits than can be parsed as a date

0.10.2 (2015-03-08)
* Use common words to resolve conflicts on strings
* Bump babelfish version
* Fix setuptools deprecation warning
* Package argparse dependency only if python<2.7

0.10.1 (2015-01-05)
* Avoid word Stay to be recognized as AY subtitle
* Fixed exception when no unidentified leaves remains
* Avoid usage of deprecated EntryPoint.load() require argument
* Fixed invalid raw data for some properties (title, series and maybe others)

0.10.0 (2014-12-27)
* Fixed exception when serie title starts with Ep
* Fixed exception when trying to parse a full length country name
* Removed deprecated optparse module, replaced by argparse

0.9.4 (2014-11-10)
* Fixed exception when filename contains multiple languages ISO codes
* Fixed transformers initialization logging
* Fixed possible exception in language transformer
* Added more words to common english words

0.9.3 (2014-09-14)
* Added ``Preair`` and ``Remux`` to ``other`` property
* Better detection of ``audioProfile`` = ``HD`` / ``HDMA`` for ``audioCodec`` = ``DTS``
* Better detection of ``format``` = ``BluRay`` (when followed by Rip)
* Recognize ``RC`` as ``R5``
* Recognize ``WEB-HD```and ``ẀEB`` as ``WEB-DL``

0.9.2 (2014-09-13)

* Added support of option registration on transformers
* Better detection of ``releaseGroup`` when using ``expected-series`` or ``expected-title`` option
* Better ``audioChannel`` = ``5.1`` / ``7.1`` guessing (``6ch``, ``8ch``)
* Fixed usage not showing when invalid options were passed
* Added ``PAL``, ``SECAM`` and ``NTSC`` to ``other`` possible values
* Recognize DVD-9 and DVD-5 as ``format`` = ``DVD`` property

0.9.1 (2014-09-06)

* Added ``--unidentified`` option to display unidentified parts of the filename
This option affects command line only - From API `unidentified` properties will
always be grabbed regardless this settings
* Better guessing of ``releaseGroup`` property
* Added ``mHD`` and ``HDLight`` to ``other properties``
* Better guessing of ``format`` = ``DVD`` property (DVD-R pattern)
* Some ``info`` logs changed to ``debug`` for quiet integration
* Small fixes

0.9.0 (2014-09-05)

* Better auto-detection of anime episodes, containing a ``crc32`` or a digits ``episodeNumber``.
* Better listing of options on ``guessit -h``
* Added ``--allowed-countries`` and ``--allowed-languages`` to avoid two or three
letters words to be guessed as ``country`` or ``language``
* Added ``--disabled-transformers`` option to disable transformer plugin at runtime.
* Added ``--episode-prefer-number`` option, for ``guess -t episode 'serie.123.avi'``
to return ``episodeNumber`` = ``123`` instead of ``season`` = ``1`` + ``episodeNumber`` = 23``
* Added ``--split-camel`` option (now disabled by default)
* Added ``episodeCount`` and ``seasonCount`` properties (x-of-n notation)
* Added ``--date-year-first``` and ``--date-day-first`` options
* Added ``--expected-title``, ``--expected-series`` and ``--expected-groups``
to help finding values when those properties are known
* Added ``10bit`` value to ``videoProfile``
* Added ``--show-property`` option to only show a single property
* Added ``--input-file`` option to parse a list of
* Added ``--version`` option
* Added ``ass```to subtitle extensions
* Added ``Fansub`` value for ``other`` property
* Added more date formats support with ``dateutil`` dependency
* Added customizable ``clean_function`` (API)
* Added ``default_options`` (API)
* Fixed ``--yaml`` option to support ``language`` and ``country``
* Fixed ``transformers.add_transformer()`` function (API)

0.8 (2014-07-06)

* New webservice that allows to use GuessIt just by sending a POST request to
the url
* Command-line util can now report bugs to the service
by specifying the ``-b`` or ``--bug`` flag
* GuessIt can now use the Enzyme python package to detect metadata out of the
actual video file metadata instead of the filename
* Finished transition to ``babelfish.Language`` and ``babelfish.Country``
* New property: ``duration`` which returns the duration of the video in seconds
This requires the Enzyme package to work
* New property: ``fileSize`` which returns the size of the file in bytes
* Renamed property ``special`` to ``episodeDetails``
* Added support for Python 3.4
* Optimization and bugfixes

0.7.1 (2014-03-03)

* New property "special": values can be trailer, pilot, unaired
* New options for the guessit cmdline util: ``-y``, ``--yaml`` outputs the
result in yaml format and ``-n``, ``--name-only`` analyzes the input as simple
text (instead of filename)
* Added properties formatters and validators
* Removed support for python 3.2
* A healthy amount of code cleanup/refactoring and fixes :)

0.7 (2014-01-29)

* New plugin API that allows to register custom patterns / transformers
* Uses Babelfish for language and country detection
* Added Quality API to rate file quality from guessed property values
* Better and more accurate overall detection
* Added roman and word numeral detection
* Added 'videoProfile' and 'audioProfile' property
* Moved boolean properties to 'other' property value ('is3D' became 'other' = '3D')
* Added more possible values for various properties.
* Added command line option to list available properties and values
* Fixes for Python3 support

0.6.2 (2013-11-08)

* Added support for nfo files
* GuessIt can now output advanced information as json ('-a' on the command line)
* Better language detection
* Added new property: 'is3D'

0.6.1 (2013-09-18)

* New property "idNumber" that tries to identify a hash value or a
serial number
* The usual bugfixes

0.6 (2013-07-16)

* Better packaging: unittests and doc included in source tarball
* Fixes everywhere: unicode, release group detection, language detection, ...
* A few speed optimizations

0.5.4 (2013-02-11)

* guessit can be installed as a system wide script (thanks @dplarson)
* Enhanced logging facilities
* Fixes for episode number and country detection

0.5.3 (2012-11-01)

* GuessIt can now optionally act as a wrapper around the 'guess-language' python
module, and thus provide detection of the natural language in which a body of
text is written

* Lots of fixes everywhere, mostly for properties and release group detection

0.5.2 (2012-10-02)

* Much improved auto-detection of filetype
* Fixed some issues with the detection of release groups

0.5.1 (2012-09-23)

* now detects 'country' property; also detect 'year' property for series
* more patterns and bugfixes

0.5 (2012-07-29)

* Python3 compatibility
* the usual assortment of bugfixes

0.4.2 (2012-05-19)

* added Language.tmdb language code property for TheMovieDB
* added ability to recognize list of episodes
* bugfixes for Language.__nonzero__ and episode regexps

0.4.1 (2012-05-12)

* bugfixes for unicode, paths on Windows, autodetection, and language issues

0.4 (2012-04-28)

* much improved language detection, now also detect language variants
* supports more video filetypes (thanks to Rob McMullen)

0.3.1 (2012-03-15)

* fixed package installation from PyPI
* better imports for the transformations (thanks Diaoul!)
* some small language fixes

0.3 (2012-03-12)

* fix to recognize 1080p format (thanks to Jonathan Lauwers)

0.3b2 (2012-03-02)

* fixed the package installation

0.3b1 (2012-03-01)

* refactored quite a bit, code is much cleaner now
* fixed quite a few tests
* re-vamped the documentation, wrote some more

0.2 (2011-05-27)

* new parser/matcher completely replaced the old one
* quite a few more unittests and fixes

0.2b1 (2011-05-20)

* brand new parser/matcher that is much more flexible and powerful
* lots of cleaning and a bunch of unittests

0.1 (2011-05-10)

* fixed a few minor issues & heuristics

0.1b2 (2011-03-12)

* Added PyPI trove classifiers
* fixed version number in

0.1b1 (2011-03-12)

* first pre-release version; imported from Smewt with a few enhancements already
in there.

Project details

Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for guessit, version 0.10.3
Filename, size File type Python version Upload date Hashes
Filename, size guessit-0.10.3.tar.gz (141.8 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring DigiCert DigiCert EV certificate Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page