GuessIt - a library for guessing information from video filenames.

GuessIt
=======

.. image:: http://img.shields.io/pypi/v/guessit.svg
    :target: https://pypi.python.org/pypi/guessit
    :alt: Latest Version

.. image:: http://img.shields.io/badge/license-LGPLv3-blue.svg
    :target: https://pypi.python.org/pypi/guessit
    :alt: LGPLv3 License

.. image:: http://img.shields.io/travis/guessit-io/guessit.svg
    :target: https://travis-ci.org/guessit-io/guessit
    :alt: Build Status

.. image:: http://img.shields.io/coveralls/guessit-io/guessit/master.svg
    :target: https://coveralls.io/github/guessit-io/guessit?branch=master
    :alt: Coveralls

.. image:: https://img.shields.io/badge/Hu-Board-7965cc.svg
    :target: https://huboard.com/guessit-io/guessit
    :alt: HuBoard


GuessIt is a Python library that extracts as much information as possible from a video filename.

Its matcher guesses the properties of a video from the filename alone.
It works with both movies and TV show episodes.

For example, GuessIt can do the following::

    $ guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi"
    For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
    GuessIt found: {
        "title": "Treme",
        "season": 1,
        "episode": 3,
        "episode_title": "Right Place, Wrong Time",
        "source": "HDTV",
        "video_codec": "Xvid",
        "release_group": "NoTV",
        "container": "avi",
        "mimetype": "video/x-msvideo",
        "type": "episode"
    }

More information is available at `ReadTheDocs <http://guessit.readthedocs.org/>`_.
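
The guessed properties are plain key/value pairs, so they are easy to consume from a script. The sketch below is only an illustration: it assumes the JSON emitted by the command line has the same shape as the example output above, and ``episode_filename`` is a hypothetical helper, not part of GuessIt.

```python
import json

# Sample copied from the example output above; a real run may contain other keys.
SAMPLE = """{
    "title": "Treme",
    "season": 1,
    "episode": 3,
    "episode_title": "Right Place, Wrong Time",
    "source": "HDTV",
    "video_codec": "Xvid",
    "release_group": "NoTV",
    "container": "avi",
    "mimetype": "video/x-msvideo",
    "type": "episode"
}"""


def episode_filename(guess: dict) -> str:
    """Build 'Title - SxxEyy - Episode Title.ext' from a guess (hypothetical helper)."""
    name = "{title} - S{season:02d}E{episode:02d}".format(**guess)
    if guess.get("episode_title"):
        name += " - " + guess["episode_title"]
    return name + "." + guess["container"]


guess = json.loads(SAMPLE)
print(episode_filename(guess))
```

The helper expects ``title``, ``season``, ``episode`` and ``container`` to be present; a guess for a movie would need different handling.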

Migration note
--------------
GuessIt 2 has been rewritten from scratch. GuessIt is now a release name parser only, and support for additional
features such as hash computation has been dropped.

To migrate from guessit ``0.x`` or ``1.x`` to ``guessit 2.x``, please read
`migration.rst <https://github.com/guessit-io/guessit/blob/master/docs/migration.rst>`_.

To migrate from guessit ``2.x`` to ``3.x``, please read `migration2to3.rst <https://github.com/guessit-io/guessit/blob/master/docs/migration2to3.rst>`_.

Install
-------

Installing GuessIt is simple with `pip <http://www.pip-installer.org/>`_::

    $ pip install guessit

You can also `install GuessIt from sources <https://github.com/guessit-io/guessit/blob/master/docs/sources.rst>`_.

Usage
-----

GuessIt can be used from the command line::

    $ guessit
    usage: guessit [-h] [-t TYPE] [-n] [-Y] [-D] [-L ALLOWED_LANGUAGES]
                   [-C ALLOWED_COUNTRIES] [-E] [-T EXPECTED_TITLE]
                   [-G EXPECTED_GROUP] [--includes INCLUDES]
                   [--excludes EXCLUDES] [-f INPUT_FILE] [-v]
                   [-P SHOW_PROPERTY] [-a] [-s] [-l] [-j] [-y] [-c CONFIG]
                   [--no-user-config] [--no-default-config] [-p] [-V]
                   [--version]
                   [filename [filename ...]]

    positional arguments:
      filename              Filename or release name to guess

    optional arguments:
      -h, --help            show this help message and exit

    Naming:
      -t TYPE, --type TYPE  The suggested file type: movie, episode. If undefined,
                            type will be guessed.
      -n, --name-only       Parse files as name only, considering "/" and "\" like
                            other separators.
      -Y, --date-year-first
                            If short date is found, consider the first digits as
                            the year.
      -D, --date-day-first  If short date is found, consider the second digits as
                            the day.
      -L ALLOWED_LANGUAGES, --allowed-languages ALLOWED_LANGUAGES
                            Allowed language (can be used multiple times)
      -C ALLOWED_COUNTRIES, --allowed-countries ALLOWED_COUNTRIES
                            Allowed country (can be used multiple times)
      -E, --episode-prefer-number
                            Guess "serie.213.avi" as the episode 213. Without this
                            option, it will be guessed as season 2, episode 13
      -T EXPECTED_TITLE, --expected-title EXPECTED_TITLE
                            Expected title to parse (can be used multiple times)
      -G EXPECTED_GROUP, --expected-group EXPECTED_GROUP
                            Expected release group (can be used multiple times)
      --includes INCLUDES   List of properties to be detected
      --excludes EXCLUDES   List of properties to be ignored

    Input:
      -f INPUT_FILE, --input-file INPUT_FILE
                            Read filenames from an input text file. File should
                            use UTF-8 charset.

    Output:
      -v, --verbose         Display debug output
      -P SHOW_PROPERTY, --show-property SHOW_PROPERTY
                            Display the value of a single property (title, series,
                            video_codec, year, ...)
      -a, --advanced        Display advanced information for filename guesses, as
                            json output
      -s, --single-value    Keep only first value found for each property
      -l, --enforce-list    Wrap each found value in a list even when property has
                            a single value
      -j, --json            Display information for filename guesses as json
                            output
      -y, --yaml            Display information for filename guesses as yaml
                            output

    Configuration:
      -c CONFIG, --config CONFIG
                            Filepath to configuration file. Configuration file
                            contains the same options as those from command line
                            options, but option names have "-" characters replaced
                            with "_". This configuration will be merged with
                            default and user configuration files.
      --no-user-config      Disable user configuration. If not defined, guessit
                            tries to read configuration files at
                            ~/.guessit/options.(json|yml|yaml) and
                            ~/.config/guessit/options.(json|yml|yaml)
      --no-default-config   Disable default configuration. This should be done
                            only if you are providing a full configuration through
                            user configuration or --config option. If no
                            "advanced_config" is provided by another configuration
                            file, it will still be loaded from default
                            configuration.

    Information:
      -p, --properties      Display properties that can be guessed.
      -V, --values          Display property values that can be guessed.
      --version             Display the guessit version.
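
As the help text for ``--config`` notes, a configuration file uses the command line option names with ``-`` replaced by ``_``. The following is a minimal sketch of producing such a file with the standard library only. The option values are made up for illustration, ``config_key`` is a hypothetical helper, and it is an assumption that repeatable options are written as lists and switches as booleans.

```python
import json
import tempfile
from pathlib import Path


def config_key(cli_option: str) -> str:
    """Turn a long option such as '--episode-prefer-number' into its config key."""
    return cli_option.lstrip("-").replace("-", "_")


# Hypothetical preferences, chosen only for illustration.
options = {
    config_key("--episode-prefer-number"): True,
    config_key("--allowed-languages"): ["en", "fr"],
    config_key("--expected-title"): ["OSS 117"],
}

# Written to a temporary directory here. A real user configuration would live at
# one of the paths listed in the help text, e.g. ~/.config/guessit/options.json.
config_path = Path(tempfile.mkdtemp()) / "options.json"
config_path.write_text(json.dumps(options, indent=2), encoding="utf-8")
print(config_path.read_text(encoding="utf-8"))
```

The resulting file could then be passed explicitly with ``guessit -c <path> "serie.213.avi"``.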


It can also be used as a Python module::

    >>> from guessit import guessit
    >>> guessit('Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi')
    MatchesDict([('title', 'Treme'), ('season', 1), ('episode', 3), ('episode_title', 'Right Place, Wrong Time'), ('source', 'HDTV'), ('video_codec', 'Xvid'), ('release_group', 'NoTV'), ('container', 'avi'), ('mimetype', 'video/x-msvideo'), ('type', 'episode')])

``MatchesDict`` is a dict that preserves the order of the matches.

Command line options can be passed as a dict or as a string in the second argument.
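
A short sketch of both forms, assuming ``guessit`` is installed and that dict keys follow the same ``-`` to ``_`` naming used in configuration files (``guess_both_ways`` is a hypothetical wrapper written for this example):

```python
def guess_both_ways(name):
    """Guess the same name with dict options and with string options."""
    # Imported lazily so this snippet can be loaded without guessit installed.
    from guessit import guessit

    with_dict = guessit(name, {"type": "episode"})    # options as a dict
    with_string = guessit(name, "--type episode")     # options as a CLI-style string
    return with_dict, with_string


if __name__ == "__main__":
    try:
        a, b = guess_both_ways("Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi")
    except ImportError:
        print("guessit is not installed; run `pip install guessit` first")
    else:
        print(a["type"], b["type"])
```

The dict form is usually more convenient from Python code, while the string form lets you reuse a command line verbatim.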

Docker
------

GuessIt is also available on `Docker Hub <https://hub.docker.com/r/guessit/guessit/>`_ as a Docker image::

    $ docker run -it guessit/guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi"
    For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
    GuessIt found: {
        "title": "Treme",
        "season": 1,
        "episode": 3,
        "episode_title": "Right Place, Wrong Time",
        "source": "HDTV",
        "video_codec": "Xvid",
        "release_group": "NoTV",
        "container": "avi",
        "mimetype": "video/x-msvideo",
        "type": "episode"
    }

Support
-------

This project is hosted on `GitHub <https://github.com/guessit-io/guessit>`_. Feel free to open an issue if you think you
have found a bug or something is missing in guessit.

GuessIt relies on the `Rebulk <https://github.com/Toilal/rebulk>`_ project for pattern and rules registration.

License
-------

GuessIt is licensed under the `LGPLv3 license <http://www.gnu.org/licenses/lgpl.html>`_.


History
=======

3.0.3 (2018-10-23)
------------------

- Add ``MP2`` ``audio_codec`` value.
- ``Proper`` and ``Fix`` have been separated in two distinct ``other`` values.
- Add ``1e18`` ``season``/``episode`` pattern.
- Fix false ``release_group`` matches with ``--expected-title`` option.
- Fix parent folder ending with a digit detected as title.
- Fix wrong season/year with ``--type episode`` option. Serie(s) keyword has been removed from default configuration.
- Fix missing property when ``episode_details`` pattern appears in title.


3.0.2 (2018-10-18)
------------------

- Ensure consistent behavior between CLI and Python module. It's now possible to override `advanced_config` at
runtime through options dict. Rebulk rules are lazily rebuilt when `advanced_config` is changed since previous
call.
- Refactored command line options and loading behavior related to configuration files (see `-c CONFIG`,
`--config CONFIG`, `--no-user-config`, `--no-default-config`)

3.0.1 (2018-10-17)
------------------

- Removed `Extras` and `Bonus` values from `episode_details` property as those tags may also appear in movies
- Add `Scalable Video Coding`, `Advanced Video Codec High Definition` and `High Efficiency Video Coding` values to
`video_profile`
- Add support for Python 3.7
- Add `mk3d` value to `container`
- Better title cleanup containing acronyms (like `Marvel's Agents of S.H.I.E.L.D`)
- Fix issue with ES audio_profile breaking titles
- Fix crash for files ending with `Rip`


3.0.0 (2018-05-22)
------------------

- Renamed `format` property to `source`.
- `source` property `Cam` is now `Camera` or `HD Camera`
- `source` property `Telesync` is now `Telesync` or `HD Telesync`
- `source` property `PPV` is now `Pay-per-view`
- `source` property `DVB` is now `Digital TV`
- `source` property `VOD` is now `Video on Demand`
- `source` property `WEBRip` is now `Web` with additional property `other`: `Rip`
- `source` property `WEB-DL` is now `Web`
- `source` property `AHDTV` is now `Analog HDTV`
- `source` property `UHDTV` is now `Ultra HDTV`
- `source` property `HDTC` is now `HD Telecine`
- `screen_size` property `360i` was added.
- `screen_size` property `480i` was added.
- `screen_size` property `576i` was added.
- `screen_size` property `900i` was added.
- `screen_size` property `1440p` was added.
- `screen_size` property `4K` is now `2160p`
- `screen_size` property `4320p` was added.
- `video_codec` property `h264` is now `H.264`
- `video_codec` property `h265` is now `H.265`
- `video_codec` property `Mpeg2` is now `MPEG-2`
- `video_codec` property `Real` is now `RealVideo`
- `video_codec` property `XviD` is now `Xvid`
- `video_profile` property `BP` is now `Baseline`.
- `video_profile` property `HP` is now `High`.
- `video_profile` property `XP` is now `Extended`.
- `video_profile` property `MP` is now `Main`.
- `video_profile` property `Hi422P` is now `High 4:2:2`.
- `video_profile` property `Hi444PP` is now `High 4:4:4 Predictive`.
- `video_profile` property `High 10` was added.
- `video_profile` property `8bit` was removed. `8bit` is detected as `color_depth`: `8-bit`
- `video_profile` property `10bit` was removed. `10bit` is detected as `color_depth`: `10-bit`
- `audio_codec` property `DTS-HD` was added.
- `audio_codec` property `AC3` is now `Dolby Digital`
- `audio_codec` property `EAC3` is now `Dolby Digital Plus`
- `audio_codec` property `TrueHD` is now `Dolby TrueHD`
- `audio_codec` property `DolbyAtmos` is now `Dolby Atmos`.
- `audio_profile` property `HE` is now `High Efficiency`.
- `audio_profile` property `LC` is now `Low Complexity`.
- `audio_profile` property `HQ` is now `High Quality`.
- `audio_profile` property `HDMA` is now `Master Audio`.
- `edition` property `Collector Edition` is now `Collector`
- `edition` property `Special Edition` is now `Special`
- `edition` property `Criterion Edition` is now `Criterion`
- `edition` property `Deluxe Edition` is now `Deluxe`
- `edition` property `Limited Edition` is now `Limited`
- `edition` property `Theatrical Edition` is now `Theatrical`
- `edition` property `Director's Definitive Cut` was added.
- `episode_details` property `Oav` and `Ova` were removed. They are now `other: Original Animated Video`
- `episode_details` property `Omake` is now `Extras`
- `episode_details` property `Final` was added.
- `other` property `Rip` was added.
- `other` property `DDC` was removed. `DDC` is now `edition`: `Director's Definitive Cut`
- `other` property `CC` was removed. `CC` is now `edition`: `Criterion`
- `other` property `FINAL` was removed. `FINAL` is now `episode_details`: `Final`
- `other` property `Original Animated Video` was added.
- `other` property `OV` is now `Original Video`
- `other` property `AudioFix` is now `Audio Fixed`
- `other` property `SyncFix` is now `Sync Fixed`
- `other` property `DualAudio` is now `Dual Audio`
- `other` property `Fansub` is now `Fan Subtitled`
- `other` property `Fastsub` is now `Fast Subtitled`
- `other` property `FullHD` is now `Full HD`
- `other` property `UltraHD` is now `Ultra HD`
- `other` property `mHD` and `HDLight` are now `Micro HD`
- `other` property `HQ` is now `High Quality`
- `other` property `HR` is now `High Resolution`
- `other` property `LD` is now `Line Dubbed`
- `other` property `MD` is now `Mic Dubbed`
- `other` property `Low Definition` was added.
- `other` property `LiNE` is now `Line Audio`
- `other` property `R5` is now `Region 5`
- `other` property `Region C` was added.
- `other` property `ReEncoded` is now `Reencoded`
- `other` property `WideScreen` is now `Widescreen`
- Added `Ultra HD Blu-ray` as new `source` possible value.
- Added `Standard Dynamic Range` as new `other` possible value.
- Added `HDR10` as new `other` possible value.
- Added `Dolby Vision` as new `other` possible value.
- Added `BT.2020` as new `other` possible value.
- Added `12-bit` as new `color_depth` possible value.
- Added `IMAX` as new `edition` possible value.
- Added `Upscaled` as new `other` possible value.
- Added `High Frame Rate` as new `other` possible value.
- Added `Ultimate` as new `edition` possible value.
- Added `Fan` as new `edition` possible value.
- Added `High Resolution Audio` as new `audio_profile` possible value.
- Added `Extended Surround` as new `audio_profile` possible value.
- Added `EX` as new `audio_profile` possible value
- Added `Opus` as new `audio_codec` possible value
- Added `aspect_ratio` as new property. Also used to validate if a screen_size is a standard resolution.
- Fixed unwanted language and country detection for exotic languages.
- Added default and configurable list of allowed languages and countries
- Added `VC-1` as new `video_codec` possible value
- Enhanced dash-separated `release_group` detection.
- Changed `size` output to return `guessit.Quantity` object.
- Changed `size` output to return `guessit.Size` object.
- Added `audio_video_rate` as new possible property.
- Added `video_video_rate` as new possible property.
- Added `frame_rate` as new possible property.
- Added `disc` as a new possible property.
- Added `H.263` as new `video_codec` possible value.
- Added `VP7` as new `video_codec` possible value.
- Added `VP8` as new `video_codec` possible value.
- Added `VP9` as new `video_codec` possible value.
- Added `Vorbis` as new `audio_codec` possible value.
- Added `PCM` as new `audio_codec` possible value.
- Added `LPCM` as new `audio_codec` possible value.
- Added `Digital Master` as new `source` possible value.
- Added several new values for `streaming_service`.
- Added new options `--includes` and `--excludes`.
- Added `Sample` as new `other` possible value.
- Added `Obfuscated` as new `other` possible value.
- Added `Proof` as new `other` possible value.
- Added `Repost` as new `other` possible value.
- Added advanced guessit configuration to config files.
- Add support for `pathlib.Path` objects on guessit API input.
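
For code that stored guesses produced by 2.x, the renames above can be applied mechanically. The sketch below is only an illustration under stated assumptions: the mapping is a partial transcription of the list above, ``migrate_guess`` is a hypothetical helper, and stored guesses are assumed to be plain dicts. Renames that split one value into two properties (such as ``WEBRip``) are not handled.

```python
# Partial mapping transcribed from the 3.0.0 notes above; extend it as needed.
VALUE_RENAMES = {
    "source": {
        "WEB-DL": "Web",
        "PPV": "Pay-per-view",
        "DVB": "Digital TV",
        "VOD": "Video on Demand",
        "HDTC": "HD Telecine",
    },
    "screen_size": {"4K": "2160p"},
    "video_codec": {"h264": "H.264", "h265": "H.265", "Mpeg2": "MPEG-2", "XviD": "Xvid"},
    "audio_codec": {
        "AC3": "Dolby Digital",
        "EAC3": "Dolby Digital Plus",
        "TrueHD": "Dolby TrueHD",
    },
}


def migrate_guess(old: dict) -> dict:
    """Convert a stored 2.x guess (assumed to be a plain dict) to 3.x naming."""
    new = {}
    for key, value in old.items():
        # The `format` property was renamed to `source`.
        key = "source" if key == "format" else key
        if isinstance(value, str):
            # Unknown values are kept unchanged.
            value = VALUE_RENAMES.get(key, {}).get(value, value)
        new[key] = value
    return new


print(migrate_guess({"format": "WEB-DL", "video_codec": "h264", "title": "Treme"}))
```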

2.1.4 (2017-06-01)
------------------

- Fix broken match function when using `rebulk>=0.9.0`.

2.1.3 (2017-05-31)
------------------

- Add `nzb` as new `container` possible value
- Add `EAC3` as new `audio_codec` possible value
- Add `FullHD` as new `other` possible value
- Added python 3.6 support
- Dropped python 2.6 support
- Make `container` values consistent and always lowercase
- Fix `--type movie` being ignored for movies that start with numbers
- Fix invalid `language` detection due to the common words `audio`, `true` and `unknown`
- Fix `episode` type detection when series name contains `year` followed by SEE pattern

2.1.2 (2017-04-03)
------------------

- Many fixes, additions and improvements (thanks to @ratoaq2).

2.1.1 (2016-12-04)
------------------

- Add `~` to episode/season separators.
- Add `AHDTV`, `HDTC`, `SATRip` as new `format` possible values.
- Add `streaming_service` property.
- Add `DDP` pattern as `audio_codec`=`DolbyDigital`.
- Add `LDTV` as possible tag for `other`=`LD`.
- Add `StripSeparators` Post Processor to strip separators from all matches.
- Fix invalid guess `1 x 2` with `--type episode`.
- Fix `part` property.
- Fix `cd_count` issue with `x264-CD`.
- Fix `HDD` group detected as `DolbyDigital`.
- Fix invalid comparator in `audio_codec` conflict solver.
- Fix validation of `film` property.
- Fix `date` followed by `screen_size` invalid guess.
- Fix `episode` not detected when smaller filepart repeats the `season` and uses `SSEE` pattern.
- Enhance `season`/`episode` conflict solver to keep most specific value.
- Enhance `video_profile` detection.
- Enhance `episode`/`season` range and sequence guessing.
- Enhance performance with rebulk upgrade to `0.8.2`.
- Enhance `season`/`episode`.
- Enhance `other`=`Complete` guessing.
- Enhance `release_group` guessing.
- Enhance command line options parsing related to unicode.
- Ensure roman numbers are surrounded with separators to be guessed as numbers.

2.1.0 (2016-09-08)
------------------

- Drop support for `regex` native module.
- Remove dependency constraint on `python-dateutil`.
- Enhance language/country guessing in edge cases.
- Enhance rule to guess `release_group` in more file templates.
- Fix edge cases for subtitle language detection.
- Fix invalid conflict solving in `season`/`episode` occurring between `SssEee` and `ssXee` pattern.
- Fix issue when running guessit in non-interactive shell with python 2.
- Guess Dolby keyword as DolbyDigital in `audio_codec`.
- Avoid `title` to be guessed as `website` (Dark.Net)
- Avoid `season`/`episode` to be guessed when pattern is included inside words.
- Enhance `screen_size` to detect `720pHD` and `1080pHD`
- Add support for `format` and `video_codec` when no separators between themselves. (HDTVx264, PDTVx264, ...)
- Add rebulk version in `--version` option.
- Upgrade rebulk to `0.7.3`.

2.0.5 (2016-04-10)
------------------

- Fix inconsistent properties returned by guessit -p.
- Add support for titles containing dots.
- Lock python-dateutil dependency to <2.5.2.

2.0.4 (2016-02-03)
------------------

- Add an Exception Report when an unexpected exception occurs.


2.0.3 (2016-01-30)
------------------

- Something went wrong with the 2.0.2 release ...


2.0.2 (2016-01-30)
------------------

- Fix possible issue with unicode characters encoding/decoding.
- Pypy is now supported.


2.0.1 (2016-01-28)
------------------

- Add support for any type of string with python 2 and python 3 (binary, str, unicode).


2.0.0 (2016-01-27)
------------------

- Final release.


2.0rc8 (2016-01-26)
-------------------

- Remove regex native module from required dependencies. It will now be used only if present.


2.0rc7 (2016-01-18)
-------------------

- Fix packaging issues on Python 2.7.


2.0rc6 (2016-01-18)
-------------------

- Fix packaging issues.


2.0rc5 (2016-01-18)
-------------------

- Guessit is now available as a docker container on Docker Hub (https://hub.docker.com/r/toilal/guessit).
- `country` 2-letter code is not added to `title` value anymore.
- All `container` values are now capitalized.
- `alternativeTitle` has been renamed to `alternative_title` and added to the docs.
- `mimetype` property is now in the docs.
- Add more excluded words for `language` property.
- Add more possible values for `other` property.
- Fix an issue occurring with `title` values starting with `Scr`.
- `film` property is now guessed only if less than `100` to avoid possible conflicts with `crc32`.


2.0rc4 (2015-12-03)
-------------------

- Add docs.
- Add exotic `screen_size` patterns support like `720hd` and `720p50`.
- Rename `audio_codec` value `true-HD` to `trueHD`.


2.0rc3 (2015-11-29)
-------------------

- Add ``__version__`` to main module.


2.0rc2 (2015-11-28)
-------------------

- Single digit episodes are now guessed for ``--type episode`` instead of ``--episode-prefer-number``.
- Fix separators that could cause some titles to be split with & and ;.
- Avoid possible ``NoneType`` error.


2.0rc1 (2015-11-27)
-------------------

- Fallback to default title guessing when ``expected-title`` is not found.


2.0b4 (2015-11-24)
------------------

- Add ``expected-group`` option.
- Add validation rule for single digit ``episode`` to avoid false positives.
- Add ``verbose`` option.
- Fix ``expected-title`` option.
- Better unicode support in ``expected-group``/``expected-title`` option.


2.0b3 (2015-11-15)
------------------

- Add support for ``part`` with no space before number.
- Avoid ``uuid`` and ``crc32`` collision with ``season``/``episode`` properties.
- Add better space support for ``season``/``episode`` properties.
- Ensure ``date`` property is found when conflicting with ``season``/``episode`` properties.
- Fix ``IndexError`` when input has a closing group character with no opening one before.
- Add ``--type`` option.
- Add rebulk implicit option support.

2.0b2 (2015-11-14)
------------------

- Add python 2.6 support.


2.0b1 (2015-11-11)
------------------

- Enhance title guessing.
- Upgrade rebulk to ``0.6.1``.
- Rename ``properCount`` to ``proper_count``
- Avoid crash when using ``-p``/``-V`` option with ``--yaml`` and ``yaml`` module is not available.

2.0a4 (2015-11-09)
------------------

- Add ``-p``/``-V`` options to display properties and values that can be guessed.


2.0a3 (2015-11-08)
------------------

- Allow rebulk customization in API module.

2.0a2 (2015-11-07)
------------------

- Raise TypeError instead of AssertionError when non text is given to guessit API.
- Fix packaging issues with previous release blocking installation.

2.0a1 (2015-11-07)
------------------

- Rewrite from scratch using Rebulk.
- Read MIGRATION.rst for migration guidelines.
