
Subtitles extremely clean

Project Description

Project page:

CleanIt is a command-line tool, written in Python, that helps you keep your subtitles clean. You can specify rules that detect subtitle entries to remove or patterns to replace, using either simple text matching or regular expressions.



Clean subtitles:

$ cleanit --config my-config.yml
Collected 1 subtitles
Saving <Subtitle []>
Saved <Subtitle []>


To clean a subtitle at a specific path using a specific configuration from Python:

from cleanit.api import clean_subtitle, save_subtitle
from cleanit.config import Config
from cleanit.subtitle import Subtitle

subtitle = Subtitle('/subtitle/path')
config = Config.from_file('/config/path')
if clean_subtitle(subtitle, config.rules):
    # Body reconstructed from the otherwise unused save_subtitle import
    save_subtitle(subtitle)

YAML Configuration file

The YAML configuration file has two main sections: templates and groups.

  • Templates let you define common configuration snippets that several groups can reuse.
  • Groups are where you define your rules.
# Reference:
#   type: [text*, regex]
#   match: [contains*, exact, startswith, endswith]
#   flags: [ignorecase, dotall, multiline, locale, unicode, verbose]
#   whitelist: no*
#   rules:
#   - sometext
#   - (\b)(\d{1,2})x(\d{1,2})(\b): {replacement: \1S\2E\3\4, type: regex, match: contains, flags: [unicode], whitelist: no}

templates:
  common:
    type: text
    match: contains

groups:
  # Groups can have any name. Here, 'blacklist' holds all the rules that remove subtitle entries.
  blacklist:
    template: common
    rules:
      # Removes any subtitle entry that contains the word FooBar
      - FooBar

      # Removes any subtitle entry that contains the pattern S00E00
      # Example:
      #   My Series S01E02
      - \bs\d{2}\s?e\d{2}\b: {type: regex, flags: ignorecase}

      # Removes any subtitle entry that is exactly the word: 'Ah' or 'Oh' (with 1 or more h)
      # Example:
      #   Ohhh!
      - ((Ah+)|(Oh+))\W?: {type: regex, match: exact}

  # The group 'tidy' holds all the rules that replace certain patterns in your subtitles.
  tidy:
    template: common
    type: regex
    rules:
      # Description: Replace runs of whitespace with a single space
      # Example:
      #   Foo     bar.
      # to
      #   Foo bar.
      - \s{2,}: ' '

      # Description: Add space when starting phrase with '-'. It ignores tags, such as <i>, <b>
      # Example:
      #   <i>-Francine, what has happened?
      #   -What has happened? You tell me!</i>
      # to
      #   <i>- Francine, what has happened?
      #   - What has happened? You tell me!</i>
      - '(?:^(|(?:\<\w\>)))-([''"]?\w+)': { replacement: '\1- \2', flags: [multiline, unicode] }

* The default value if none is defined
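
To get a feel for what the example rules do, they can be tried outside CleanIt with Python's standard `re` module. This is only a sketch: `text_matches`, `regex_matches` and `tidy` are hypothetical helpers written for this illustration, and the mapping of the `match` modes onto `re` calls is an assumption, not CleanIt's actual implementation.

```python
import re


def text_matches(entry, needle, match="contains"):
    """Test a plain-text rule against one subtitle entry (hypothetical helper)."""
    if match == "contains":
        return needle in entry
    if match == "exact":
        return entry == needle
    if match == "startswith":
        return entry.startswith(needle)
    if match == "endswith":
        return entry.endswith(needle)
    raise ValueError("unknown match mode: %r" % match)


def regex_matches(entry, pattern, match="contains", flags=0):
    """Test a regex rule against one subtitle entry (assumed semantics)."""
    if match == "contains":
        return re.search(pattern, entry, flags) is not None
    if match == "exact":
        return re.fullmatch(pattern, entry, flags) is not None
    if match == "startswith":
        return re.match(pattern, entry, flags) is not None
    if match == "endswith":
        # Anchor the pattern at the very end of the entry
        return re.search("(?:%s)\\Z" % pattern, entry, flags) is not None
    raise ValueError("unknown match mode: %r" % match)


# The two 'tidy' patterns from the example configuration, as Python raw strings
EXTRA_SPACES = r"\s{2,}"
LEADING_DASH = r"""(?:^(|(?:\<\w\>)))-(['"]?\w+)"""


def tidy(entry):
    """Apply both replacement rules to one subtitle entry."""
    entry = re.sub(EXTRA_SPACES, " ", entry)
    return re.sub(LEADING_DASH, r"\1- \2", entry, flags=re.MULTILINE | re.UNICODE)


print(text_matches("Subtitles by FooBar", "FooBar"))
print(regex_matches("My Series S01E02", r"\bs\d{2}\s?e\d{2}\b", flags=re.IGNORECASE))
print(regex_matches("Ohhh!", r"((Ah+)|(Oh+))\W?", match="exact"))
print(tidy("<i>-Francine, what has happened?\n-What has happened? You tell me!</i>"))
```

The first three calls correspond to the three blacklist rules, and the last call reproduces the dash example from the 'tidy' group.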

If no configuration file is specified, CleanIt tries to load one from ~/.config/cleanit/config.yml.
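
When calling CleanIt from your own script, the same fallback can be mirrored with a few lines of standard-library Python. This is a sketch only: `resolve_config_path` and `DEFAULT_CONFIG` are hypothetical names introduced here and are not part of CleanIt's API.

```python
import os

# Default location named in the text above; '~' is expanded at call time
DEFAULT_CONFIG = os.path.join("~", ".config", "cleanit", "config.yml")


def resolve_config_path(explicit_path=None):
    """Return the explicit path if one is given, else the default location."""
    return os.path.expanduser(explicit_path or DEFAULT_CONFIG)


print(resolve_config_path("my-config.yml"))
print(resolve_config_path())
```

The resolved path could then be handed to `Config.from_file`, as in the API example earlier on this page.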



release date: 2016-02-28

  • Adding guess encoding back without python-magic dependency.


release date: 2016-02-27

  • Removing chardet and python-magic dependencies. Either encoding is specified or it should be guessed by pysrt.


release date: 2015-10-16

  • Initial release
