Clean up SRT subtitle files removing ads and misplaced credits

SRT Cleaner

A simple tool to clean up SRT subtitle files by removing ads and misplaced credits. It can also fix their encoding.

Usage

Library

import srtcleaner

srtfiles = ['/data/TVSeries/Cosmos/Cosmos.S01E01.srt',
            '/data/TVSeries/Cosmos/Cosmos.S01E02.srt']

# srtfiles is already a list of paths, so pass it directly rather than nesting it
srtcleaner.srtcleaner(srtfiles, in_place=True, backup=False, convert='UTF-8')

Command-line

$ srtcleaner --help

usage: srtcleaner [-h] [-q | -v] [--recursive] [--input-encoding ENCODING]
                  [--input-fallback-encoding FALLBACK_ENCODING]
                  [--convert OUTPUT_ENCODING] [--in-place] [--no-backup]
                  [--no-rebuild-index] [--blacklist BLACKLISTPATH]
                  srtpaths [srtpaths ...]

Clean subtitles deleting items that matches entries in blacklist file. Useful
to remove ads and misplaced credits

positional arguments:
  srtpaths              SRT file(s) or dir(s) to modify

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Suppress informative messages and summary statistics.
  -v, --verbose         Print additional information for each processed file.
  --recursive, -r       recurse inside directories.
  --input-encoding ENCODING, -e ENCODING
                        Encoding used in subtitles, if known. By default tries
                        to autodetect encoding.
  --input-fallback-encoding FALLBACK_ENCODING, -f FALLBACK_ENCODING
                        Fallback encoding to read subtitles if encoding
                        autodetection fails. [Default: windows-1252]
  --convert OUTPUT_ENCODING, -c OUTPUT_ENCODING
                        Convert subtitle encoding. By default output uses the
                        same encoding as the input.
  --in-place, -i        Overwrite original file instead of outputting to
                        standard output
  --no-backup, -B       When using --in-place, do not create a backup file.
  --no-rebuild-index, -I
                        Do not rebuild subtitles indexes after removing items.
                        Resulting SRT will not be strictly valid, although it
                        will work in most players. Useful when debugging for
                        comparing original and modified subtitles
  --blacklist BLACKLISTPATH, -b BLACKLISTPATH
                        Blacklist file path. [Default:
                        /home/user/.config/srtcleaner/srtcleaner.conf]

Copyright (C) 2021 Rodrigo Silva License: GPLv3 or later, at your choice. See
<http://www.gnu.org/licenses/gpl>

$ srtcleaner -v --in-place -B --convert 'UTF-8' '/data/series/Cosmos/Cosmos.S01E01.srt'
[DEBUG] Auto-detected encoding: 'iso-8859-1'
[INFO ] 20      00:00:45,653 --> 00:00:48,842   <b>UNITED       apresenta</b>
[INFO ] 21      00:00:49,270 --> 00:00:52,638   <b>Legenda:     rickSG | .:FGMsp:.</b>
[INFO ] 741     00:46:55,499 --> 00:46:58,557   UNITED  Quality is Everything!
[INFO ] 3 items deleted

$ srtcleaner -v --in-place -B --convert 'UTF-8' '/data/series/Cosmos/Cosmos.S01E01.srt'
[DEBUG] Auto-detected encoding: 'utf-8'
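
The help text above notes that subtitle indexes are rebuilt after items are removed, unless --no-rebuild-index is given. As a rough illustration of what rebuilding means, here is a minimal sketch. It is not SRT Cleaner's actual code: the function name rebuild_indexes is hypothetical, and it assumes LF line endings, items separated by a single blank line, and the index on the first line of each item.

```python
def rebuild_indexes(srt_text):
    """Renumber SRT items sequentially, starting at 1 (illustrative only)."""
    # Assumption: items are separated by exactly one blank line.
    blocks = [b for b in srt_text.strip().split("\n\n") if b.strip()]
    rebuilt = []
    for number, block in enumerate(blocks, start=1):
        lines = block.split("\n")
        lines[0] = str(number)  # first line of each item is its index
        rebuilt.append("\n".join(lines))
    return "\n\n".join(rebuilt) + "\n"


if __name__ == "__main__":
    # Item 2 was removed, leaving a gap between indexes 1 and 3.
    cleaned = (
        "1\n00:00:01,000 --> 00:00:02,000\nHello\n\n"
        "3\n00:00:05,000 --> 00:00:06,000\nWorld\n"
    )
    print(rebuild_indexes(cleaned))
```

Without this step the remaining items keep their original numbers, which is what makes the --no-rebuild-index output easier to compare against the original file.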

Configuring

SRT Cleaner removes every entry that matches any record in its blacklist file, located by default at ~/.config/srtcleaner/srtcleaner.conf. Create or edit it before using srtcleaner.

A record can span multiple lines, so use a blank line to separate records. Each record's text is matched against each SRT entry with a simple, case-insensitive substring test (text in entry): if the whole record text is found anywhere in an entry, the entire entry is removed from the SRT file. Escape sequences such as '\n' and '\t' are also interpreted, so you can use '\n' to include a newline at the end of the text to match.

Example of a basic srtcleaner.conf:

OpenSubtitles.org

facebook.com/

fb.com/

@gmail.com

Resync

Legendas:\n

UNITED4EVER

UNITED
Quality is everything

<b>UNITED
Qualidade É Tudo</b>

INSUBS\n

L.O.T.S\n
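
The matching rules described above can be sketched in a few lines of Python. This is only an approximation under stated assumptions, not the tool's real implementation: the names parse_blacklist and is_blacklisted are hypothetical, and only the '\n' and '\t' escape sequences are handled.

```python
import re


def parse_blacklist(text):
    """Split blacklist text into records separated by blank lines."""
    records = []
    for chunk in re.split(r"\n\s*\n", text.strip()):
        chunk = chunk.strip()
        if chunk:
            # Assumption: only the \n and \t escapes are interpreted here.
            records.append(chunk.replace("\\n", "\n").replace("\\t", "\t"))
    return records


def is_blacklisted(entry_text, records):
    """Return True if any record appears in the entry, ignoring case."""
    lowered = entry_text.lower()
    return any(record.lower() in lowered for record in records)


if __name__ == "__main__":
    conf = "OpenSubtitles.org\n\nLegendas:\\n\n\nUNITED\nQuality is everything\n"
    records = parse_blacklist(conf)
    for entry in ("UNITED\nQuality is Everything!",
                  "Legendas:\nrickSG",
                  "Legendas: rickSG",
                  "Hello, world"):
        print(repr(entry), is_blacklisted(entry, records))
```

In this sketch the record Legendas:\n matches only when "Legendas:" is followed by a line break, which is why a trailing '\n' is handy for avoiding false positives on ordinary dialogue.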

Requirements

Note: There are (at least) 3 Python modules named magic available on PyPI, all wrappers for libmagic, but with very distinct APIs:

SRT Cleaner supports any of the above libmagic wrappers, and on install pulls the one from the file/libmagic project. All of them require libmagic to be installed on your system. It usually ships in the file package and comes pre-installed on most GNU/Linux distributions and macOS.

On Windows, the following could be used to install libmagic:

Installing

From Git:

git clone https://github.com/MestreLion/srtcleaner
cd srtcleaner
pip install --user -e .

From PyPI:

pip install --user srtcleaner

Contributing

Patches are welcome! Fork, hack, request pull!

If you find a bug or have an enhancement request, please open a new issue.

Author

Rodrigo Silva (MestreLion) linux@rodrigosilva.com

License and Copyright

Copyright (C) 2021 Rodrigo Silva (MestreLion) <linux@rodrigosilva.com>.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
