Remove advertisements from subtitle files

Subscleaner

Subscleaner is a Python script that removes advertisements from subtitle files. It's designed to help you enjoy your favorite shows and movies without the distraction of unwanted ads in the subtitles.

Features

  • Removes a predefined list of advertisement patterns from subtitle files.
  • Supports various subtitle formats through the pysrt library.
  • Automatically detects the encoding of subtitle files using chardet.
  • Available as a Docker image for easy deployment and usage.
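The pattern-based removal can be pictured with a minimal sketch. The two patterns and the function names below are hypothetical, chosen only for illustration; the list Subscleaner actually uses can be printed with --list-patterns. The sketch works on plain strings, one per subtitle entry, rather than on pysrt objects.

```python
import re

# Hypothetical patterns for illustration only; the real list ships with
# Subscleaner and can be shown with --list-patterns.
AD_PATTERNS = [
    re.compile(r"subtitles? by", re.IGNORECASE),
    re.compile(r"www\.[a-z0-9-]+\.(com|org|net)", re.IGNORECASE),
]


def is_advertisement(text: str) -> bool:
    """Return True if any pattern matches somewhere in the subtitle text."""
    return any(pattern.search(text) for pattern in AD_PATTERNS)


def remove_ads(entries: list[str]) -> list[str]:
    """Keep only the subtitle entries that match no advertisement pattern."""
    return [entry for entry in entries if not is_advertisement(entry)]
```

Dropping a whole entry when any pattern matches keeps the sketch simple; whether Subscleaner removes the whole entry or only the matching text is not stated here.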

Installation

Automatic installation

To install with pip:

sudo pip install subscleaner

Manual installation

To install Subscleaner, you'll need Python 3.9 or higher. It's recommended to use Poetry for managing the project dependencies.

  1. Clone the repository:
git clone https://gitlab.com/rogs/subscleaner.git
  2. Navigate to the project directory:
cd subscleaner
  3. Install the dependencies with Poetry:
poetry install

Docker

Subscleaner is also available as a Docker image. You can pull the image from Docker Hub:

docker pull rogsme/subscleaner

Usage

If you installed the package automatically, you can pipe a list of subtitle filenames into the script:

find /your/media/location -name "*.srt" | subscleaner

If you installed the package manually:

find /your/media/location -name "*.srt" | poetry run subscleaner

Alternatively, you can use the script directly if you've installed the dependencies globally:

find /your/media/location -name "*.srt" | python3 subscleaner.py
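All three invocations pass the file names on standard input, one per line. As a rough sketch of how a script can consume such a list (the helper name iter_subtitle_paths is an assumption, not part of Subscleaner):

```python
import sys
from pathlib import Path
from typing import Iterable, Iterator


def iter_subtitle_paths(lines: Iterable[str]) -> Iterator[Path]:
    """Yield one Path per non-empty input line, with surrounding whitespace removed."""
    for line in lines:
        name = line.strip()
        if name:
            yield Path(name)


# Typical use inside a script fed by `find ... | script`:
#     for path in iter_subtitle_paths(sys.stdin):
#         print(path)
```

Because the function accepts any iterable of strings, it can be exercised with a plain list as well as with sys.stdin.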

Docker

To use the Docker image, you can run the container with the following command:

docker run -e CRON="0 0 * * *" -v /your/media/location:/files -v /etc/localtime:/etc/localtime:ro rogsme/subscleaner
  • Replace 0 0 * * * with your desired cron schedule for running the script.
  • Replace /your/media/location with the path to your media directory containing the subtitle files.

The Docker container will run the Subscleaner script according to the specified cron schedule and process the subtitle files in the mounted media directory.

Database Persistence in Docker

By default, the Docker container keeps its database inside the container, so the database is lost when the container is removed. To keep the database when the container is recreated, mount a volume for it:

docker run -e CRON="0 0 * * *" \
  -v /your/media/location:/files \
  -v /path/for/database:/data \
  -v /etc/localtime:/etc/localtime:ro \
  rogsme/subscleaner

If you are using YAMS

YAMS is a highly opinionated media server. You can learn more about it here: https://yams.media/

Add this container to your docker-compose.custom.yaml:

  subscleaner:
    image: rogsme/subscleaner
    environment:
      - CRON=0 0 * * *
    volumes:
      - ${MEDIA_DIRECTORY}:/files
      - ./subscleaner-data:/data
      - /etc/localtime:/etc/localtime:ro

This ensures that the database is preserved between container restarts, preventing unnecessary reprocessing of subtitle files.

To get more information on how to add your own containers in YAMS: https://yams.media/advanced/add-your-own-containers/

Contributing

Contributions are welcome! If you have any suggestions or improvements, feel free to fork the repository and submit a pull request.

License

Subscleaner is licensed under the GNU General Public License v3.0 or later. See the LICENSE file for more details.

Acknowledgments

This repository is a rewrite of this GitHub repository: https://github.com/FraMecca/subscleaner.

Thanks to FraMecca on GitHub!

Database and Caching

Subscleaner now uses a SQLite database to track processed files, which significantly improves performance by avoiding redundant processing of unchanged subtitle files.

How it works

  1. When Subscleaner processes a subtitle file, it generates an MD5 hash of the file content.
  2. This hash is stored in a SQLite database along with the file path.
  3. On subsequent runs, Subscleaner checks if the file has already been processed by comparing the current hash with the stored hash.
  4. If the file hasn't changed, it's skipped, saving processing time.
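The steps above can be sketched with the standard library alone. The table name, column names, and function names are assumptions made for illustration; Subscleaner's actual schema may differ, and the sketch does not say whether the real tool hashes the file before or after cleaning.

```python
import hashlib
import sqlite3
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the MD5 hex digest of the file content."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def open_db(location: str) -> sqlite3.Connection:
    """Open the database and create the (hypothetical) table if it is missing."""
    conn = sqlite3.connect(location)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_files ("
        "path TEXT PRIMARY KEY, hash TEXT NOT NULL)"
    )
    return conn


def needs_processing(conn: sqlite3.Connection, path: Path) -> bool:
    """True if the file is unknown or its content hash differs from the stored one."""
    row = conn.execute(
        "SELECT hash FROM processed_files WHERE path = ?", (str(path),)
    ).fetchone()
    return row is None or row[0] != file_md5(path)


def mark_processed(conn: sqlite3.Connection, path: Path) -> None:
    """Store the current hash of the file, replacing any earlier entry."""
    conn.execute(
        "INSERT OR REPLACE INTO processed_files (path, hash) VALUES (?, ?)",
        (str(path), file_md5(path)),
    )
    conn.commit()
```

A caller would check needs_processing before cleaning a file and call mark_processed afterwards, so that an unchanged file is skipped on the next run.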

Database Location

The SQLite database is stored in the following locations, depending on your operating system:

  • Linux: ~/.local/share/subscleaner/subscleaner.db
  • macOS: ~/Library/Application Support/subscleaner/subscleaner.db
  • Windows: C:\Users\<username>\AppData\Local\subscleaner\subscleaner\subscleaner.db
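As a simplified illustration of how these defaults map to platforms, the sketch below rebuilds the three paths from the user's home directory. The function default_db_path is hypothetical, and the sketch deliberately ignores environment variables such as XDG_DATA_HOME or LOCALAPPDATA that a real implementation may honor.

```python
import sys
from pathlib import Path
from typing import Optional


def default_db_path(platform: str = sys.platform, home: Optional[Path] = None) -> Path:
    """Return the default database path listed above for the given platform."""
    home = home if home is not None else Path.home()
    if platform == "darwin":
        base = home / "Library" / "Application Support" / "subscleaner"
    elif platform.startswith("win"):
        # The directory name appears twice in the documented Windows path.
        base = home / "AppData" / "Local" / "subscleaner" / "subscleaner"
    else:
        # Treat everything else like Linux.
        base = home / ".local" / "share" / "subscleaner"
    return base / "subscleaner.db"
```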

Command Line Options

Several command line options are available:

  • --db-location: Specify a custom location for the database file
  • --force: Process all files regardless of whether they've been processed before
  • --reset-db: Reset the database (remove all stored file hashes)
  • --list-patterns: List all advertisement patterns being used
  • --version: Show version information and exit
  • -v, --verbose: Increase output verbosity (show analyzing/skipping messages)

Example usage:

find /your/media/location -name "*.srt" | subscleaner --force
find /your/media/location -name "*.srt" | subscleaner --db-location /path/to/custom/database.db
find /your/media/location -name "*.srt" | subscleaner --verbose

This feature makes Subscleaner more efficient, especially when running regularly via cron jobs or other scheduled tasks, as it will only process new or modified subtitle files.
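For readers who want to see how such options fit together, here is a hedged argparse sketch that mirrors the flags listed above. It is not Subscleaner's own parser: the help texts are copied from the list, while the function name, the destination names, the placeholder version string, and the choice to treat --verbose as a simple on/off switch are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="subscleaner",
        description="Remove advertisements from subtitle files",
    )
    parser.add_argument("--db-location", metavar="PATH",
                        help="Specify a custom location for the database file")
    parser.add_argument("--force", action="store_true",
                        help="Process all files regardless of earlier runs")
    parser.add_argument("--reset-db", action="store_true",
                        help="Reset the database (remove all stored file hashes)")
    parser.add_argument("--list-patterns", action="store_true",
                        help="List all advertisement patterns being used")
    # Placeholder version string; the real value comes from the package.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Increase output verbosity")
    return parser
```

Parsing ["--force", "--db-location", "/path/to/custom/database.db"] with this parser yields a namespace whose force attribute is True and whose db_location attribute holds the given path.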
