Remove advertisements from subtitle files

Subscleaner

Subscleaner is a Python script that removes advertisements from subtitle files. It's designed to help you enjoy your favorite shows and movies without the distraction of unwanted ads in the subtitles.

Features

  • Removes a predefined list of advertisement patterns from subtitle files.
  • Parses SubRip (.srt) subtitle files through the pysrt library.
  • Automatically detects the encoding of subtitle files using chardet.
  • Available as a Docker image for easy deployment and usage.
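
As a rough illustration of the pattern-based removal described above, the sketch below filters a list of subtitle cue texts against a few regular expressions. The patterns and the function names `is_advertisement` and `remove_ads` are hypothetical and chosen only for this example; they are not Subscleaner's actual pattern list or API (the real patterns can be listed with --list-patterns).

```python
import re

# Hypothetical patterns for illustration only; NOT Subscleaner's real list.
EXAMPLE_PATTERNS = [
    r"subtitles? by\b",
    r"\bopensubtitles\b",
    r"www\.[a-z0-9-]+\.(com|org|net)",
]

# Compile once so every cue is checked against ready-made regex objects.
COMPILED = [re.compile(p, re.IGNORECASE) for p in EXAMPLE_PATTERNS]


def is_advertisement(text: str) -> bool:
    """Return True if any example pattern matches the cue text."""
    return any(p.search(text) for p in COMPILED)


def remove_ads(cues: list[str]) -> list[str]:
    """Keep only the cues that match none of the example patterns."""
    return [cue for cue in cues if not is_advertisement(cue)]
```

Calling `remove_ads` on a list of cue strings returns a new list without the matching entries, leaving the input untouched.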

Installation

Automatic installation

To install with pip:

sudo pip install subscleaner

Manual installation

To install Subscleaner, you'll need Python 3.9 or higher. It's recommended to use Poetry for managing the project dependencies.

  1. Clone the repository:
git clone https://gitlab.com/rogs/subscleaner.git
  2. Navigate to the project directory:
cd subscleaner
  3. Install the dependencies with Poetry:
poetry install

Docker

Subscleaner is also available as a Docker image. You can pull the image from Docker Hub:

docker pull rogsme/subscleaner

Usage

If you installed the package automatically, you can pipe a list of subtitle filenames into the script:

find /your/media/location -name "*.srt" | subscleaner

If you installed the package manually:

find /your/media/location -name "*.srt" | poetry run subscleaner

Alternatively, you can use the script directly if you've installed the dependencies globally:

find /your/media/location -name "*.srt" | python3 subscleaner.py
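
All three commands feed the script one filename per line on standard input. A minimal sketch of how a script might consume such a list is shown below; the helper name `iter_paths` is hypothetical and this is not necessarily how Subscleaner reads its input.

```python
from typing import Iterable, Iterator


def iter_paths(stream: Iterable[str]) -> Iterator[str]:
    """Yield one path per non-empty line, dropping only the trailing newline."""
    for line in stream:
        path = line.rstrip("\n")
        if path:  # skip blank lines
            yield path
```

In a real script the stream would typically be `sys.stdin`, for example `for path in iter_paths(sys.stdin): ...`. Spaces inside a filename are preserved because only the newline is stripped.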

Docker

To use the Docker image, you can run the container with the following command:

docker run -e CRON="0 0 * * *" -v /your/media/location:/files -v /etc/localtime:/etc/localtime:ro rogsme/subscleaner
  • Replace 0 0 * * * with your desired cron schedule for running the script.
  • Replace /your/media/location with the path to your media directory containing the subtitle files.

The Docker container will run the Subscleaner script according to the specified cron schedule and process the subtitle files in the mounted media directory.

Database Persistence in Docker

By default, the Docker container uses an internal database that will be lost when the container is removed. To maintain a persistent database across container restarts, you should mount a volume for the database:

docker run -e CRON="0 0 * * *" \
  -v /your/media/location:/files \
  -v /path/for/database:/data \
  -v /etc/localtime:/etc/localtime:ro \
  rogsme/subscleaner

If you are using YAMS

YAMS is a highly opinionated media server. You can learn more about it here: https://yams.media/

Add this container to your docker-compose.custom.yaml:

  subscleaner:
    image: rogsme/subscleaner
    environment:
      - CRON=0 0 * * *
    volumes:
      - ${MEDIA_DIRECTORY}:/files
      - ./subscleaner-data:/data
      - /etc/localtime:/etc/localtime:ro

This ensures that the database is preserved between container restarts, preventing unnecessary reprocessing of subtitle files.

To get more information on how to add your own containers in YAMS: https://yams.media/advanced/add-your-own-containers/

Contributing

Contributions are welcome! If you have any suggestions or improvements, feel free to fork the repository and submit a pull request.

License

Subscleaner is licensed under the GNU General Public License v3.0 or later. See the LICENSE file for more details.

Acknowledgments

This repository is a rewrite of this GitHub repository: https://github.com/FraMecca/subscleaner.

Thanks to FraMecca on GitHub!

Database and Caching

Subscleaner now uses a SQLite database to track processed files, which significantly improves performance by avoiding redundant processing of unchanged subtitle files.

How it works

  1. When Subscleaner processes a subtitle file, it generates an MD5 hash of the file content.
  2. This hash is stored in a SQLite database along with the file path.
  3. On subsequent runs, Subscleaner checks if the file has already been processed by comparing the current hash with the stored hash.
  4. If the file hasn't changed, it's skipped, saving processing time.
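
The steps above can be sketched with the standard library alone. The table name `processed`, its columns, and the function names below are assumptions made for this example; Subscleaner's actual schema and code may differ.

```python
import hashlib
import sqlite3


def file_hash(path: str) -> str:
    """Return the MD5 hex digest of the file's raw bytes."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def open_db(db_path: str) -> sqlite3.Connection:
    """Open the database and create the (hypothetical) tracking table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed ("
        "path TEXT PRIMARY KEY, hash TEXT NOT NULL)"
    )
    return conn


def needs_processing(conn: sqlite3.Connection, path: str) -> bool:
    """True if the path is unknown or its content hash has changed."""
    row = conn.execute(
        "SELECT hash FROM processed WHERE path = ?", (path,)
    ).fetchone()
    return row is None or row[0] != file_hash(path)


def mark_processed(conn: sqlite3.Connection, path: str) -> None:
    """Store (or overwrite) the current hash for this path."""
    conn.execute(
        "INSERT OR REPLACE INTO processed (path, hash) VALUES (?, ?)",
        (path, file_hash(path)),
    )
    conn.commit()
```

A caller would check `needs_processing` before cleaning a file and call `mark_processed` afterwards, so that an unchanged file is skipped on the next run.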

Database Location

The SQLite database is stored in the following locations, depending on your operating system:

  • Linux: ~/.local/share/subscleaner/subscleaner.db
  • macOS: ~/Library/Application Support/subscleaner/subscleaner.db
  • Windows: C:\Users\<username>\AppData\Local\subscleaner\subscleaner\subscleaner.db
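
For scripts that need to locate the database, the listed defaults can be reproduced with a small helper. This is only a sketch that mirrors the paths above: the function name `default_db_path` is hypothetical, the system names follow what `platform.system()` returns, and Subscleaner itself may resolve the location differently (for example by honoring environment variables).

```python
from pathlib import PurePosixPath, PureWindowsPath


def default_db_path(system: str, home: str) -> str:
    """Build the documented default database path for a given OS and home dir.

    `system` is expected to be a platform.system() value:
    "Linux", "Darwin" (macOS), or "Windows".
    """
    if system == "Linux":
        base = PurePosixPath(home) / ".local" / "share" / "subscleaner"
    elif system == "Darwin":
        base = PurePosixPath(home) / "Library" / "Application Support" / "subscleaner"
    elif system == "Windows":
        # The documented Windows path repeats the application name.
        base = PureWindowsPath(home) / "AppData" / "Local" / "subscleaner" / "subscleaner"
    else:
        raise ValueError(f"unsupported system: {system}")
    return str(base / "subscleaner.db")
```

On the current machine this could be called as `default_db_path(platform.system(), str(pathlib.Path.home()))`.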

Command Line Options

Several command line options are available:

  • --db-location: Specify a custom location for the database file
  • --force: Process all files regardless of whether they've been processed before
  • --reset-db: Reset the database (remove all stored file hashes)
  • --list-patterns: List all advertisement patterns being used
  • --version: Show version information and exit
  • -v, --verbose: Increase output verbosity (show analyzing/skipping messages)

Example usage:

find /your/media/location -name "*.srt" | subscleaner --force
find /your/media/location -name "*.srt" | subscleaner --db-location /path/to/custom/database.db
find /your/media/location -name "*.srt" | subscleaner --verbose

This feature makes Subscleaner more efficient, especially when running regularly via cron jobs or other scheduled tasks, as it will only process new or modified subtitle files.
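
For readers curious how options like these are commonly declared, here is a minimal argparse sketch covering the flags listed above. It illustrates the interface only: the function name `build_parser` and the version string are placeholders, and this is not Subscleaner's actual source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags on a hypothetical parser."""
    parser = argparse.ArgumentParser(prog="subscleaner")
    parser.add_argument("--db-location",
                        help="custom location for the database file")
    parser.add_argument("--force", action="store_true",
                        help="process all files, even previously processed ones")
    parser.add_argument("--reset-db", action="store_true",
                        help="remove all stored file hashes")
    parser.add_argument("--list-patterns", action="store_true",
                        help="list all advertisement patterns being used")
    # Placeholder version string; the real tool reports its own version.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="show analyzing/skipping messages")
    return parser
```

Parsing `["--force", "--db-location", "/path/to/custom/database.db"]` with this parser yields a namespace whose `force` attribute is true and whose `db_location` holds the given path.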
