Subtitle processing toolkit: merge, sync, fix overlaps, and apply corrections

SubtitleKit - Subtitle Processing Toolkit

Requires Python 3.8+. License: MIT.

A Python library, command-line tool, and desktop application for subtitle processing, synchronization, and correction.

✨ Features

  • Merge & Sync: Combine subtitle files with automatic synchronization
  • Fix Overlaps: Detect and correct timing issues and overlaps
  • Apply Corrections: Apply text corrections from JSON files
  • Subtitle Optimizer: Automatic characters-per-second (CPS) optimization, line reduction, filler word removal, and AI-powered shortening (via Gemini API)
  • LLM Integration: Generate optimized JSON for translation workflows
  • Desktop App: Cross-platform GUI (Windows, macOS, Linux)
  • Colab Ready: Works seamlessly in Google Colab notebooks

🚀 Quick Start

Installation

pip install subtitlekit

Google Colab

# Install
!pip install subtitlekit

# Launch UI
from subtitlekit.ui import show_ui
show_ui(lang='en')  # or 'el' for Greek

As a Library

from subtitlekit import merge_subtitles, fix_overlaps, apply_corrections

# Merge subtitle files
merge_subtitles("original.srt", ["helper.srt"], "output.json")

# Fix timing overlaps
fix_overlaps("input.srt", "reference.srt", "fixed.srt")

# Apply corrections from JSON
apply_corrections("input.srt", "corrections.json", "output.srt")

CLI Usage

# Merge subtitles
subtitlekit merge --original original.srt --helper helper.srt --output output.json

# Fix overlaps
subtitlekit overlaps --input input.srt --reference ref.srt --output fixed.srt

# Apply corrections
subtitlekit corrections --input input.srt --corrections fixes.json --output corrected.srt

# Optimize subtitles
subtitlekit optimize input.srt --interjections --line-reduction --cps --llm

Desktop App

Download the standalone application from Releases.

Or launch it from the command line:

python -m subtitlekit.ui.desktop

📖 Documentation

Merge Subtitles

Combines the original subtitle file with one or more helper files (in different languages) to create JSON output optimized for LLM translation workflows.

subtitlekit merge \
  --original movie.srt \
  --helper helpful_en.srt \
  --helper helpful_pt.srt \
  --output for_translation.json \
  --skip-sync  # optional: skip ffsubsync

Output format:

{
  "id": 1,
  "t": "00:00:11,878 --> 00:00:16,130",
  "trans": "Original text to translate",
  "h1": "Helper text (language 1)",
  "h2": "Helper text (language 2)"
}
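
As a rough illustration of how this output might feed a translation prompt, the sketch below renders one entry as a compact text block. It assumes the output file holds a JSON array of entries shaped like the example above and that helper keys follow the `h1`, `h2`, ... pattern; `format_entry` is a hypothetical helper, not part of SubtitleKit.

```python
import json

def format_entry(entry):
    """Render one merged entry as a text block for a translation prompt.

    Hypothetical helper: assumes the keys shown in the example above
    ("id", "t", "trans", and numbered helper keys "h1", "h2", ...).
    """
    # Sort helper keys numerically so "h10" comes after "h2".
    helper_keys = sorted(
        (k for k in entry if k.startswith("h") and k[1:].isdigit()),
        key=lambda k: int(k[1:]),
    )
    lines = [f"[{entry['id']}] {entry['t']}", f"source: {entry['trans']}"]
    lines += [f"helper {k[1:]}: {entry[k]}" for k in helper_keys]
    return "\n".join(lines)

# Inline sample standing in for the contents of for_translation.json
# (assumed here to be a JSON array of entries).
sample = json.loads("""
[
  {
    "id": 1,
    "t": "00:00:11,878 --> 00:00:16,130",
    "trans": "Original text to translate",
    "h1": "Helper text (language 1)",
    "h2": "Helper text (language 2)"
  }
]
""")

for entry in sample:
    print(format_entry(entry))
```

To use it on a real file, replace the inline sample with `json.load(open("for_translation.json", encoding="utf-8"))`, provided the file really is a JSON array.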

Fix Overlaps

Detects and corrects timing issues:

  • Overlapping timestamps
  • Out-of-order entries
  • Unreasonable durations
  • Duplicate timings

subtitlekit overlaps \
  --input problematic.srt \
  --reference correct_timings.srt \
  --output fixed.srt \
  --window 5
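
To make the four issue types concrete, here is a minimal sketch of how they could be detected from SRT timing lines. It is not SubtitleKit's implementation: the function names are hypothetical, "unreasonable duration" is simplified to "end time not after start time", and the reference file and `--window` option are not modeled.

```python
import re

# SRT timestamps look like 00:00:11,878 (hours:minutes:seconds,milliseconds).
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")

def to_ms(stamp):
    """Convert one SRT timestamp to milliseconds."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    h, m, s, ms = map(int, match.groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms

def parse_range(line):
    """Parse 'start --> end' into a (start_ms, end_ms) tuple."""
    start, end = line.split("-->")
    return to_ms(start), to_ms(end)

def find_issues(ranges):
    """Return (index, kind) pairs, comparing each entry with the previous one."""
    issues = []
    for i, (start, end) in enumerate(ranges):
        if end <= start:
            # Simplified stand-in for "unreasonable duration".
            issues.append((i, "unreasonable duration"))
        if i == 0:
            continue
        prev_start, prev_end = ranges[i - 1]
        if (start, end) == (prev_start, prev_end):
            issues.append((i, "duplicate timing"))
        elif start < prev_start:
            issues.append((i, "out of order"))
        elif start < prev_end:
            issues.append((i, "overlap"))
    return issues

timings = [
    "00:00:01,000 --> 00:00:03,000",
    "00:00:02,500 --> 00:00:04,000",  # starts before the previous entry ends
    "00:00:02,500 --> 00:00:04,000",  # same timing as the previous entry
    "00:00:01,500 --> 00:00:02,000",  # starts before the previous entry starts
    "00:00:05,000 --> 00:00:05,000",  # zero-length entry
]
for index, kind in find_issues([parse_range(t) for t in timings]):
    print(index, kind)
```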

Apply Corrections

Applies text corrections from a JSON file:

subtitlekit corrections \
  --input subtitle.srt \
  --corrections fixes.json \
  --output corrected.srt

Subtitle Optimizer

Automated pipeline to improve subtitle readability and timing. Supports multiple languages (English, Greek, etc.) for interjection removal.

subtitlekit optimize movie.srt \
  --interjections \
  --line-reduction \
  --cps \
  --llm \
  --api-key YOUR_GEMINI_API_KEY

Optimization Phases:

  1. Interjection Removal: Removes filler words (e.g., "Uh,", "Well,").
  2. Line Reduction: Collapses 3+ line entries into 2 lines for better layout.
  3. CPS Optimization: Extends timing and merges short subtitles to reach a target characters-per-second (CPS) rate (default: 20.0).
  4. LLM Shortening: Uses Gemini AI to shorten entries that still exceed CPS limits.
  5. Translation Prep: Experimental --simplify flag to simplify text for easier translation.
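
The CPS target in phase 3 can be illustrated with a small calculation. The sketch below is an assumption-laden illustration, not the optimizer's actual logic: it counts every character except line breaks (the real counting rules are not documented here), and `cps` and `extend_to_target` are hypothetical names.

```python
import math

def visible_chars(text):
    """Count characters, ignoring line breaks (assumed counting rule)."""
    return len(text.replace("\n", ""))

def cps(text, start_ms, end_ms):
    """Characters per second for one subtitle entry."""
    if end_ms <= start_ms:
        raise ValueError("entry must have a positive duration")
    return visible_chars(text) * 1000 / (end_ms - start_ms)

def extend_to_target(text, start_ms, end_ms, next_start_ms, target_cps=20.0):
    """Return a later end time that brings the entry down to target_cps.

    The end time is never moved earlier and never pushed past the start
    of the next entry, so the target may not be reachable.
    """
    needed_end = start_ms + math.ceil(visible_chars(text) * 1000 / target_cps)
    return max(end_ms, min(needed_end, next_start_ms))

text = "Hello there, how are you doing today?"  # 37 characters
print(cps(text, 0, 1000))                       # rate before extension
new_end = extend_to_target(text, 0, 1000, next_start_ms=3000)
print(new_end, cps(text, 0, new_end))           # end time and rate after extension
```

When the next entry starts too soon for the extension to reach the target, the entry still exceeds the limit, which is the case the later LLM shortening phase is described as handling.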

🌍 I18n Support

Desktop and Colab UIs support:

  • 🇬🇧 English
  • 🇬🇷 Greek (Ελληνικά)

📦 Development

# Clone repository
git clone https://github.com/angelospk/subtitlekit.git
cd subtitlekit

# Install in development mode
pip install -e .

# Install pre-commit hooks (Mandatory for contributing)
pip install pre-commit
pre-commit install

# Run tests manually
pytest -v

🤝 Contributing

Contributions are welcome! Please ensure all tests pass before submitting a Pull Request. A pre-commit hook runs pytest automatically before every commit to catch breaking changes early.

📄 License

MIT License - see LICENSE file.

🙏 Credits

Built by angelospk
