sub-tools 🎬

Python 3.10+ | License: MIT

A robust Python toolkit for converting video/audio content into accurate, multilingual subtitles using WhisperX for transcription and Google's Gemini API for proofreading and translation.

✨ Features

  • 🎯 High-quality transcription using WhisperX with word-level alignment
  • 🔍 AI-powered proofreading with Gemini to fix transcription errors
  • 🌍 Multilingual translation support
  • 📥 Support for HLS streams, direct file URLs, and local files
  • 🎵 Audio fingerprinting using Shazam (macOS only)
  • 📊 Progress tracking with rich terminal output

🚀 Quick Start

Prerequisites

  • Python 3.10 or higher
  • FFmpeg installed on your system
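
As a quick way to confirm both prerequisites before installing, here is a minimal sketch. The helper name `missing_prerequisites` is hypothetical and not part of sub-tools; it only checks the interpreter version and whether an `ffmpeg` executable is on PATH.

```python
import shutil
import sys


def missing_prerequisites(min_python=(3, 10), tools=("ffmpeg",)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found {found}"
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty result means the listed prerequisites appear to be satisfied; it does not verify that the installed FFmpeg build supports every codec you may need.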

Installation

pip install sub-tools

Usage

export GEMINI_API_KEY={your_api_key}

# Full pipeline: download video, extract audio, transcribe, proofread, and translate
sub-tools -i https://example.com/video.mp4 --languages en es fr

# Using HLS stream URL
sub-tools -i https://example.com/hls/video.m3u8 --languages en es fr

# Using local audio file (skip video/audio tasks)
sub-tools --tasks transcribe translate --audio-file audio.mp3 --languages en es fr

# Only transcribe without translation
sub-tools --tasks transcribe --audio-file audio.mp3 --languages en

# Specify custom tasks (available: video, audio, signature, transcribe, translate)
sub-tools -i https://example.com/video.mp4 --tasks video audio transcribe translate --languages en es

# Specify a custom Gemini model (default: gemini-3-pro-preview)
sub-tools -i https://example.com/video.mp4 --languages en --model gemini-2.5-pro

# Specify output directory (default: output)
sub-tools -i https://example.com/video.mp4 --languages en --output my-subtitles
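
If you want to drive the CLI from a Python script, the commands above can be assembled programmatically. The following is a rough sketch, not an official API: `build_command` is a hypothetical helper that uses only the flags shown in the examples above (`-i`, `--tasks`, `--audio-file`, `--languages`, `--model`, `--output`).

```python
import shlex


def build_command(source=None, languages=("en",), tasks=None,
                  audio_file=None, model=None, output=None):
    """Assemble a sub-tools argument list from the flags shown in Usage."""
    cmd = ["sub-tools"]
    if source:
        cmd += ["-i", source]
    if tasks:
        cmd += ["--tasks", *tasks]
    if audio_file:
        cmd += ["--audio-file", audio_file]
    cmd += ["--languages", *languages]
    if model:
        cmd += ["--model", model]
    if output:
        cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "https://example.com/video.mp4",
        languages=["en", "es"],
        output="my-subtitles",
    )
    # Print the shell-quoted command instead of running it
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once `GEMINI_API_KEY` is set in the environment. Passing a list rather than a shell string avoids quoting problems with URLs and file names.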

Pipeline Tasks

The tool operates as a multi-stage pipeline controlled by the --tasks parameter:

  1. video: Downloads media from URL (HLS or direct) → video.mp4
  2. audio: Extracts audio track → audio.mp3
  3. signature: Generates Shazam signature for fingerprinting (macOS only)
  4. transcribe: Transcription using WhisperX → transcript.srt
  5. translate: Proofreads and translates to target languages using Gemini → {language}.srt

By default, all tasks run. Pass --tasks with one or more task names to run only a subset.
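
To illustrate which files a given task selection should leave behind, here is a small sketch based on the list above. The function `expected_outputs` is hypothetical, and it assumes the files are written directly under the output directory (default: output). The signature task is omitted because its output file name is not stated here.

```python
from pathlib import PurePosixPath

# Task names in pipeline order, as listed above
TASK_ORDER = ("video", "audio", "signature", "transcribe", "translate")


def expected_outputs(tasks=TASK_ORDER, languages=("en",), output_dir="output"):
    """List the files the selected tasks are documented to produce."""
    unknown = set(tasks) - set(TASK_ORDER)
    if unknown:
        raise ValueError(f"unknown tasks: {sorted(unknown)}")
    root = PurePosixPath(output_dir)  # assumption: files sit directly in here
    files = []
    for task in TASK_ORDER:  # follow pipeline order, not argument order
        if task not in tasks:
            continue
        if task == "video":
            files.append(root / "video.mp4")
        elif task == "audio":
            files.append(root / "audio.mp3")
        elif task == "transcribe":
            files.append(root / "transcript.srt")
        elif task == "translate":
            # one subtitle file per target language
            files.extend(root / f"{lang}.srt" for lang in languages)
    return [str(path) for path in files]


if __name__ == "__main__":
    for path in expected_outputs(["transcribe", "translate"], ["en", "es"]):
        print(path)
```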

Build Docker

docker build -t sub-tools .
# Replace GEMINI_API_KEY and URL with your own values
docker run -v "$(pwd)/output:/app/output" sub-tools sub-tools --gemini-api-key GEMINI_API_KEY -i URL -l en

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines.

Quick Development Setup

# Install uv package manager
# https://github.com/astral-sh/uv

# Clone and setup
git clone https://github.com/dohyeondk/sub-tools.git
cd sub-tools
uv sync

🧪 Testing

uv run pytest -m "not slow"

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.
