sub-tools 🎬
A robust Python toolkit for converting video/audio content into accurate, multilingual subtitles using WhisperX for transcription and Google's Gemini API for proofreading and translation.
✨ Features
- 🎯 High-quality transcription using WhisperX with word-level alignment
- 🔍 AI-powered proofreading with Gemini to fix transcription errors
- 🌍 Multilingual translation support
- 📥 Support for HLS streams, direct file URLs, and local files
- 🎵 Audio fingerprinting using Shazam (macOS only)
- 📊 Progress tracking with rich terminal output
🚀 Quick Start
Prerequisites
- Python 3.10 or higher
- FFmpeg installed on your system
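Both prerequisites can be checked from Python before installing. The following is a minimal sketch, not part of sub-tools: the helper name `check_prerequisites` is hypothetical, and "FFmpeg installed" is approximated as "an `ffmpeg` executable is on `PATH`".

```python
import shutil
import sys


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of problems; an empty list means both checks passed."""
    if version_info is None:
        version_info = sys.version_info
    problems = []
    # The README asks for Python 3.10 or higher.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or higher is required")
    # Assumption: an installed FFmpeg exposes an `ffmpeg` binary on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing the real environment.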
Installation
pip install sub-tools
Usage
export GEMINI_API_KEY={your_api_key}
# Full pipeline: download video, extract audio, transcribe, proofread, and translate
sub-tools -i https://example.com/video.mp4 --languages en es fr
# Using HLS stream URL
sub-tools -i https://example.com/hls/video.m3u8 --languages en es fr
# Using local audio file (skip video/audio tasks)
sub-tools --tasks transcribe translate --audio-file audio.mp3 --languages en es fr
# Only transcribe without translation
sub-tools --tasks transcribe --audio-file audio.mp3 --languages en
# Specify custom tasks (available: video, audio, signature, transcribe, translate)
sub-tools -i https://example.com/video.mp4 --tasks video audio transcribe translate --languages en es
# Specify a custom Gemini model (default: gemini-3-pro-preview)
sub-tools -i https://example.com/video.mp4 --languages en --model gemini-2.5-pro
# Specify output directory (default: output)
sub-tools -i https://example.com/video.mp4 --languages en --output my-subtitles
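The same invocations can be scripted from Python. This is a sketch under stated assumptions: `build_command` and `run` are hypothetical helpers, not part of sub-tools, and they only use the flags shown in the commands above (`-i`, `--tasks`, `--languages`, `--model`, `--output`).

```python
import os
import shlex
import subprocess


def build_command(source, languages, tasks=None, model=None, output=None):
    """Assemble a sub-tools command line from the flags shown above."""
    cmd = ["sub-tools", "-i", source]
    if tasks:
        cmd += ["--tasks", *tasks]
    cmd += ["--languages", *languages]
    if model:
        cmd += ["--model", model]
    if output:
        cmd += ["--output", output]
    return cmd


def run(cmd):
    """Run the command, failing early if the API key was not exported."""
    # The usage above exports GEMINI_API_KEY before calling the CLI.
    if "GEMINI_API_KEY" not in os.environ:
        raise RuntimeError("GEMINI_API_KEY is not set")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the command; call run(cmd) to actually execute it.
    cmd = build_command(
        "https://example.com/video.mp4", ["en", "es"], output="my-subtitles"
    )
    print(shlex.join(cmd))
```

Passing the command as a list avoids shell quoting problems with URLs that contain `&` or `?`.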
Pipeline Tasks
The tool operates as a multi-stage pipeline controlled by the --tasks parameter:
- video: Downloads media from URL (HLS or direct) → video.mp4
- audio: Extracts audio track → audio.mp3
- signature: Generates Shazam signature for fingerprinting (macOS only)
- transcribe: Transcription using WhisperX → transcript.srt
- translate: Proofreads and translates to target languages using Gemini → {language}.srt
By default, all tasks run. You can customize which tasks to run with --tasks.
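The task-to-file mapping above can be sketched as a small lookup, which is handy for checking that a run produced what was expected. The helper `expected_outputs` is hypothetical, and the sketch assumes every file is written directly into the output directory (`output` by default). The signature task is left out because the list does not name its output file.

```python
from pathlib import Path

# Output file per task, as listed above. "signature" is omitted because
# its output file name is not stated.
TASK_OUTPUTS = {
    "video": "video.mp4",
    "audio": "audio.mp3",
    "transcribe": "transcript.srt",
}


def expected_outputs(tasks, languages, output_dir="output"):
    """List the files the selected tasks should leave in output_dir."""
    base = Path(output_dir)
    files = [base / TASK_OUTPUTS[task] for task in tasks if task in TASK_OUTPUTS]
    # translate writes one {language}.srt per requested language.
    if "translate" in tasks:
        files += [base / f"{lang}.srt" for lang in languages]
    return files


if __name__ == "__main__":
    for path in expected_outputs(["transcribe", "translate"], ["en", "es"]):
        print(path, "found" if path.exists() else "missing")
```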
Build Docker
docker build -t sub-tools .
docker run -v $(pwd)/output:/app/output sub-tools sub-tools --gemini-api-key GEMINI_API_KEY -i URL -l en
🤝 Contributing
Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines.
Quick Development Setup
# Install uv package manager
# https://github.com/astral-sh/uv
# Clone and setup
git clone https://github.com/dohyeondk/sub-tools.git
cd sub-tools
uv sync
🧪 Testing
uv run pytest -m "not slow"
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file sub_tools-0.8.0.tar.gz.
File metadata
- Download URL: sub_tools-0.8.0.tar.gz
- Upload date:
- Size: 10.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 48ff406b2a884a28f703c51e1da089da6f482b84021950c98fbbc139e17daddd |
| MD5 | d88efd04971e6d71041fd8d27f545e7e |
| BLAKE2b-256 | 1d8c5080194c05b97e09e038d776de00b5b9adf6ba7d4f7e03543d346ab41818 |
File details
Details for the file sub_tools-0.8.0-py3-none-any.whl.
File metadata
- Download URL: sub_tools-0.8.0-py3-none-any.whl
- Upload date:
- Size: 15.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 952451081dd87e99714f25fa466a1032cb7e87e09accd0f89051974c4296511f |
| MD5 | c71a14f0879a6bc3479bfb322690e865 |
| BLAKE2b-256 | 1c35fb6513af9eeeea2a32d91fe86a14e2b75668b62a98367e36430e7c69ab55 |
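A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch: `sha256_of` and `matches` are hypothetical helper names, and the local file path is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat for large archives.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("sub_tools-0.8.0.tar.gz", "48ff406b2a884a28f703c51e1da089da6f482b84021950c98fbbc139e17daddd")` should return `True` for an intact copy of the source distribution.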