A set of UNIX-style tools for generating and translating subtitles using AI.
AI Subtitle Assistant
AI Subtitle Assistant is a command-line tool that uses AI technologies (Whisper and Large Language Models) to generate high-quality subtitles for video and audio files.
Features
- Multi-format Support: Handles various common video and audio file formats.
- High-precision Speech Recognition: Uses OpenAI's Whisper model for accurate audio transcription.
- Intelligent Translation and Correction: Leverages Large Language Models (LLMs) for translation, correcting errors based on context, and identifying proper nouns.
- Flexible LLM Configuration: Allows users to customize the API base URL, key, and model for their LLM provider.
- Model Selection: Choose from different LLM models for translation tasks.
- Concurrent Translation: Processes multiple translation requests concurrently for faster performance.
- Translation Validation: Verifies that the original text returned by the LLM matches the input text to prevent hallucinations.
- Improved Translation Quality: Adjusted importance weights to better balance accuracy and fluency (1:0.6).
- Context Limit Handling: Detects and warns about model context limits that may cause truncated outputs.
- Debug Mode: Prints intermediate JSON data for troubleshooting when the AI_SUBTITLE_DEBUG=1 environment variable is set.
- Standard Subtitle Output: Generates standard UTF-8 encoded SRT subtitle files.
- Bilingual Subtitles: Can generate bilingual subtitles for language learning.
- Embedded Subtitle Extraction: Can detect and extract existing subtitle tracks from video files.
- Internationalization: Supports multiple languages for the user interface (currently English and Chinese).
- User-friendly CLI: Provides a clear and easy-to-use command-line interface with subcommands.
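The translation validation feature listed above can be pictured as a comparison between the text that was sent and the text the model echoes back. The following is a minimal sketch, not the project's actual code: it assumes the LLM returns one item per segment, and the field names `original` and `translation` as well as the function `find_mismatches` are hypothetical.

```python
def normalize(text: str) -> str:
    """Collapse whitespace so harmless spacing changes do not count as mismatches."""
    return " ".join(text.split())


def find_mismatches(sent: list[str], returned: list[dict]) -> list[int]:
    """Return the indexes of segments that fail validation.

    `returned` is assumed to look like
    [{"original": ..., "translation": ...}, ...] (hypothetical shape).
    A segment fails if its item is missing, if the echoed original differs
    from the input, or if the translation is empty.
    """
    bad = []
    for i, source in enumerate(sent):
        item = returned[i] if i < len(returned) else None
        if (
            item is None
            or normalize(item.get("original", "")) != normalize(source)
            or not item.get("translation", "").strip()
        ):
            bad.append(i)
    return bad
```

Segments reported by such a check could be retried rather than written to the output file.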
Installation
Recommended: Install from PyPI
pip install ai-subtitle
Or install from source:
git clone https://github.com/shdancer/ai-subtitle
cd ai-subtitle
python3 -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
pip install .
ffmpeg is required for audio processing.
- On macOS (using Homebrew):
brew install ffmpeg
- On Debian/Ubuntu:
sudo apt update && sudo apt install ffmpeg
- On Windows (using Chocolatey):
choco install ffmpeg
Usage
After installation, you can use the ai-subtitle command.
ai-subtitle <command> [options]
Global Options
--language {en,zh}: Sets the display language for the tool. Defaults to your system's language.
Commands
transcribe
Transcribes an audio/video file to an SRT file.
Usage:
ai-subtitle transcribe <input_file> [options]
Arguments:
input_file: Path to the input video or audio file.
Options:
-o, --output: Path to the output SRT file. If not specified, prints to standard output.
-m, --model: The Whisper model to use (e.g., tiny, base, small, medium, large). Default is base.
--force-transcribe: Force transcription even if embedded subtitles are found.
Example:
# Transcribe a video and save to a file
ai-subtitle transcribe my_video.mp4 -o my_video.srt
# If the file already contains an embedded subtitle track, it is extracted instead of transcribed
ai-subtitle transcribe my_movie.mkv -o movie_subs.srt
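Choosing between extraction and transcription comes down to whether the input file carries a subtitle stream. The sketch below shows one way to check this with ffprobe, which is installed alongside ffmpeg. It is an assumption that the tool works this way; the function names `subtitle_streams` and `probe` are hypothetical.

```python
import json
import subprocess


def subtitle_streams(probe_output: str) -> list[dict]:
    """Pick the subtitle streams out of ffprobe's JSON report."""
    streams = json.loads(probe_output).get("streams", [])
    return [
        {
            "index": s["index"],
            "codec": s.get("codec_name"),
            "language": s.get("tags", {}).get("language"),
        }
        for s in streams
        if s.get("codec_type") == "subtitle"
    ]


def probe(path: str) -> str:
    """Run ffprobe on a media file and return its JSON output as text."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

If `subtitle_streams(probe("my_movie.mkv"))` returns an empty list, there is nothing to extract and transcription is the only option.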
translate
Translates an existing SRT file into a bilingual SRT file.
Usage:
ai-subtitle translate [input_file] [options]
Arguments:
input_file: Path to the input SRT file. Reads from standard input if not provided.
Options:
-o, --output: Path to the output bilingual SRT file. Prints to standard output if not specified.
-t, --target-language: The target language for translation (e.g., "Chinese", "English"). Default is "Chinese".
--model: Select the model to use for translation (e.g., "gpt-3.5-turbo", "gpt-4"). Default is "gpt-3.5-turbo".
--max-workers: Maximum number of concurrent translation requests. Default is 5.
--list-models: List available models from the API and exit.
--api-base-url: Custom base URL for the LLM provider.
--api-key: Custom API key for the LLM provider.
Example:
# Transcribe and then translate
ai-subtitle transcribe my_video.mp4 | ai-subtitle translate -t "Japanese" -o bilingual.srt
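The --max-workers option caps how many translation requests run at once. A minimal sketch of that idea using Python's standard thread pool follows. The project's real implementation is not shown in this document, so `translate_all` and the `translate_chunk` callable (a stand-in for the LLM request) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def translate_all(
    chunks: list[str],
    translate_chunk: Callable[[str], str],
    max_workers: int = 5,
) -> list[str]:
    """Translate chunks concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of `chunks`,
        # regardless of which request finishes first, so subtitles
        # stay aligned with their timings.
        return list(pool.map(translate_chunk, chunks))
```

Threads suit this workload because each request spends most of its time waiting on the network rather than using the CPU.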
config
Manages configuration settings for the AI Subtitle Assistant.
Usage:
ai-subtitle config [options]
Options:
--show-path: Show the configuration file path.
--create: Create or update configuration interactively.
Example:
# Show the configuration file path
ai-subtitle config --show-path
# Create or update configuration
ai-subtitle config --create
How It Works
- Audio Extraction/Transcription: For the transcribe command, it either extracts existing subtitles or uses ffmpeg to extract audio and whisper to transcribe it into timed text segments.
- Chunking & Translation: For the translate command, it reads an SRT file, chunks the text to fit the LLM's context window, and sends it for translation.
- LLM Processing: The text is sent to the configured LLM for translation and refinement. The process includes retries and a progress bar.
- SRT Generation: The final processed text is formatted into a standard .srt file, either as a simple transcription or a bilingual subtitle.
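The chunking and SRT generation steps above can be sketched in a few functions. This is an illustration under stated assumptions, not the project's code: the names `Cue`, `parse_srt`, `chunk_cues`, and `to_bilingual_srt` are hypothetical, a character budget stands in for the model's token limit, and placing the translation below the original line is an assumed layout.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    timing: str  # e.g. "00:00:01,000 --> 00:00:03,500"
    text: str    # may span several lines


def parse_srt(srt: str) -> list[Cue]:
    """Split SRT text into cues, keeping multi-line text intact."""
    cues = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        if len(lines) >= 3:
            cues.append(Cue(int(lines[0]), lines[1], "\n".join(lines[2:])))
    return cues


def chunk_cues(cues: list[Cue], max_chars: int) -> list[list[Cue]]:
    """Group consecutive cues so each group's text stays within max_chars.

    A cue longer than the budget still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for cue in cues:
        if current and size + len(cue.text) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(cue)
        size += len(cue.text)
    if current:
        chunks.append(current)
    return chunks


def to_bilingual_srt(cues: list[Cue], translations: list[str]) -> str:
    """Write each cue with the original text followed by its translation."""
    blocks = [
        f"{c.index}\n{c.timing}\n{c.text}\n{t}"
        for c, t in zip(cues, translations)
    ]
    return "\n\n".join(blocks) + "\n"
```

Keeping the index and timing lines out of the text sent to the model means only the subtitle text can be altered by translation, while the timings are copied through unchanged.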
Changelog
v0.1.4
- Added: Model selection feature for translation with --model option
- Added: List available models with --list-models option
- Added: Debug mode for troubleshooting translation issues (enabled via AI_SUBTITLE_DEBUG=1 environment variable)
- Improved: Translation prompt to better handle different language structures and prevent subtitle misalignment
- Added: Graceful exit handling for keyboard interrupts (Ctrl+C)
- Added: Validation to check for missing translations
- Added: Concurrent translation processing for improved performance
- Added: Translation validation to verify original text matches input text
- Improved: Translation quality by adjusting importance weights to better balance accuracy and fluency (1:0.6)
- Added: --max-workers option to control the number of concurrent translation requests
- Added: Context limit handling to detect and warn about truncated outputs
v0.1.2
- Fixed: SRT multi-line content parsing bug; all lines are now preserved and correctly translated
- Improved: Documentation and README structure
- Updated: PyPI installation instructions and project metadata
v0.1.1
- Fixed: Debug output interfering with standard output in pipeline operations
- Improved: Error handling and messaging
- Updated: Documentation with installation instructions from PyPI
v0.1.0
- Initial release with core functionality
File details
Details for the file ai_subtitle-0.1.4.tar.gz.
File metadata
- Download URL: ai_subtitle-0.1.4.tar.gz
- Upload date:
- Size: 24.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d5d4b74785be2e6312278eb74b8c60d58debd692ca4f2514a3992bf0c408aa7f |
| MD5 | ea476a07c3a48925fb1d89d70dcd810b |
| BLAKE2b-256 | 10d924e73726c3ed12428d7800d9b293047293e131cae691d9da2b7cf269aaa7 |
File details
Details for the file ai_subtitle-0.1.4-py3-none-any.whl.
File metadata
- Download URL: ai_subtitle-0.1.4-py3-none-any.whl
- Upload date:
- Size: 21.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 377675e1663b4fd74f203cf6700f2eb05d6ae25ae99cebafcd503e035494c15a |
| MD5 | 2cf0fdffaac2be3155ba36cc77f54d02 |
| BLAKE2b-256 | 8dc7cea6aea6709c0a1546e6ea877be5e801af0fa4ae4d74d3cf997d02c6131c |