Automatic subtitle generation using Gemini AI
Simple Auto Subtitle (Coauthor: Claude)
An automatic subtitle generation tool that extracts audio from video files and generates subtitles using Google's Gemini AI models.
Features
- Audio Extraction: Automatically extracts audio from video files using ffmpeg
- AI Transcription: Uses Gemini 2.5 models (Flash or Pro) for accurate speech-to-text with precise timestamps
- Multiple Output Formats:
  - .srt subtitle files
  - Videos with embedded subtitles (toggleable)
  - Videos with burnt-in subtitles (permanent)
  - Extracted audio files (.wav)
- Model Selection: Choose between Gemini 2.5 Flash (faster) or Gemini 2.5 Pro (more accurate)
- Structured Output: Uses Pydantic models for reliable JSON parsing
- Environment Variables: Supports .env files for API key management
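To make the step from timestamped transcription to an .srt file concrete, here is a minimal sketch of formatting cues as SRT text. The `Segment` class and both function names are hypothetical and not taken from the package. The tool itself uses Pydantic models; a standard-library dataclass stands in here so the sketch has no dependencies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One subtitle cue; a hypothetical stand-in for the tool's Pydantic model."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp of the form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: List[Segment]) -> str:
    """Join cues into SRT text: index, time range, text, then a blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([Segment(0.0, 1.5, "Hello"), Segment(1.5, 3.0, "World")]))
```

SRT uses a comma before the milliseconds and numbers cues from 1, which is why the formatter does not rely on a generic time-formatting helper.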
Installation
From PyPI (Recommended)
pip install simple-auto-subtitle
From Source
git clone https://github.com/yunfanye/auto_subtitle.git
cd auto_subtitle
pip install -e .
Development Installation
git clone https://github.com/yunfanye/auto_subtitle.git
cd auto_subtitle
pip install -r requirements.txt
pip install -e .
Setup
- Install ffmpeg (required for audio/video processing):
  - macOS: brew install ffmpeg
  - Ubuntu: sudo apt install ffmpeg
  - Windows: Download from ffmpeg.org
- Set up Gemini API Key: Create a .env file in your working directory:
  GEMINI_API_KEY=your_api_key_here
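The sketch below shows one way an API key could be resolved from the sources this README mentions. It is an illustration, not the package's code: the function names are hypothetical, the lookup order (explicit value, then process environment, then .env file) is an assumption, and a small standard-library parser stands in for python-dotenv, which the tool actually depends on.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(cli_value: Optional[str] = None,
                    env_path: Path = Path(".env")) -> Optional[str]:
    """Return the first key found. The order used here is an assumption."""
    if cli_value:
        return cli_value
    from_environment = os.environ.get("GEMINI_API_KEY")
    if from_environment:
        return from_environment
    if env_path.is_file():
        file_values = parse_env_text(env_path.read_text(encoding="utf-8"))
        return file_values.get("GEMINI_API_KEY")
    return None


if __name__ == "__main__":
    key = resolve_api_key()
    print("API key found" if key else "No API key configured")
```

Keeping the key in a .env file or environment variable avoids passing it on the command line, where it can end up in shell history.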
Usage
Basic Usage
# Process default video (test.mp4) with Flash model
simple-auto-subtitle
# Process specific video file
simple-auto-subtitle my_video.mp4
# Use Pro model for better accuracy
simple-auto-subtitle my_video.mp4 --model pro
# Alternative command
auto-subtitle my_video.mp4
Command Line Options
simple-auto-subtitle [video_file] [options]
Arguments:
  video_file            Video file to process (default: test.mp4)
Options:
  --api-key API_KEY     Gemini API key (or set GEMINI_API_KEY env var)
  --output, -o OUTPUT   Output SRT file path (default: video_name.srt)
  --model {flash,pro}   Gemini model: flash (faster, default) or pro (more accurate)
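For readers who want to script around the tool, the documented interface can be mirrored with `argparse` roughly as follows. This is a reconstruction from the option list above, not the package's source. `build_parser` is a hypothetical name, and the `flash` default is inferred from the Basic Usage comment that the bare command runs with the Flash model.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(
        prog="simple-auto-subtitle",
        description="Generate subtitles for a video file.",
    )
    # The positional argument is optional and falls back to test.mp4.
    parser.add_argument("video_file", nargs="?", default="test.mp4",
                        help="Video file to process (default: test.mp4)")
    parser.add_argument("--api-key",
                        help="Gemini API key (or set GEMINI_API_KEY env var)")
    parser.add_argument("--output", "-o",
                        help="Output SRT file path (default: video_name.srt)")
    # Assumed default: flash, as implied by the Basic Usage examples.
    parser.add_argument("--model", choices=["flash", "pro"], default="flash",
                        help="Gemini model: flash (faster) or pro (more accurate)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.video_file, args.model, args.output)
```

Using `choices` makes `argparse` reject any model name other than `flash` or `pro` before any processing starts.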
Output Files
For input video my_video.mp4, the tool generates:
- my_video.srt - Standard subtitle file
- my_video.wav - Extracted audio file
- my_video_embedded.mp4 - Video with embedded subtitles (can be toggled on/off)
- my_video_captioned.mp4 - Video with burnt-in subtitles (always visible)
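The naming scheme above, and the kind of ffmpeg invocations that could produce each file, can be sketched as follows. All function names are hypothetical, and the ffmpeg arguments are common recipes rather than the exact commands the tool runs. The sketch also assumes the video outputs always use the .mp4 extension, which the README only shows for an .mp4 input.

```python
from pathlib import Path
from typing import Dict, List


def output_paths(video: Path) -> Dict[str, Path]:
    """Derive the four documented output names next to the input video."""
    stem, parent = video.stem, video.parent
    return {
        "srt": parent / f"{stem}.srt",
        "audio": parent / f"{stem}.wav",
        "embedded": parent / f"{stem}_embedded.mp4",
        "captioned": parent / f"{stem}_captioned.mp4",
    }


def extract_audio_cmd(video: Path, wav: Path) -> List[str]:
    """Drop the video stream (-vn) and write 16-bit PCM audio."""
    return ["ffmpeg", "-y", "-i", str(video), "-vn", "-acodec", "pcm_s16le", str(wav)]


def embed_subtitles_cmd(video: Path, srt: Path, out: Path) -> List[str]:
    """Copy the streams and add the SRT as a toggleable mov_text track."""
    return ["ffmpeg", "-y", "-i", str(video), "-i", str(srt),
            "-c", "copy", "-c:s", "mov_text", str(out)]


def burn_subtitles_cmd(video: Path, srt: Path, out: Path) -> List[str]:
    """Re-encode the video with subtitles drawn into the frames."""
    # Paths with special characters need escaping for the subtitles filter.
    return ["ffmpeg", "-y", "-i", str(video), "-vf", f"subtitles={srt}", str(out)]


if __name__ == "__main__":
    paths = output_paths(Path("my_video.mp4"))
    print(" ".join(extract_audio_cmd(Path("my_video.mp4"), paths["audio"])))
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Embedding only copies streams, so it is fast, while burning in subtitles re-encodes the video and takes longer.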
Model Comparison
| Model | Speed | Accuracy | Cost | Best For |
|---|---|---|---|---|
| Flash | Fast | Good | Lower | Quick processing, bulk videos |
| Pro | Slower | Better | Higher | High-quality transcription, important content |
Requirements
- Python 3.8+
- ffmpeg
- Google Gemini API key
- Required Python packages: google-genai, pydantic, python-dotenv