Automatic subtitle generation using Gemini AI

Simple Auto Subtitle (Coauthor: Claude)

Automatic subtitle generation tool that extracts audio from video files and generates subtitles using Google's Gemini AI models.

Features

  • Audio Extraction: Automatically extracts audio from video files using ffmpeg
  • AI Transcription: Uses Gemini 2.5 models (Flash or Pro) for accurate speech-to-text with precise timestamps
  • Multiple Output Formats:
    • .srt subtitle files
    • Videos with embedded subtitles (toggleable)
    • Videos with burnt-in subtitles (permanent)
    • Extracted audio files (.wav)
  • Model Selection: Choose between Gemini 2.5 Flash (faster) and Gemini 2.5 Pro (more accurate)
  • Structured Output: Uses Pydantic models for reliable JSON parsing
  • Environment Variables: Supports .env files for API key management
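
To make the structured-output idea concrete, here is a minimal sketch of turning timestamped transcript segments into .srt text. The Segment class and its field names are assumptions, since the tool's actual Pydantic schema is not shown in this README, and a standard-library dataclass stands in for a Pydantic model so the snippet runs without extra dependencies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical transcript segment; field names are illustrative only."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp format HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

Validating the model's JSON against a typed schema before formatting means a malformed timestamp fails early rather than producing a broken subtitle file.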

Installation

From PyPI (Recommended)

pip install simple-auto-subtitle

From Source

git clone https://github.com/yunfanye/auto_subtitle.git
cd auto_subtitle
pip install -e .

Development Installation

git clone https://github.com/yunfanye/auto_subtitle.git
cd auto_subtitle
pip install -r requirements.txt
pip install -e .

Setup

  1. Install ffmpeg (required for audio/video processing):

    • macOS: brew install ffmpeg
    • Ubuntu: sudo apt install ffmpeg
    • Windows: Download from ffmpeg.org
  2. Set up Gemini API Key: Create a .env file in your working directory:

    GEMINI_API_KEY=your_api_key_here
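
Both setup steps can be checked from Python before processing a video. The sketch below is not part of the package; preflight_problems is a hypothetical helper. It only inspects the process environment, so it assumes any .env file has already been loaded (for example with python-dotenv's load_dotenv()).

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def preflight_problems(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable setup problems (empty if none)."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # An empty or whitespace-only key is treated as missing.
    if not env.get("GEMINI_API_KEY", "").strip():
        problems.append("GEMINI_API_KEY is not set")
    return problems
```

Calling preflight_problems() with no arguments checks the real environment; the env and which parameters exist only so the check can be exercised without touching the system.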
    

Usage

Basic Usage

# Process default video (test.mp4) with Flash model
simple-auto-subtitle

# Process a specific video file
simple-auto-subtitle my_video.mp4

# Use Pro model for better accuracy
simple-auto-subtitle my_video.mp4 --model pro

# Alternative command
auto-subtitle my_video.mp4

Command Line Options

simple-auto-subtitle [video_file] [options]

Arguments:
  video_file              Video file to process (default: test.mp4)

Options:
  --api-key API_KEY       Gemini API key (or set GEMINI_API_KEY env var)
  --output, -o OUTPUT     Output SRT file path (default: video_name.srt)
  --model {flash,pro}     Gemini model: flash (faster) or pro (more accurate)

Output Files

For input video my_video.mp4, the tool generates:

  • my_video.srt - Standard subtitle file
  • my_video.wav - Extracted audio file
  • my_video_embedded.mp4 - Video with embedded subtitles (can be toggled on/off)
  • my_video_captioned.mp4 - Video with burnt-in subtitles (always visible)
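
The naming scheme above, and the kind of ffmpeg invocations that could produce each file, can be sketched as follows. The helper names are hypothetical, the exact ffmpeg arguments the tool passes are not documented here, and the sketch assumes outputs are written next to the input video. The flags themselves (-vn, -c:s mov_text, and the subtitles filter) are standard ffmpeg options.

```python
from pathlib import Path
from typing import Dict, List


def output_paths(video: str) -> Dict[str, Path]:
    """Derive the four output paths from the input video's name."""
    src = Path(video)
    stem = src.stem
    return {
        "srt": src.with_name(f"{stem}.srt"),
        "wav": src.with_name(f"{stem}.wav"),
        "embedded": src.with_name(f"{stem}_embedded.mp4"),
        "captioned": src.with_name(f"{stem}_captioned.mp4"),
    }


def ffmpeg_commands(video: str) -> Dict[str, List[str]]:
    """Build (but do not run) illustrative ffmpeg argument lists."""
    out = output_paths(video)
    return {
        # -vn drops the video stream; the result is written as WAV audio.
        "wav": ["ffmpeg", "-y", "-i", video, "-vn", str(out["wav"])],
        # Soft subtitles: copy audio/video, store the SRT as a mov_text track.
        "embedded": ["ffmpeg", "-y", "-i", video, "-i", str(out["srt"]),
                     "-c", "copy", "-c:s", "mov_text", str(out["embedded"])],
        # Hard subtitles: the subtitles filter renders text into the frames.
        "captioned": ["ffmpeg", "-y", "-i", video,
                      "-vf", f"subtitles={out['srt']}", str(out["captioned"])],
    }
```

Each list can be passed to subprocess.run. Note that the subtitles filter needs extra escaping when the .srt path contains characters such as colons or quotes, which this sketch does not handle.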

Model Comparison

Model   Speed    Accuracy   Cost     Best For
Flash   Fast     Good       Lower    Quick processing, bulk videos
Pro     Slower   Better     Higher   High-quality transcription, important content
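
For bulk processing, one option is to drive the CLI from a short script and pick the model per batch. This sketch uses only the documented command and --model flag. The build_command and process_all helpers and the important parameter are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def build_command(video: Path, important: bool = False) -> List[str]:
    """Use Pro for important content and Flash for everything else."""
    model = "pro" if important else "flash"
    return ["simple-auto-subtitle", str(video), "--model", model]


def process_all(videos: Iterable[Path], important: bool = False) -> None:
    """Run the CLI once per video, stopping at the first failure."""
    for video in videos:
        # check=True raises CalledProcessError if the tool exits non-zero.
        subprocess.run(build_command(video, important), check=True)
```

For example, process_all(sorted(Path("videos").glob("*.mp4"))) would subtitle every .mp4 file in a videos directory with the Flash model.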

Requirements

  • Python 3.8+
  • ffmpeg
  • Google Gemini API key
  • Required Python packages: google-genai, pydantic, python-dotenv
