Subtitle Generator

AI-powered subtitle generation using Whisper for accurate speech-to-text transcription.

License: MIT. Requires Python 3.9+.

Features

  • 🎯 Multi-format output - VTT, SRT, TXT, JSON, LRC, ASS, TTML
  • 🚀 Fast processing - Powered by whisper.cpp for high-performance inference
  • 📦 Batch processing - Process multiple videos at once
  • 🔄 Video embedding - Embed subtitles directly into videos
  • 🌍 Multilingual - Support for multiple languages

Installation

pip install subtitle-generator

Prerequisites

This package shells out to whisper-cli, the whisper.cpp command-line binary. The binary is not bundled in the wheel because whisper.cpp is native code that must be built per operating system, so you need to install it once.

  • FFmpeg is required for video/audio processing:

    # macOS
    brew install ffmpeg
    
    # Ubuntu/Debian
    sudo apt install ffmpeg
    
    # Windows (via chocolatey)
    choco install ffmpeg
    
  • whisper-cli (the whisper.cpp transcription binary). The recommended setup is one command, run once after pip install:

    subtitle setup-whisper
    

    This clones the project's compatible whisper.cpp fork into your user data directory (e.g. ~/Library/Application Support/subtitle-generator/ on macOS) and builds it with CMake. Every subsequent subtitle <video> invocation then finds the binary automatically, with no environment variables or PATH editing required.

    Build prerequisites for setup-whisper:

    # macOS
    xcode-select --install     # provides clang++ + make
    brew install cmake ffmpeg
    
    # Linux (Debian/Ubuntu)
    sudo apt-get install -y build-essential cmake git ffmpeg
    
    # Windows
    # Install Visual Studio Build Tools, CMake, Git for Windows.
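
Before running subtitle setup-whisper, it can help to confirm that the build tools are actually on your PATH. The following is a minimal sketch, not part of the package: the helper name missing_tools and the REQUIRED_TOOLS tuple are illustrative assumptions, and the tool list mirrors the Linux line above (compilers are not checked).

```python
import shutil

# Tools named in the build prerequisites above; adjust for your platform.
# This tuple is an illustrative assumption, not something the package exports.
REQUIRED_TOOLS = ("ffmpeg", "cmake", "git")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found; you can run: subtitle setup-whisper")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default shutil.which is enough.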
    

    The CLI looks for the binary in this order and uses the first match:

    1. --whisper-binary /path/to/whisper-cli
    2. SUBTITLE_WHISPER_BINARY environment variable
    3. The binary installed by subtitle setup-whisper (preferred over PATH)
    4. whisper-cli / whisper-cpp / main on your PATH
    5. ./binary/whisper-cli relative to the current directory (legacy)
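
The lookup order above can be sketched in Python as follows. This is an illustration of the documented precedence, not the package's actual implementation: the function name find_whisper_binary and its parameters are hypothetical, the location of the setup-whisper build is passed in by the caller rather than guessed, and whether the real CLI validates the paths in steps 1 and 2 is not stated in this README.

```python
import os
import shutil
from pathlib import Path


def find_whisper_binary(cli_path=None, env=None, setup_binary=None, cwd=None):
    """Return the first whisper-cli candidate in the documented order, or None."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # 1. Explicit --whisper-binary value (taken as-is in this sketch).
    if cli_path:
        return str(cli_path)
    # 2. SUBTITLE_WHISPER_BINARY environment variable (taken as-is).
    if env.get("SUBTITLE_WHISPER_BINARY"):
        return env["SUBTITLE_WHISPER_BINARY"]
    # 3. Binary built by `subtitle setup-whisper`; the caller supplies its path.
    if setup_binary and Path(setup_binary).is_file():
        return str(setup_binary)
    # 4. Known executable names on PATH.
    for name in ("whisper-cli", "whisper-cpp", "main"):
        found = shutil.which(name, path=env.get("PATH"))
        if found:
            return found
    # 5. Legacy ./binary/whisper-cli relative to the current directory.
    legacy = cwd / "binary" / "whisper-cli"
    return str(legacy) if legacy.is_file() else None
```

Because step 3 comes before step 4, a binary built by setup-whisper wins over a Homebrew whisper-cpp that happens to be on PATH, which matches the note below about the Homebrew formula.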

    Note: Homebrew's whisper-cpp formula (currently 1.8.4) dropped the -vi flag and only accepts pre-extracted audio (flac/mp3/ogg/wav), so it is not sufficient for this package's "pass a video, get subtitles" UX. Use subtitle setup-whisper to build a compatible binary instead.

Quick Start

# Generate subtitles (VTT format) into the current directory
subtitle video.mp4

# Generate SRT format
subtitle video.mp4 --format srt

# Embed subtitles into video
subtitle video.mp4 --merge

# Use a larger model for better accuracy
subtitle video.mp4 --model large

# Route output somewhere other than the current directory
subtitle /path/to/video.mp4 --output-dir ~/subs

Output lands in the directory you ran the command from (or --output-dir if you pass it), regardless of where the input video lives.
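
As a rough illustration of that rule, the sketch below computes where a subtitle file would land. The function name subtitle_output_path is hypothetical, and the file name (input stem plus the format as extension) is an assumption; this README does not spell out the naming scheme.

```python
from pathlib import Path


def subtitle_output_path(video, fmt="vtt", output_dir=None, cwd=None):
    """Return the assumed output path for `video` in the given format."""
    if output_dir:
        # --output-dir wins when it is passed.
        base = Path(output_dir).expanduser()
    else:
        # Otherwise the current working directory, not the video's directory.
        base = Path(cwd) if cwd else Path.cwd()
    # Assumption: output name is the input file's stem plus the format extension.
    return base / f"{Path(video).stem}.{fmt}"


print(subtitle_output_path("/path/to/video.mp4", fmt="srt", output_dir="/tmp/subs"))
```

Note that the directory of the input video never enters the calculation, which is the behavior described above.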

CLI Commands

Command                              Description
subtitle <video>                     Generate subtitles for a video
subtitle models --list               List available Whisper models
subtitle models --download <model>   Download a specific model
subtitle batch --input-dir <dir>     Batch process multiple videos
subtitle formats                     Show supported output formats

Options

Option             Description
--model, -m        Model to use: tiny, base, small, medium, large
--format, -f       Output format: vtt, srt, txt, json, lrc
--merge            Embed subtitles into the video file
--output-dir, -o   Where to write the subtitle (default: current dir)
--threads, -t      Number of processing threads
--whisper-binary   Override the auto-discovered whisper-cli path
--models-dir       Override the model cache directory
--verbose, -v      Enable verbose output
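
If you drive the CLI from a script, the options above map naturally onto an argument list. The helper below is a small sketch under that assumption: build_subtitle_command is a hypothetical name, it only uses flags listed in the table, and it builds the command without running it.

```python
import shlex


def build_subtitle_command(video, model=None, fmt=None, output_dir=None,
                           merge=False, threads=None, verbose=False):
    """Build an argument list for the `subtitle` CLI from the documented options."""
    cmd = ["subtitle", str(video)]
    if model:
        cmd += ["--model", model]
    if fmt:
        cmd += ["--format", fmt]
    if output_dir:
        cmd += ["--output-dir", str(output_dir)]
    if threads:
        cmd += ["--threads", str(threads)]
    if merge:
        cmd.append("--merge")
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    cmd = build_subtitle_command("my video.mp4", model="small", fmt="srt")
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids shell quoting problems with file names that contain spaces.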

Python API

from subtitle_generator.core import SubtitleGenerator, WhisperCppTranscriber
from subtitle_generator.models import ModelManager

# Point the transcriber at your whisper-cli binary (here, the legacy ./binary location).
transcriber = WhisperCppTranscriber(binary_path="./binary/whisper-cli")
generator = SubtitleGenerator(transcriber=transcriber, model_manager=ModelManager())

# Transcribe video.mp4 with the base model and write an SRT file into ./data.
result = generator.generate(
    input_path="video.mp4",
    model_name="base",
    output_format="srt",
    output_dir="data",
)
print(f"Subtitles saved to: {result.output_path}")

Models

Model    Size     Speed   Accuracy
tiny     ~75MB    ⚡⚡⚡⚡    ⭐⭐
base     ~140MB   ⚡⚡⚡     ⭐⭐⭐
small    ~460MB   ⚡⚡      ⭐⭐⭐⭐
medium   ~1.5GB   ⚡       ⭐⭐⭐⭐⭐
large    ~3GB     🐢      ⭐⭐⭐⭐⭐

Tip: For English-only content, use the .en models (e.g., base.en) for faster processing.
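
A script that picks the model name could apply the tip like this. It is a sketch with assumptions: model_name is a hypothetical helper, and the set of sizes with .en variants follows the upstream Whisper releases (tiny through medium, none for large). Check subtitle models --list to see which names this package actually offers.

```python
# Assumption: English-only variants exist for these sizes, as in upstream Whisper.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def model_name(size, english_only=False):
    """Return the model name to request, adding '.en' where a variant is assumed to exist."""
    if english_only and size in ENGLISH_ONLY_SIZES:
        return f"{size}.en"
    return size


print(model_name("base", english_only=True))
print(model_name("large", english_only=True))
```

The result can be passed to --model on the command line or as model_name in the Python API.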

License

MIT License - see LICENSE for details.
