
Local subtitle generation and translation CLI using Whisper and NLLB-200


lsub

Local AI-powered subtitle generation and translation


Extract and translate video subtitles using Whisper and NLLB-200.

Features

  • Automatic speech recognition using OpenAI Whisper (auto-detects language)
  • Multi-language translation using Meta's NLLB-200 model
  • Subtitle embedding into video files (MP4 or MKV)
  • Multiple subtitle tracks with proper language naming
  • SRT and ASS format support (ASS for better Unicode/CJK character rendering)
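For readers unfamiliar with the SRT format mentioned above, here is a minimal sketch of how timed segments can be rendered as SRT cues. The helper names `srt_timestamp` and `to_srt` are hypothetical and are not taken from lsub's code; the input is assumed to be a list of `(start_seconds, end_seconds, text)` tuples, similar to what a speech recognizer produces.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.5, "Hello"), (1.5, 4.0, "World")]))
```

SRT numbers its cues from 1 and uses a comma before the milliseconds, which is the detail most often gotten wrong when writing the format by hand.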

Install

Recommended (fast, reproducible):

uv tool install lsub

Run without installing:

uvx lsub video.mp4 -t en es

With pip:

pip install lsub

Usage

# Basic usage - extract subtitles (auto-detect language)
lsub video.mp4

# Extract and translate to English and Spanish
lsub video.mp4 -t en es

# Specify source language explicitly
lsub video.mp4 -l zh -t en

# Use a different Whisper model (default: turbo)
lsub video.mp4 -m large -t en

# Output as MKV (better subtitle support for Unicode/CJK)
lsub video.mp4 -t en zh -f mkv

# Generate SRT files only (no embedding)
lsub video.mp4 -t en es --srt-only

# Custom output path
lsub video.mp4 -t en -o output_with_subs.mp4
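
To drive the CLI from a script (for example, to process a folder of videos), the commands above can be assembled programmatically. The sketch below uses only the flags shown in this section; the function names `build_lsub_command` and `run_lsub` are hypothetical, and the sketch assumes `lsub` is on your PATH.

```python
import subprocess


def build_lsub_command(video, targets=(), source=None, model=None,
                       fmt=None, output=None, srt_only=False):
    """Build an argument list for the lsub CLI from the flags shown above."""
    cmd = ["lsub", str(video)]
    if source:
        cmd += ["-l", source]        # explicit source language
    if model:
        cmd += ["-m", model]         # Whisper model name
    if targets:
        cmd += ["-t", *targets]      # one or more target languages
    if fmt:
        cmd += ["-f", fmt]           # container format, e.g. "mkv"
    if output:
        cmd += ["-o", str(output)]   # custom output path
    if srt_only:
        cmd.append("--srt-only")     # subtitle files only, no embedding
    return cmd


def run_lsub(video, **options):
    """Run lsub and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_lsub_command(video, **options), check=True)


if __name__ == "__main__":
    print(build_lsub_command("video.mp4", targets=["en", "zh"], fmt="mkv"))
```

Passing the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.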

Supported Languages

Translation supports 12 common languages:

  • en - English
  • es - Spanish
  • fr - French
  • de - German
  • it - Italian
  • pt - Portuguese
  • ru - Russian
  • ja - Japanese
  • ko - Korean
  • zh - Chinese
  • ar - Arabic
  • hi - Hindi

More languages are listed in the NLLB-200 documentation.
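
NLLB-200 itself identifies languages with FLORES-200 codes (a language code plus a script, such as `eng_Latn`) rather than two-letter codes. How lsub maps its short codes internally is not documented here, so the table below is an assumption for illustration only; in particular, mapping `zh` to Simplified Chinese (`zho_Hans`) and `ar` to Modern Standard Arabic (`arb_Arab`) are guesses. The names `NLLB_CODES` and `to_nllb_code` are hypothetical.

```python
# Assumed mapping from the short codes listed above to FLORES-200 codes.
NLLB_CODES = {
    "en": "eng_Latn",
    "es": "spa_Latn",
    "fr": "fra_Latn",
    "de": "deu_Latn",
    "it": "ita_Latn",
    "pt": "por_Latn",
    "ru": "rus_Cyrl",
    "ja": "jpn_Jpan",
    "ko": "kor_Hang",
    "zh": "zho_Hans",  # assumption: Simplified Chinese
    "ar": "arb_Arab",  # assumption: Modern Standard Arabic
    "hi": "hin_Deva",
}


def to_nllb_code(short_code: str) -> str:
    """Translate a short language code into a FLORES-200 code."""
    try:
        return NLLB_CODES[short_code.lower()]
    except KeyError:
        supported = ", ".join(sorted(NLLB_CODES))
        raise ValueError(
            f"Unsupported language {short_code!r}; expected one of: {supported}"
        ) from None


if __name__ == "__main__":
    print(to_nllb_code("ja"))
```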

Whisper Models

Available models (trade-off between speed and accuracy):

  • tiny - Fastest, least accurate
  • base - Fast, decent accuracy
  • small - Balanced
  • medium - Good accuracy, slower
  • large - Best accuracy, slowest
  • turbo - Fast and accurate (default)

Output Formats

  • MP4 (default): Widely compatible, but with limited Unicode support for subtitles
  • MKV: Better subtitle support, recommended for Chinese/Japanese/Korean content

MKV uses ASS format with embedded font information for proper CJK character rendering.
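
The recommendation above can be expressed as a small rule for scripts that choose the container automatically. This is a sketch of that rule only, not lsub's behavior: lsub defaults to MP4 unless `-f mkv` is passed, and the function name `recommended_container` is hypothetical.

```python
# Languages for which the text above recommends MKV output.
CJK_LANGUAGES = {"zh", "ja", "ko"}


def recommended_container(languages) -> str:
    """Return "mkv" if any subtitle language is CJK, otherwise "mp4"."""
    if any(code.lower() in CJK_LANGUAGES for code in languages):
        return "mkv"
    return "mp4"


if __name__ == "__main__":
    print(recommended_container(["en", "zh"]))
    print(recommended_container(["en", "es"]))
```

The returned value can be passed straight to the `-f` option.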

Development

uv sync
uv run lsub video.mp4 -t en     # run the CLI using local code

# optional: editable install
uv pip install -e .

./scripts/release.sh            # release a new version

Notes

  • The first run downloads the Whisper model (~1.5GB for turbo) and the NLLB-200 model (~1.2GB)
  • Models are cached in ~/.cache/huggingface/
  • Requires ffmpeg to be installed on your system
  • Generated subtitle files (.srt and .ass) are saved alongside the video
  • For best CJK (Chinese/Japanese/Korean) subtitle rendering, use MKV format (-f mkv)
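
Because ffmpeg is an external requirement, a script that wraps lsub may want to check for it before starting a long transcription. The following is a minimal sketch using only the Python standard library; the function name `missing_tools` is hypothetical, and the sketch checks only that an executable is on PATH, not its version.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the names of required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing))
    else:
        print("All required tools found.")
```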
