Generate SRT subtitles from video/audio files using Whisper

makesub

makesub is a command-line tool that automatically generates SRT subtitle files from any video or audio file. It uses OpenAI Whisper, a state-of-the-art speech recognition model, to transcribe spoken audio into accurate, timestamped subtitles.

No API key required. Everything runs locally on your machine.

makesub lecture.mp4
# → lecture.srt

Who is this for?

  • Content creators who want subtitles for YouTube videos, reels, or podcasts
  • Developers building subtitle pipelines
  • Researchers transcribing interviews or recordings
  • Anyone who needs fast, offline, accurate subtitles from a video file

Features

  • Generates standard .srt subtitle files ready for use in any video editor or player
  • Powered by OpenAI Whisper running locally: no API key needed, and no internet connection needed once the model has been downloaded on first use
  • Supports 99+ languages with automatic language detection
  • Native Apple Silicon support (MPS acceleration on M1/M2/M3 Macs)
  • Handles MP4, MOV, MKV, AVI, MP3, WAV, M4A, and any format ffmpeg can read
  • Clear error messages for common problems (missing ffmpeg, no audio track, silent video, etc.)
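For reference, an .srt file is a plain-text list of numbered cues. Each cue has an index, a start --> end line with timestamps in HH:MM:SS,mmm form, and the subtitle text. The sketch below shows how timed segments map onto that layout. The helper names and the example segments are illustrative and are not taken from makesub's source.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render (start, end, text) tuples as SRT cues, numbered from 1."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical segments, shaped like the output of a speech recognizer.
example = [(0.0, 2.5, "Welcome to the lecture."), (2.5, 6.04, "Today we cover subtitles.")]
print(segments_to_srt(example))
```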

Requirements

  • Python 3.9+
  • ffmpeg — required for audio decoding

Install ffmpeg on macOS:

brew install ffmpeg

Install ffmpeg on Ubuntu/Debian:

sudo apt install ffmpeg
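To confirm that ffmpeg is visible to Python after installing it, a quick check along these lines may help. It is a minimal sketch using only the standard library; the function names are made up for illustration, and this is not necessarily how makesub itself detects ffmpeg.

```python
import shutil
import subprocess


def find_ffmpeg():
    """Return the path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(path: str) -> str:
    """Return the first line printed by `ffmpeg -version`, or '' if nothing was printed."""
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg not found on PATH; install it with the commands above.")
else:
    print(ffmpeg_version_line(ffmpeg_path))
```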

Installation

pip install makesub

Apple Silicon (M1/M2/M3): Install PyTorch first to ensure you get the MPS-accelerated build, then install makesub:

pip install torch
pip install makesub
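To see which accelerator the installed PyTorch build can use, something like the following should work. It is a sketch: the preference order (CUDA, then MPS, then CPU) and the function name pick_device are assumptions for illustration, not makesub's documented behavior.

```python
def pick_device(requested: str = "auto") -> str:
    """Resolve a device name; with 'auto', prefer CUDA, then MPS, then CPU."""
    if requested != "auto":
        return requested
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no accelerator is usable
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

On an Apple Silicon Mac with an MPS-enabled PyTorch build, this would be expected to print mps.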

Usage

makesub <video_or_audio_file> [options]

The subtitle file is written to the same directory as the input file by default.

makesub video.mp4
# Output: video.srt
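The default naming rule (same directory, same name, .srt extension) can be sketched with pathlib. The handling of an explicit output path or directory mirrors the --output option listed under Options, but the function itself is hypothetical and may differ from makesub's implementation.

```python
from pathlib import Path
from typing import Optional


def resolve_output(input_path: str, output: Optional[str] = None) -> Path:
    """Work out where the .srt file should be written."""
    source = Path(input_path).expanduser()
    if output is None:
        # Default: next to the input, with the extension swapped for .srt.
        return source.with_suffix(".srt")
    target = Path(output).expanduser()
    if target.is_dir():
        # An existing directory was given: keep the input's base name.
        return target / (source.stem + ".srt")
    return target


print(resolve_output("video.mp4"))
```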

Options

Flag        Default          Description
--model     auto             Whisper model: tiny, base, small, medium, large, large-v3, turbo, or auto to pick based on available memory
--language  en               Language code (e.g. en, fr, de, ja, zh). Use auto to detect automatically
--output    alongside input  Output .srt path or directory
--device    auto             Force compute device: cpu, mps, cuda
--verbose   off              Print each decoded segment in real time
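For developers wrapping makesub in their own tooling, the options above correspond roughly to the following argparse definition. This is a reconstruction from the table only, so treat the parser as an assumption; makesub's real argument handling may differ in details such as validation.

```python
import argparse

MODELS = ["tiny", "base", "small", "medium", "large", "large-v3", "turbo", "auto"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        prog="makesub", description="Generate SRT subtitles with Whisper"
    )
    parser.add_argument("input", help="video or audio file")
    parser.add_argument("--model", choices=MODELS, default="auto")
    parser.add_argument("--language", default="en", help="language code, or 'auto' to detect")
    parser.add_argument("--output", default=None, help="output .srt path or directory")
    parser.add_argument("--device", choices=["auto", "cpu", "mps", "cuda"], default="auto")
    parser.add_argument("--verbose", action="store_true")
    return parser


args = build_parser().parse_args(["foreign_film.mp4", "--language", "auto", "--model", "medium"])
print(args)
```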

Examples

# Subtitle an English video (default)
makesub interview.mp4

# Use a more accurate model for better results
makesub documentary.mp4 --model medium

# Auto-detect the spoken language
makesub foreign_film.mp4 --language auto

# Subtitle a French video
makesub podcast.mp3 --language fr

# Save the subtitle file to a specific location
makesub recording.mov --output ~/Desktop/recording.srt

# Watch segments appear in real time (useful for long files)
makesub lecture.mp4 --verbose
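In a subtitle pipeline, the same commands can be driven from Python. The sketch below builds one makesub call per media file in a folder and skips files that already have an .srt next to them. It uses only the flags documented above; the function names, the suffix list, and the skip rule are illustrative choices.

```python
import subprocess
from pathlib import Path

MEDIA_SUFFIXES = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def makesub_command(media: Path, model: str = "auto", language: str = "en") -> list:
    """Build the argument list for one makesub invocation."""
    return ["makesub", str(media), "--model", model, "--language", language]


def subtitle_folder(folder: str, dry_run: bool = True) -> list:
    """Collect (and optionally run) makesub commands for files lacking subtitles."""
    commands = []
    for media in sorted(Path(folder).iterdir()):
        if media.suffix.lower() not in MEDIA_SUFFIXES:
            continue
        if media.with_suffix(".srt").exists():
            continue  # already subtitled
        command = makesub_command(media)
        commands.append(command)
        if not dry_run:
            subprocess.run(command, check=True)
    return commands


# Dry run: print the commands without executing them.
for command in subtitle_folder(".", dry_run=True):
    print(" ".join(command))
```

Pass dry_run=False to actually run the commands; check=True makes the loop stop at the first file that fails.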

Choosing a model

Larger models are slower but generally produce more accurate subtitles. By default (--model auto), makesub checks your available RAM and GPU memory and picks the largest model that fits comfortably.

Available memory  Auto-selected model
< 4 GB RAM        tiny
4–8 GB RAM        base
8–16 GB RAM       small
16 GB+ RAM        medium
2–5 GB VRAM       small / medium
10 GB+ VRAM       large-v3

You can always override with --model <name>.
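The memory tiers above can be read as a simple lookup. The following sketch encodes them under stated assumptions: each RAM range includes its lower bound, 10 GB or more of VRAM takes priority over RAM, and the "small / medium" VRAM row is left out because the table does not say how that choice is made. The function name auto_model is hypothetical.

```python
from typing import Optional


def auto_model(ram_gb: float, vram_gb: Optional[float] = None) -> str:
    """Pick a model name from available memory, following the table above."""
    if vram_gb is not None and vram_gb >= 10:
        return "large-v3"
    # Smaller VRAM amounts are not modeled here; fall back to the RAM tiers.
    if ram_gb < 4:
        return "tiny"
    if ram_gb < 8:
        return "base"
    if ram_gb < 16:
        return "small"
    return "medium"


for ram in (2, 6, 12, 32):
    print(ram, "GB RAM ->", auto_model(ram))
```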

Model     Size    Relative speed  Best for
tiny      75 MB   ~32x            Quick drafts, short clips
base      145 MB  ~16x            Everyday use
small     465 MB  ~6x             Better accuracy, still fast
medium    1.5 GB  ~2x             High accuracy
large-v3  3 GB    1x              Best possible accuracy
turbo     810 MB  ~8x             Fast with good accuracy

Models are downloaded automatically on first use and cached in ~/.cache/whisper/.
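To see which models are already cached and how much disk space they use, a short script like this can list the cache directory. It assumes the checkpoints are stored there as .pt files, which is how Whisper saves them; the function name is illustrative.

```python
from pathlib import Path


def cached_models(cache_dir: Path = Path.home() / ".cache" / "whisper") -> dict:
    """Map each cached checkpoint file name to its size in megabytes."""
    if not cache_dir.is_dir():
        return {}  # nothing has been downloaded yet
    return {
        f.name: round(f.stat().st_size / 1_000_000, 1)
        for f in sorted(cache_dir.glob("*.pt"))
    }


for name, size_mb in cached_models().items():
    print(f"{name}: {size_mb} MB")
```

Deleting a file from this directory frees the space; the model would be downloaded again the next time it is requested.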


Supported file formats

Any format that ffmpeg can decode, including:

mp4 mov mkv avi webm flv m4v mp3 wav m4a aac ogg flac wma


Troubleshooting

ffmpeg not found
Install ffmpeg (see Requirements above).

No speech detected
Try --language auto if the video is not in English. Check that the video actually has an audio track.

Not enough memory to load the model
Switch to a smaller model: --model small or --model tiny.

Permission denied reading a file on macOS
Terminal may need Full Disk Access. Go to System Settings > Privacy & Security > Full Disk Access and enable your terminal app.
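When "No speech detected" appears, one way to check whether a file has an audio track at all is to ask ffprobe, which ships with ffmpeg. This is a sketch under the assumption that ffprobe is installed alongside ffmpeg; the helper names are made up, and makesub's own check may work differently.

```python
import shutil
import subprocess


def ffprobe_audio_command(media_path: str) -> list:
    """Build an ffprobe call that prints one line per audio stream in the file."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a",
        "-show_entries", "stream=codec_name",
        "-of", "csv=p=0",
        media_path,
    ]


def has_audio_track(media_path: str):
    """Return True or False, or None when ffprobe is not installed."""
    if shutil.which("ffprobe") is None:
        return None
    result = subprocess.run(
        ffprobe_audio_command(media_path), capture_output=True, text=True
    )
    # Any output means at least one audio stream was found.
    return bool(result.stdout.strip())


print(has_audio_track("video.mp4"))
```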


License

MIT


Acknowledgements

Built on top of OpenAI Whisper. Audio decoding powered by ffmpeg.
