Generate SRT subtitles from video/audio files using Whisper
makesub
makesub is a command-line tool that automatically generates SRT subtitle files from any video or audio file. It uses OpenAI Whisper, a state-of-the-art speech recognition model, to transcribe spoken audio into accurate, timestamped subtitles.
No API key required. Everything runs locally on your machine.
makesub lecture.mp4
# → lecture.srt
Who is this for?
- Content creators who want subtitles for YouTube videos, reels, or podcasts
- Developers building subtitle pipelines
- Researchers transcribing interviews or recordings
- Anyone who needs fast, offline, accurate subtitles from a video file
Features
- Generates standard .srt subtitle files ready for use in any video editor or player
- Powered by OpenAI Whisper: no API key needed, and no internet connection needed once the model has been downloaded
- Supports 99+ languages with automatic language detection
- Native Apple Silicon support (MPS acceleration on M1/M2/M3 Macs)
- Handles MP4, MOV, MKV, AVI, MP3, WAV, M4A, and any format ffmpeg can read
- Clear error messages for common problems (missing ffmpeg, no audio track, silent video, etc.)
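For reference, an SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the subtitle text and a blank line. The sketch below shows how timestamped segments could be rendered in that layout. It is not makesub's actual code: the function names and the segment dictionaries (with `start`, `end`, and `text` keys, similar to what Whisper returns) are assumptions made for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text: cue number, time range, text, then a blank line."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Joining with a newline leaves one blank line between cues.
    return "\n".join(cues)
```

Calling `segments_to_srt` with a list of such dictionaries returns a string that can be written directly to a `.srt` file.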
Requirements
- Python 3.9+
- ffmpeg — required for audio decoding
Install ffmpeg on macOS:
brew install ffmpeg
Install ffmpeg on Ubuntu/Debian:
sudo apt install ffmpeg
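If you want to verify both requirements before installing, a small preflight script along these lines may help. This is a sketch, not part of makesub; the function name `check_requirements` is hypothetical, and it only checks the interpreter version and whether an `ffmpeg` executable is on your PATH.

```python
import shutil
import sys


def check_requirements(min_python=(3, 9)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ is required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("Problem:", problem)
```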
Installation
pip install makesub
Apple Silicon (M1/M2/M3): Install PyTorch first to ensure you get the MPS-accelerated build, then install makesub:
pip install torch
pip install makesub
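To confirm that the installed PyTorch build can actually see an accelerator, you can run a short check like the one below. It is a sketch under assumptions: `pick_device` is a hypothetical helper, and the order in which it prefers devices (CUDA, then MPS, then CPU) is not necessarily the order makesub uses.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" based on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # Assumption: treat a missing PyTorch install as CPU-only.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

On an Apple Silicon Mac with an MPS-enabled PyTorch build, this should print `mps`.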
Usage
makesub <video_or_audio_file> [options]
The subtitle file is written to the same directory as the input file by default.
makesub video.mp4
# Output: video.srt
Options
| Flag | Default | Description |
|---|---|---|
| --model | auto | Whisper model: tiny, base, small, medium, large, large-v3, turbo, or auto to pick based on available memory |
| --language | en | Language code (e.g. en, fr, de, ja, zh). Use auto to detect automatically |
| --output | alongside input | Output .srt path or directory |
| --device | auto | Force compute device: cpu, mps, cuda |
| --verbose | off | Print each decoded segment in real time |
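Because --output accepts either a file path or a directory, it can help to see one plausible way the final path might be resolved. The sketch below is an assumption about the behavior, not makesub's implementation: `resolve_output_path` is a hypothetical name, and the rule that any value not ending in `.srt` is treated as a directory is a guess.

```python
from pathlib import Path


def resolve_output_path(input_file, output=None) -> Path:
    """Work out where the .srt file would be written."""
    src = Path(input_file)
    if output is None:
        # Default: same directory and base name as the input file.
        return src.with_suffix(".srt")
    out = Path(output).expanduser()
    # Assumption: an existing directory, or any path without a .srt
    # suffix, is treated as a directory to write into.
    if out.is_dir() or out.suffix.lower() != ".srt":
        return out / src.with_suffix(".srt").name
    return out
```

For example, `resolve_output_path("video.mp4", "subs")` yields `subs/video.srt` under these assumptions.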
Examples
# Subtitle an English video (default)
makesub interview.mp4
# Use a more accurate model for better results
makesub documentary.mp4 --model medium
# Auto-detect the spoken language
makesub foreign_film.mp4 --language auto
# Subtitle a French video
makesub podcast.mp3 --language fr
# Save the subtitle file to a specific location
makesub recording.mov --output ~/Desktop/recording.srt
# Watch segments appear in real time (useful for long files)
makesub lecture.mp4 --verbose
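For a pipeline that processes many files, the commands above can be driven from a script. The following is a minimal sketch that calls the makesub CLI through `subprocess`, using only the flags documented in the Options table. The helper names, the `MEDIA_SUFFIXES` set, and the rule of skipping files that already have a `.srt` next to them are assumptions for illustration.

```python
import subprocess
from pathlib import Path

# A subset of the supported formats, chosen for this example.
MEDIA_SUFFIXES = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def build_command(media_file, model="auto", language="en"):
    """Build the makesub argument list for one file."""
    return ["makesub", str(media_file), "--model", model, "--language", language]


def subtitle_folder(folder, model="auto", language="en"):
    """Run makesub on each media file in `folder` that has no .srt yet."""
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in MEDIA_SUFFIXES:
            continue
        if path.with_suffix(".srt").exists():
            continue  # already subtitled, skip
        # check=True raises CalledProcessError if makesub exits with an error.
        subprocess.run(build_command(path, model, language), check=True)
```

A call such as `subtitle_folder("recordings", model="small", language="auto")` would then subtitle every new file in the `recordings` directory.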
Choosing a model
Larger models are slower but produce significantly more accurate subtitles. By default, makesub detects your available RAM and GPU memory and picks the largest model that fits comfortably.
| Available memory | Auto-selected model |
|---|---|
| < 4 GB RAM | tiny |
| 4–8 GB RAM | base |
| 8–16 GB RAM | small |
| 16 GB+ RAM | medium |
| 2–5 GB VRAM | small / medium |
| 10 GB+ VRAM | large-v3 |
You can always override with --model <name>.
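The selection table can be read as a simple threshold lookup. The sketch below is one possible reading, not makesub's actual logic, and `pick_model` is a hypothetical name. The table does not say how to choose between small and medium at 2–5 GB VRAM, and it does not cover 5–10 GB VRAM, so this sketch falls back to the RAM rows in those cases. It also assumes the lower bound of each range is inclusive.

```python
from typing import Optional


def pick_model(ram_gb: float, vram_gb: Optional[float] = None) -> str:
    """Map available memory to a model name, following the table above."""
    # GPU row: 10 GB or more of VRAM selects the largest model.
    if vram_gb is not None and vram_gb >= 10:
        return "large-v3"
    # Assumption: with less VRAM (or no GPU), decide from system RAM.
    if ram_gb < 4:
        return "tiny"
    if ram_gb < 8:
        return "base"
    if ram_gb < 16:
        return "small"
    return "medium"
```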
| Model | Size | Relative Speed | Best For |
|---|---|---|---|
| tiny | 75 MB | ~32x | Quick drafts, short clips |
| base | 145 MB | ~16x | Everyday use |
| small | 465 MB | ~6x | Better accuracy, still fast |
| medium | 1.5 GB | ~2x | High accuracy |
| large-v3 | 3 GB | 1x | Best possible accuracy |
| turbo | 810 MB | ~8x | Fast with good accuracy |
Models are downloaded automatically on first use and cached in ~/.cache/whisper/.
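To see which models are already cached and how much disk space they use, you can inspect that directory. The sketch below assumes each model is stored as a single `<name>.pt` checkpoint file, which is how the openai-whisper package lays out its cache; `list_cached_models` is a hypothetical helper, not part of makesub.

```python
from pathlib import Path


def list_cached_models(cache_dir="~/.cache/whisper"):
    """Return {model_name: size_in_MB} for each .pt file in the cache."""
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return {}  # nothing has been downloaded yet
    return {
        f.stem: round(f.stat().st_size / 1_000_000, 1)
        for f in sorted(root.glob("*.pt"))
    }


if __name__ == "__main__":
    for name, size_mb in list_cached_models().items():
        print(f"{name}: {size_mb} MB")
```

Deleting a file from this directory frees the space; the model would be downloaded again the next time it is needed.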
Supported file formats
Any format that ffmpeg can decode, including:
mp4 mov mkv avi webm flv m4v mp3 wav m4a aac ogg flac wma
Troubleshooting
ffmpeg not found
Install ffmpeg — see Requirements above.
No speech detected
Try --language auto if the video is not in English. Check that the video actually has an audio track.
Not enough memory to load the model
Switch to a smaller model: --model small or --model tiny.
Permission denied reading a file on macOS
Terminal may need Full Disk Access. Go to System Settings > Privacy & Security > Full Disk Access and enable your terminal app.
License
MIT
Acknowledgements
Built on top of OpenAI Whisper. Audio decoding powered by ffmpeg.