Caption musical theater, concerts, or any recording using Genius lyrics and Whisper
Musical Recording Captioning
Caption recordings of musical theater, concerts, or any video that has lyrics on Genius.
How it works
- Fetches all track lyrics for an album from the Genius API
- Transcribes the audio locally using Whisper (via faster-whisper) to get word-level timestamps
- Fuzzy-matches each lyric line to the transcript to determine timing
- Writes a .srt subtitle file
All processing after the initial Genius lyrics fetch runs entirely offline.
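The matching step can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: it assumes the transcript arrives as a list of words with start and end times, and it uses `difflib.SequenceMatcher` from the standard library as a stand-in for whatever fuzzy matcher the project really uses. The names `Word`, `normalize`, and `match_line`, and the `0.6` threshold, are hypothetical.

```python
import re
from difflib import SequenceMatcher
from typing import NamedTuple, Optional


class Word(NamedTuple):
    """One transcribed word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so matching ignores formatting."""
    return re.sub(r"[^\w\s']", "", text.lower()).strip()


def match_line(
    line: str, words: list[Word], threshold: float = 0.6
) -> Optional[tuple[float, float]]:
    """Return (start, end) of the best-matching window of words, or None."""
    target = normalize(line)
    n = len(target.split())
    if n == 0 or len(words) < n:
        return None
    best_score, best_span = 0.0, None
    # Slide a window as long as the lyric line across the transcript.
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        candidate = normalize(" ".join(w.text for w in window))
        score = SequenceMatcher(None, target, candidate).ratio()
        if score > best_score:
            best_score, best_span = score, (window[0].start, window[-1].end)
    # Reject weak matches rather than guessing a timing.
    return best_span if best_score >= threshold else None
```

A fixed-size window keeps the sketch short. A real matcher would likely also tolerate inserted or dropped words and search only forward from the previous match.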
Prerequisites
- Python 3.10+
- ffmpeg installed and on your PATH
- A Genius API access token (get one at https://genius.com/api-clients)
Installation
pip install musicalrecordingcaptioning
Or from source:
git clone https://github.com/you/MusicalRecordingCaptioning
cd MusicalRecordingCaptioning
python -m venv .venv && source .venv/bin/activate
pip install -e .
The installer automatically downloads the default Whisper model (small) so the first run works offline.
Configuration
Copy .env.example to .env and fill in your token:
GENIUS_ACCESS_TOKEN=your_token_here
Or pass it directly via --genius-token.
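As an illustration of how the two token sources might be combined, here is a minimal sketch. It assumes the explicit flag takes precedence over the environment and the .env file, which the README implies but does not spell out. The helper names `read_env_file` and `resolve_token` are hypothetical, and the real package may rely on a library such as python-dotenv instead of hand-rolled parsing.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_token(cli_token: Optional[str], env_path: Path = Path(".env")) -> str:
    """Prefer the explicit flag, then the process environment, then .env."""
    token = (
        cli_token
        or os.environ.get("GENIUS_ACCESS_TOKEN")
        or read_env_file(env_path).get("GENIUS_ACCESS_TOKEN")
    )
    if not token:
        raise SystemExit(
            "No Genius token: set GENIUS_ACCESS_TOKEN or pass --genius-token"
        )
    return token
```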
Usage
# Basic usage (token from .env, auto-detects language)
mrc recording.mp4 --url https://genius.com/albums/Lin-manuel-miranda/Hamilton
# Specify language explicitly for faster, more accurate transcription
mrc recording.mp4 --url https://genius.com/albums/Lin-manuel-miranda/Hamilton --language en
# French show
mrc spectacle.mp4 --url https://genius.com/albums/... --language fr
# Specify output file and a larger model for better accuracy
mrc show.mp4 --url https://genius.com/albums/Claude-michel-schonberg/Les-miserables-original-broadway-cast-recording \
--output les-mis.srt --model medium --language en
# Also caption dialogue and non-song audio with the raw Whisper transcript
mrc recording.mp4 --url https://genius.com/albums/Lin-manuel-miranda/Hamilton --keep-transcription
# Pass token explicitly
mrc concert.mp3 --url https://genius.com/albums/Anais-mitchell/Hadestown \
--genius-token abc123
Options
| Option | Default | Description |
|---|---|---|
| --url | required | Genius album URL (https://genius.com/albums/...) |
| --genius-token | from .env | Genius API access token |
| --output | <input>.srt | Output SRT file path |
| --model | small | Whisper model: tiny, base, small, medium, large |
| --language | auto-detect | Audio language as ISO 639-1 code (e.g. en, fr, de) |
| --keep-transcription | off | Fill gaps between songs with raw Whisper transcript captions |
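For readers who want to see how these options and defaults fit together, the following is a hypothetical reconstruction using `argparse`. The actual CLI may be built differently (for example with another argument-parsing library). Deriving the default output by swapping the input's extension for `.srt` is an assumption about what `<input>.srt` means.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented options; defaults follow the table above."""
    parser = argparse.ArgumentParser(prog="mrc")
    parser.add_argument("input", type=Path, help="Audio or video file to caption")
    parser.add_argument("--url", required=True, help="Genius album URL")
    parser.add_argument("--genius-token", default=None,
                        help="Genius API access token (falls back to .env)")
    parser.add_argument("--output", type=Path, default=None,
                        help="Output SRT file path")
    parser.add_argument("--model", default="small",
                        choices=["tiny", "base", "small", "medium", "large"])
    parser.add_argument("--language", default=None,
                        help="ISO 639-1 code; omitted means auto-detect")
    parser.add_argument("--keep-transcription", action="store_true",
                        help="Also caption non-song audio")
    return parser


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.output is None:
        # Assumption: recording.mp4 becomes recording.srt
        args.output = args.input.with_suffix(".srt")
    return args
```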
Caption format
Lyric lines are wrapped with ♪ symbols (♪ lyrics here ♪), matching standard subtitle convention for sung content. When --keep-transcription is enabled, dialogue and non-song audio appear as plain text captions.
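To make the format concrete, here is a small sketch of how a single SRT cue could be rendered. The function names `format_timestamp` and `format_cue` are hypothetical and not taken from the package. The layout itself (index line, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, then the text) is the standard SubRip structure.

```python
def format_timestamp(seconds: float) -> str:
    """SRT timestamps are HH:MM:SS,mmm with a comma before milliseconds."""
    total_ms = max(0, round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def format_cue(index: int, start: float, end: float, text: str, sung: bool = True) -> str:
    """Render one cue; sung lines are wrapped in music-note symbols."""
    body = f"♪ {text} ♪" if sung else text
    return f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{body}\n"


if __name__ == "__main__":
    # A sung lyric followed by a plain dialogue caption.
    print(format_cue(1, 1.0, 2.8, "lyrics here"))
    print(format_cue(2, 3.0, 4.5, "spoken dialogue", sung=False))
```

Cues in an .srt file are separated by a blank line, which is why each rendered cue ends with a newline and `print` adds another.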
Whisper models
| Model | Size | Notes |
|---|---|---|
| tiny | ~75 MB | Fastest, least accurate |
| base | ~145 MB | Good for clear speech |
| small | ~466 MB | Default. Handles singing and background music well |
| medium | ~1.5 GB | High accuracy, slower |
| large | ~3.1 GB | Best accuracy, slow |
Only the small model is downloaded at install time. Other models are downloaded on first use when specified via --model.
File details
Details for the file musicalrecordingcaptioning-0.1.1.tar.gz.
File metadata
- Download URL: musicalrecordingcaptioning-0.1.1.tar.gz
- Upload date:
- Size: 9.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ebae70ebd846d5f4bef532ea7da43c5e7cf1668a517f487d7e02bf36c7310e62 |
| MD5 | 2cde654aaa2f630c1312322b91f6a625 |
| BLAKE2b-256 | 0f775c38adc78a58ad4578b90bff807ebeca0a59d06e1816f2963013daa6f596 |
Provenance
The following attestation bundles were made for musicalrecordingcaptioning-0.1.1.tar.gz:
Publisher: python-publish.yml on athuler/MusicalRecordingCaptioning
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: musicalrecordingcaptioning-0.1.1.tar.gz
  - Subject digest: ebae70ebd846d5f4bef532ea7da43c5e7cf1668a517f487d7e02bf36c7310e62
- Sigstore transparency entry: 1437187710
- Sigstore integration time:
- Permalink: athuler/MusicalRecordingCaptioning@eac4771f849d292156c2eef435c92929186d6c70
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/athuler
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@eac4771f849d292156c2eef435c92929186d6c70
- Trigger Event: release
File details
Details for the file musicalrecordingcaptioning-0.1.1-py3-none-any.whl.
File metadata
- Download URL: musicalrecordingcaptioning-0.1.1-py3-none-any.whl
- Upload date:
- Size: 9.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8b40ff9cd1c2d7b3cf597e6fe2616de0eb4c31ccd9fbd3656904b7233d275555 |
| MD5 | 3853a462f127038db84049620d10f423 |
| BLAKE2b-256 | 6119e4bc0b8141eab7b39318f45eb7ea8a7e4864bb4513b8191f3b73754b1681 |
Provenance
The following attestation bundles were made for musicalrecordingcaptioning-0.1.1-py3-none-any.whl:
Publisher: python-publish.yml on athuler/MusicalRecordingCaptioning
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: musicalrecordingcaptioning-0.1.1-py3-none-any.whl
  - Subject digest: 8b40ff9cd1c2d7b3cf597e6fe2616de0eb4c31ccd9fbd3656904b7233d275555
- Sigstore transparency entry: 1437187713
- Sigstore integration time:
- Permalink: athuler/MusicalRecordingCaptioning@eac4771f849d292156c2eef435c92929186d6c70
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/athuler
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@eac4771f849d292156c2eef435c92929186d6c70
- Trigger Event: release