Caption musical theater, concerts, or any recording using Genius lyrics and Whisper

Musical Recording Captioning

Caption recordings of musical theater, concerts, or any video that has lyrics on Genius.

How it works

  1. Fetches all track lyrics for an album from the Genius API
  2. Transcribes the audio locally using Whisper (via faster-whisper) to get word-level timestamps
  3. Fuzzy-matches each lyric line to the transcript to determine timing
  4. Writes a .srt subtitle file

All processing after the initial Genius lyrics fetch runs entirely offline.
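
Step 3 is the part that is hardest to picture, so here is a minimal sketch of one way to align a lyric line to word-level timestamps. It assumes the transcript is a list of (word, start, end) tuples. The names normalize and match_line, the sliding-window search, and the use of difflib from the standard library are all illustrative assumptions, not necessarily how the package matches internally.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and strip punctuation so 'Shot!' and 'shot' compare equal."""
    return re.sub(r"[^a-z0-9' ]+", " ", text.lower()).split()


def match_line(lyric_line, words, start_index=0, search_span=200, threshold=0.6):
    """Find the transcript window that best matches one lyric line.

    words is a list of (word, start_seconds, end_seconds) tuples.
    Returns (start_seconds, end_seconds, next_index), or None if no
    window scores at least `threshold`.
    """
    target = normalize(lyric_line)
    if not target:
        return None
    n = len(target)
    target_text = " ".join(target)

    best_score, best_index = 0.0, None
    last = min(len(words) - n, start_index + search_span)
    for i in range(start_index, last + 1):
        # Compare the lyric line against n consecutive transcript words.
        window = " ".join(" ".join(normalize(w[0])) for w in words[i:i + n])
        score = SequenceMatcher(None, target_text, window).ratio()
        if score > best_score:
            best_score, best_index = score, i

    if best_index is None or best_score < threshold:
        return None
    first, final = words[best_index], words[best_index + n - 1]
    return first[1], final[2], best_index + n
```

Passing the returned next_index back in as start_index for the following line keeps matches in order and stops a repeated chorus from snapping back to its first occurrence.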

Prerequisites

Installation

pip install musicalrecordingcaptioning

Or from source:

git clone https://github.com/athuler/MusicalRecordingCaptioning
cd MusicalRecordingCaptioning
python -m venv .venv && source .venv/bin/activate
pip install -e .

The installer automatically downloads the default Whisper model (small) so the first run works offline.

Configuration

Copy .env.example to .env and fill in your token:

GENIUS_ACCESS_TOKEN=your_token_here

Or pass it directly via --genius-token.
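
For illustration, token lookup could work roughly like this. This is a sketch under assumptions: the helper names load_env_file and resolve_token are hypothetical, and the precedence shown (command-line value, then environment variable, then .env file) is a guess, since the order is not documented here.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    env_file = Path(path)
    if not env_file.is_file():
        return values
    for line in env_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_token(cli_token=None, env_path=".env"):
    """Return the Genius token, preferring the explicit CLI value (assumed order)."""
    token = (
        cli_token
        or os.environ.get("GENIUS_ACCESS_TOKEN")
        or load_env_file(env_path).get("GENIUS_ACCESS_TOKEN")
    )
    if not token:
        raise SystemExit("No Genius token: set GENIUS_ACCESS_TOKEN or pass --genius-token")
    return token
```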

Usage

# Basic usage (token from .env, auto-detects language)
mrc recording.mp4 --url https://genius.com/albums/Lin-manuel-miranda/Hamilton

# Specify language explicitly for faster, more accurate transcription
mrc recording.mp4 --url https://genius.com/albums/Lin-manuel-miranda/Hamilton --language en

# French show
mrc spectacle.mp4 --url https://genius.com/albums/... --language fr

# Specify output file and a larger model for better accuracy
mrc show.mp4 --url https://genius.com/albums/Claude-michel-schonberg/Les-miserables-original-broadway-cast-recording \
    --output les-mis.srt --model medium --language en

# Also caption dialogue and non-song audio with the raw Whisper transcript
mrc recording.mp4 --url https://genius.com/albums/Lin-manuel-miranda/Hamilton --keep-transcription

# Pass token explicitly
mrc concert.mp3 --url https://genius.com/albums/Anais-mitchell/Hadestown \
    --genius-token abc123

Options

| Option | Default | Description |
| --- | --- | --- |
| --url | required | Genius album URL (https://genius.com/albums/...) |
| --genius-token | from .env | Genius API access token |
| --output | <input>.srt | Output SRT file path |
| --model | small | Whisper model: tiny, base, small, medium, large |
| --language | auto-detect | Audio language as ISO 639-1 code (e.g. en, fr, de) |
| --keep-transcription | off | Fill gaps between songs with raw Whisper transcript captions |
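
To make the defaults concrete, the options above map onto an argparse parser along these lines. This is a sketch, not the package's actual entry point: build_parser and parse_args are hypothetical names, and reading <input>.srt as "the input path with its extension replaced by .srt" is an assumption.

```python
import argparse
from pathlib import Path

MODELS = ["tiny", "base", "small", "medium", "large"]


def build_parser():
    parser = argparse.ArgumentParser(
        prog="mrc",
        description="Caption a recording using Genius lyrics and Whisper",
    )
    parser.add_argument("input", type=Path, help="Audio or video file to caption")
    parser.add_argument("--url", required=True, help="Genius album URL")
    parser.add_argument("--genius-token", default=None, help="Genius API access token")
    parser.add_argument("--output", type=Path, default=None, help="Output SRT file path")
    parser.add_argument("--model", choices=MODELS, default="small", help="Whisper model")
    parser.add_argument("--language", default=None, help="ISO 639-1 code; omit to auto-detect")
    parser.add_argument("--keep-transcription", action="store_true",
                        help="Fill gaps between songs with raw transcript captions")
    return parser


def parse_args(argv=None):
    args = build_parser().parse_args(argv)
    if args.output is None:
        # Assumption: recording.mp4 -> recording.srt
        args.output = args.input.with_suffix(".srt")
    return args
```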

Caption format

Lyric lines are wrapped with ♪ symbols (♪ lyrics here ♪), matching standard subtitle convention for sung content. When --keep-transcription is enabled, dialogue and non-song audio appear as plain text captions.
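
As a rough illustration of the output, the sketch below renders cues as SRT text, wrapping lyric cues in ♪ and leaving transcript cues plain. The cue tuple layout and the function names srt_timestamp and to_srt are assumptions made for the example.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm form used by SRT."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start, end, text, is_lyric) tuples as SRT text."""
    blocks = []
    for number, (start, end, text, is_lyric) in enumerate(cues, start=1):
        body = f"♪ {text} ♪" if is_lyric else text
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{body}\n"
        )
    # Each block already ends in a newline, so joining with one more
    # leaves the blank line SRT expects between cues.
    return "\n".join(blocks)
```

Calling to_srt([(0.4, 2.0, "Not throwing away my shot", True)]) yields a single cue numbered 1, timed 00:00:00,400 --> 00:00:02,000, with the text wrapped in ♪.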

Whisper models

| Model | Size | Notes |
| --- | --- | --- |
| tiny | ~75 MB | Fastest, least accurate |
| base | ~145 MB | Good for clear speech |
| small | ~466 MB | Default. Handles singing and background music well |
| medium | ~1.5 GB | High accuracy, slower |
| large | ~3.1 GB | Best accuracy, slow |

Only the small model is downloaded at install time. Other models are downloaded on first use when specified via --model.
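
For reference, getting word-level timestamps out of faster-whisper looks roughly like the sketch below. It is not the package's own code: transcribe_words and flatten_words are hypothetical helpers, and running on CPU with int8 quantization is an assumption chosen for portability. The faster-whisper import sits inside the function so that flatten_words can be used without the library installed.

```python
def flatten_words(segments):
    """Collect (word, start, end) tuples from transcription segments."""
    words = []
    for segment in segments:
        # segment.words is None unless word timestamps were requested.
        for word in segment.words or []:
            words.append((word.word.strip(), word.start, word.end))
    return words


def transcribe_words(audio_path, model_name="small", language=None):
    """Transcribe locally and return word-level timestamps."""
    from faster_whisper import WhisperModel  # third-party: pip install faster-whisper

    # Assumption: CPU with int8 quantization; a GPU setup would differ.
    model = WhisperModel(model_name, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(
        audio_path,
        language=language,      # None lets Whisper detect the language
        word_timestamps=True,
    )
    return flatten_words(segments)
```

Because the model is fetched the first time a given name is used, switching to medium or large needs network access on that first run.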

Download files

Source Distribution

musicalrecordingcaptioning-0.1.1.tar.gz (9.3 kB)

Uploaded Source

Built Distribution

musicalrecordingcaptioning-0.1.1-py3-none-any.whl (9.6 kB)

Uploaded Python 3

File details

Details for the file musicalrecordingcaptioning-0.1.1.tar.gz.

File hashes

Hashes for musicalrecordingcaptioning-0.1.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ebae70ebd846d5f4bef532ea7da43c5e7cf1668a517f487d7e02bf36c7310e62 |
| MD5 | 2cde654aaa2f630c1312322b91f6a625 |
| BLAKE2b-256 | 0f775c38adc78a58ad4578b90bff807ebeca0a59d06e1816f2963013daa6f596 |

Provenance

The following attestation bundles were made for musicalrecordingcaptioning-0.1.1.tar.gz:

Publisher: python-publish.yml on athuler/MusicalRecordingCaptioning

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file musicalrecordingcaptioning-0.1.1-py3-none-any.whl.

File hashes

Hashes for musicalrecordingcaptioning-0.1.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 8b40ff9cd1c2d7b3cf597e6fe2616de0eb4c31ccd9fbd3656904b7233d275555 |
| MD5 | 3853a462f127038db84049620d10f423 |
| BLAKE2b-256 | 6119e4bc0b8141eab7b39318f45eb7ea8a7e4864bb4513b8191f3b73754b1681 |

Provenance

The following attestation bundles were made for musicalrecordingcaptioning-0.1.1-py3-none-any.whl:

Publisher: python-publish.yml on athuler/MusicalRecordingCaptioning
