Generate karaoke videos with synchronized lyrics. Handles the entire process from downloading audio and lyrics to creating the final video with title screens.

Karaoke Generator 🎶 🎥 🚀

Generate professional karaoke videos with instrumental audio and synchronized lyrics. Available as a local CLI (karaoke-gen) or cloud-based CLI (karaoke-gen-remote) that offloads processing to Google Cloud.

✨ Two Ways to Generate Karaoke

1. Local CLI (`karaoke-gen`)

Run all processing locally on your machine. A GPU is not strictly required, but it is recommended for audio separation performance (see Requirements).

karaoke-gen "ABBA" "Waterloo"

2. Remote CLI (`karaoke-gen-remote`)

Offload all processing to a cloud backend. No GPU required - just authenticate and submit jobs.

karaoke-gen-remote ./song.flac "ABBA" "Waterloo"

Both CLIs produce identical outputs: 4K karaoke videos, CDG+MP3 packages, audio stems, and more.

🎯 Features

Core Pipeline

Audio Separation: AI-powered vocal/instrumental separation using MDX and Demucs models
Lyrics Transcription: Word-level timestamps via AudioShake API
Lyrics Correction: Match transcription against online lyrics (Genius, Spotify, Musixmatch)
Human Review: Interactive UI for correcting lyrics before final render
Video Rendering: High-quality 4K karaoke videos with customizable styles
Multiple Outputs: MP4 (4K lossless/lossy, 720p), MKV, CDG+MP3, TXT+MP3

Distribution Features

YouTube Upload: Automatic upload to your YouTube channel
Dropbox Integration: Organize output in brand-coded folders
Google Drive: Upload to public share folders
Discord Notifications: Webhook notifications on completion

📦 Installation

pip install karaoke-gen

This installs both karaoke-gen (local) and karaoke-gen-remote (cloud) CLIs.

Requirements

Python 3.10-3.13
FFmpeg
For local processing: CUDA-capable GPU or Apple Silicon CPU recommended

Transcription Provider Setup

Transcription is required for creating karaoke videos with synchronized lyrics. The system needs word-level timing data to display lyrics in sync with the music.

Option 1: AudioShake (Recommended)

Commercial service with high-quality transcription. Best for production use.

export AUDIOSHAKE_API_TOKEN="your_audioshake_token"

Get an API key at https://www.audioshake.ai/ (available to business customers only at the time of writing).

Option 2: Local Whisper (No Cloud Required)

Run Whisper directly on your local machine using whisper-timestamped. Works on CPU, NVIDIA GPU (CUDA), or Apple Silicon.

# Install with local Whisper support
pip install "karaoke-gen[local-whisper]"

# Optional: Configure model size (tiny, base, small, medium, large)
export WHISPER_MODEL_SIZE="medium"

# Optional: Force specific device (cpu, cuda, mps)
export WHISPER_DEVICE="cpu"

Model Size Guide:

Model	VRAM	Speed	Quality
tiny	~1GB	Fast	Lower
base	~1GB	Fast	Basic
small	~2GB	Medium	Good
medium	~5GB	Slower	Better
large	~10GB	Slowest	Best
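As a rough illustration of how the table above can drive the WHISPER_MODEL_SIZE setting, the sketch below picks the largest model whose approximate VRAM figure fits a given budget. The function name and the fallback to "tiny" are assumptions made for this example; they are not part of karaoke-gen.

```python
# Approximate VRAM needs in GB, copied from the Model Size Guide table.
WHISPER_MODEL_VRAM_GB = {
    "tiny": 1,
    "base": 1,
    "small": 2,
    "medium": 5,
    "large": 10,
}


def pick_whisper_model(available_vram_gb: float) -> str:
    """Return the largest model whose approximate VRAM need fits the budget.

    Hypothetical policy: fall back to "tiny" when nothing fits; in that
    case forcing WHISPER_DEVICE="cpu" may be the better choice.
    """
    best = "tiny"
    # Dicts keep insertion order, so this walks from smallest to largest.
    for name, needed_gb in WHISPER_MODEL_VRAM_GB.items():
        if needed_gb <= available_vram_gb:
            best = name
    return best


if __name__ == "__main__":
    # Example: a GPU with 6 GB of free VRAM.
    print(pick_whisper_model(6))
```

The returned name can then be exported as WHISPER_MODEL_SIZE before running karaoke-gen.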

CPU-Only Installation (no GPU required):

# Pre-install CPU-only PyTorch first
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install "karaoke-gen[local-whisper]"

Local Whisper runs automatically as a fallback when no cloud transcription services are configured.

Option 3: Whisper via RunPod

Cloud-based alternative using OpenAI's Whisper model on RunPod infrastructure.

export RUNPOD_API_KEY="your_runpod_key"
export WHISPER_RUNPOD_ID="your_whisper_endpoint_id"

Set up a Whisper endpoint at https://www.runpod.io/
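The three options above are all selected through environment variables, so a small pre-flight check can show which provider appears to be configured. This is only a sketch: the precedence order (AudioShake, then RunPod, then local Whisper) is an assumption, and it does not verify that the local-whisper extra is actually installed. The variable names are the ones used in this README.

```python
import os
from typing import Mapping, Optional


def resolve_transcription_provider(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Guess which provider would be used, based only on environment variables.

    The precedence below is an assumption for illustration, not documented
    karaoke-gen behaviour.
    """
    if env.get("AUDIOSHAKE_API_TOKEN"):
        return "audioshake"
    # RunPod needs both the API key and the endpoint ID.
    if env.get("RUNPOD_API_KEY") and env.get("WHISPER_RUNPOD_ID"):
        return "runpod"
    # Local Whisper acts as the fallback unless explicitly disabled.
    if env.get("ENABLE_LOCAL_WHISPER", "true").lower() != "false":
        return "local-whisper"
    return None


if __name__ == "__main__":
    provider = resolve_transcription_provider()
    print(provider or "no transcription provider configured")
```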

Without Transcription (Instrumental Only)

If you don't need synchronized lyrics, use the --skip-lyrics flag:

karaoke-gen --skip-lyrics "Artist" "Title"

This creates an instrumental-only karaoke video without lyrics overlay.

Note: See lyrics_transcriber_temp/README.md for detailed transcription provider configuration options.

🖥️ Local CLI (`karaoke-gen`)

Basic Usage

# Generate from local audio file
karaoke-gen ./song.mp3 "Artist Name" "Song Title"

# Search and download audio automatically
karaoke-gen "Rick Astley" "Never Gonna Give You Up"

# Process from YouTube URL
karaoke-gen "https://www.youtube.com/watch?v=dQw4w9WgXcQ" "Rick Astley" "Never Gonna Give You Up"

Remote Audio Separation (Optional)

Offload just the GPU-intensive audio separation to Modal.com while keeping other processing local:

export AUDIO_SEPARATOR_API_URL="https://USERNAME--audio-separator-api.modal.run"
karaoke-gen "Artist" "Title"

Key Options

# Custom styling
karaoke-gen --style_params_json="./styles.json" "Artist" "Title"

# Generate CDG and TXT packages
karaoke-gen --enable_cdg --enable_txt "Artist" "Title"

# Skip video encoding (CDG/TXT only, faster)
karaoke-gen --no-video --enable_cdg "Artist" "Title"

# YouTube upload
karaoke-gen --enable_youtube_upload --youtube_description_file="./desc.txt" "Artist" "Title"

# Full production run
karaoke-gen \
  --style_params_json="./branding.json" \
  --enable_cdg \
  --enable_txt \
  --brand_prefix="BRAND" \
  --enable_youtube_upload \
  --youtube_description_file="./description.txt" \
  "Artist" "Title"
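For batch runs, the same flags can be assembled programmatically. The sketch below builds argument lists from the options shown above and prints them instead of executing anything; the helper name and the song list are illustrative assumptions.

```python
import shlex
from typing import List, Optional


def build_karaoke_gen_command(
    artist: str,
    title: str,
    style_params_json: Optional[str] = None,
    brand_prefix: Optional[str] = None,
    enable_cdg: bool = False,
    enable_txt: bool = False,
) -> List[str]:
    """Assemble a karaoke-gen argument list using flags from this README."""
    cmd = ["karaoke-gen"]
    if style_params_json:
        cmd.append(f"--style_params_json={style_params_json}")
    if enable_cdg:
        cmd.append("--enable_cdg")
    if enable_txt:
        cmd.append("--enable_txt")
    if brand_prefix:
        cmd.append(f"--brand_prefix={brand_prefix}")
    # Positional arguments come last: artist, then title.
    cmd += [artist, title]
    return cmd


if __name__ == "__main__":
    songs = [("ABBA", "Waterloo"), ("Rick Astley", "Never Gonna Give You Up")]
    for artist, title in songs:
        cmd = build_karaoke_gen_command(artist, title, enable_cdg=True, brand_prefix="BRAND")
        # Printed only; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(cmd))
```

Passing a list to subprocess rather than a single string avoids shell quoting problems with artist names and titles that contain spaces or quotes.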

Full Options Reference

karaoke-gen --help

☁️ Remote CLI (`karaoke-gen-remote`)

The remote CLI submits jobs to a Google Cloud backend that handles all processing. You don't need a GPU or any audio processing libraries installed locally.

Setup

Set the backend URL:

export KARAOKE_GEN_URL="https://api.nomadkaraoke.com"  # Or your own backend

Authenticate with Google Cloud:
```
gcloud auth login
```

Basic Usage

# Submit a job
karaoke-gen-remote ./song.flac "ABBA" "Waterloo"

# The CLI will:
# 1. Upload your audio file
# 2. Monitor processing progress
# 3. Open lyrics review UI when ready
# 4. Prompt for instrumental selection
# 5. Download all outputs when complete

Job Management

# List all jobs
karaoke-gen-remote --list

# Resume monitoring an existing job
karaoke-gen-remote --resume abc12345

# Cancel a running job
karaoke-gen-remote --cancel abc12345

# Delete a job and its files
karaoke-gen-remote --delete abc12345

Full Production Run

karaoke-gen-remote \
  --style_params_json="./karaoke-styles.json" \
  --enable_cdg \
  --enable_txt \
  --brand_prefix=NOMAD \
  --enable_youtube_upload \
  --youtube_description_file="./youtube-description.txt" \
  ./song.flac "Artist" "Title"

Environment Variables

Variable	Description	Default
`KARAOKE_GEN_URL`	Backend service URL	Required
`KARAOKE_GEN_AUTH_TOKEN`	Admin auth token (for protected endpoints)	Optional
`REVIEW_UI_URL`	Lyrics review UI URL	`https://gen.nomadkaraoke.com/lyrics/`
`POLL_INTERVAL`	Seconds between status polls	`5`

Note: The REVIEW_UI_URL defaults to the hosted lyrics review UI. For local development, set it to http://localhost:5173 if you're running the frontend dev server.
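To make the defaults in the table concrete, here is a minimal sketch of reading these variables in a wrapper script. The RemoteConfig class and load_remote_config function are hypothetical names for this example; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RemoteConfig:
    backend_url: str
    auth_token: Optional[str]
    review_ui_url: str
    poll_interval: int


def load_remote_config(env: Mapping[str, str] = os.environ) -> RemoteConfig:
    """Read the remote CLI settings, applying the documented defaults."""
    backend_url = env.get("KARAOKE_GEN_URL", "").rstrip("/")
    if not backend_url:
        # KARAOKE_GEN_URL is the only variable marked as required.
        raise ValueError("KARAOKE_GEN_URL is required")
    return RemoteConfig(
        backend_url=backend_url,
        auth_token=env.get("KARAOKE_GEN_AUTH_TOKEN") or None,
        review_ui_url=env.get("REVIEW_UI_URL", "https://gen.nomadkaraoke.com/lyrics/"),
        poll_interval=int(env.get("POLL_INTERVAL", "5")),
    )
```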

Authentication

The backend uses token-based authentication for admin operations (bulk delete, internal worker triggers). For basic job submission and monitoring, authentication is optional.

For admin access:

export KARAOKE_GEN_AUTH_TOKEN="your-admin-token"

The token must match one of the tokens configured in the backend's ADMIN_TOKENS environment variable.

Non-Interactive Mode

For automated/CI usage:

karaoke-gen-remote -y ./song.flac "Artist" "Title"

The -y flag auto-accepts the default corrections and selects the clean instrumental.

🎨 Style Configuration

Create a styles.json file to customize the karaoke video appearance:

{
  "intro": {
    "video_duration": 5,
    "background_image": "/path/to/title-background.png",
    "font": "/path/to/Font.ttf",
    "artist_color": "#ffdf6b",
    "title_color": "#ffffff"
  },
  "karaoke": {
    "background_image": "/path/to/karaoke-background.png",
    "font_path": "/path/to/Font.ttf"
  },
  "end": {
    "background_image": "/path/to/end-background.png"
  },
  "cdg": {
    "font_path": "/path/to/Font.ttf",
    "instrumental_background": "/path/to/cdg-background.png"
  }
}

When using karaoke-gen-remote, all referenced files are automatically uploaded with your job.
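Because the style file refers to fonts and images by path, a missing file is an easy mistake to make. The following sketch checks a styles.json before a run; it assumes the path-valued keys are exactly the ones that appear in the example above, and the function name is made up for illustration.

```python
import json
from pathlib import Path
from typing import List

# Keys from the example above whose values are file paths (an assumption:
# a real style file may use further path-valued keys).
PATH_KEYS = {"background_image", "font", "font_path", "instrumental_background"}


def missing_style_files(styles_path: str) -> List[str]:
    """Return 'section.key -> path' entries whose referenced file does not exist."""
    styles = json.loads(Path(styles_path).read_text(encoding="utf-8"))
    missing = []
    for section, params in styles.items():
        if not isinstance(params, dict):
            continue
        for key, value in params.items():
            if key in PATH_KEYS and isinstance(value, str) and not Path(value).is_file():
                missing.append(f"{section}.{key} -> {value}")
    return missing
```

Calling missing_style_files("./styles.json") returns an empty list when every referenced file is present.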

📤 Output Files

A completed job produces:

BRAND-1234 - Artist - Title/
├── Artist - Title (Final Karaoke Lossless 4k).mp4    # ProRes 4K
├── Artist - Title (Final Karaoke Lossless 4k).mkv    # FLAC audio 4K
├── Artist - Title (Final Karaoke Lossy 4k).mp4       # H.264 4K
├── Artist - Title (Final Karaoke Lossy 720p).mp4     # H.264 720p
├── Artist - Title (Final Karaoke CDG).zip            # CDG+MP3 package
├── Artist - Title (Final Karaoke TXT).zip            # TXT+MP3 package
├── Artist - Title (Karaoke).cdg                      # Individual CDG
├── Artist - Title (Karaoke).mp3                      # Karaoke audio
├── Artist - Title (Karaoke).lrc                      # LRC lyrics
├── Artist - Title (Karaoke).ass                      # ASS subtitles
├── Artist - Title (Title).mov                        # Title screen video
├── Artist - Title (End).mov                          # End screen video
├── Artist - Title (Instrumental...).flac             # Clean instrumental
├── Artist - Title (Instrumental +BV...).flac         # With backing vocals
└── stems/                                            # All audio stems
    ├── ...Vocals....flac
    ├── ...Bass....flac
    ├── ...Drums....flac
    └── ...
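A quick way to confirm that a job finished cleanly is to compare the output folder against the listing above. The sketch below checks only the four video files; the CDG and TXT packages are left out on the assumption that they depend on --enable_cdg and --enable_txt. The helper name is hypothetical.

```python
from pathlib import Path
from typing import List

# Filename endings taken from the listing above.
EXPECTED_VIDEO_SUFFIXES = [
    "(Final Karaoke Lossless 4k).mp4",
    "(Final Karaoke Lossless 4k).mkv",
    "(Final Karaoke Lossy 4k).mp4",
    "(Final Karaoke Lossy 720p).mp4",
]


def missing_outputs(job_dir: str, expected: List[str] = EXPECTED_VIDEO_SUFFIXES) -> List[str]:
    """Return the expected filename endings that no file in job_dir matches."""
    names = [p.name for p in Path(job_dir).iterdir() if p.is_file()]
    return [suffix for suffix in expected if not any(name.endswith(suffix) for name in names)]
```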

🏗️ Deploy Your Own Backend

The cloud backend runs on Google Cloud Platform using:

Cloud Run: Serverless API hosting
Firestore: Job state management
Cloud Storage: File uploads and outputs
Modal.com: GPU-accelerated audio separation
AudioShake: Lyrics transcription API

Prerequisites

Google Cloud account with billing enabled
Pulumi CLI
Modal.com account (for audio separation)
AudioShake API key

Infrastructure Setup

cd infrastructure

# Install dependencies
pip install -r requirements.txt

# Login to Pulumi
pulumi login

# Create a stack
pulumi stack init prod

# Configure GCP project
pulumi config set gcp:project your-project-id
pulumi config set gcp:region us-central1

# Deploy infrastructure
pulumi up

This creates:

Firestore database
Cloud Storage bucket
Artifact Registry
Service account with IAM roles
Secret Manager secrets (you add values)

Add Secret Values

# AudioShake API key
echo -n "your-audioshake-key" | gcloud secrets versions add audioshake-api-key --data-file=-

# Genius API key
echo -n "your-genius-key" | gcloud secrets versions add genius-api-key --data-file=-

# Modal API URL
echo -n "https://your-modal-url" | gcloud secrets versions add audio-separator-api-url --data-file=-

# YouTube OAuth credentials (JSON)
gcloud secrets versions add youtube-oauth-credentials --data-file=./youtube-creds.json

# Dropbox OAuth credentials (JSON)
gcloud secrets versions add dropbox-oauth-credentials --data-file=./dropbox-creds.json

# Google Drive service account (JSON)
gcloud secrets versions add gdrive-service-account --data-file=./gdrive-sa.json

Deploy Cloud Run

Deployments happen automatically via GitHub Actions CI when pushing to main. See .github/workflows/ci.yml for the full deployment workflow.

Point CLI to Your Backend

export KARAOKE_GEN_URL="https://your-backend.run.app"
karaoke-gen-remote ./song.flac "Artist" "Title"

🔌 Backend API Reference

The backend exposes a REST API for job management.

Job Submission

POST /api/jobs/upload

Submit a new karaoke generation job with audio file and options.

curl -X POST "https://api.example.com/api/jobs/upload" \
  -F "file=@song.flac" \
  -F "artist=ABBA" \
  -F "title=Waterloo" \
  -F "enable_cdg=true" \
  -F "enable_txt=true" \
  -F "brand_prefix=NOMAD" \
  -F "style_params=@styles.json" \
  -F "style_karaoke_background=@background.png"

Job Status

GET /api/jobs/{job_id}

Get job status and details.

curl "https://api.example.com/api/jobs/abc12345"
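A client that polls this endpoint might be structured as in the sketch below. It assumes the response is a JSON object with a "status" field, which this README does not confirm, and it leaves the set of terminal states to the caller because the full list of status values is not documented here. The fetch function is injected so that the loop itself stays independent of any HTTP library.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_job(
    fetch_status: Callable[[], Dict],
    done_states: Iterable[str],
    poll_interval: float = 5.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status until the job's "status" is one of done_states.

    fetch_status is expected to perform GET /api/jobs/{job_id} and return
    the decoded JSON body; the "status" key is an assumption.
    """
    done = set(done_states)
    for _ in range(max_polls):
        job = fetch_status()
        if job.get("status") in done:
            return job
        sleep(poll_interval)
    raise TimeoutError(f"job not finished after {max_polls} polls")
```

The default interval of 5 seconds mirrors the POLL_INTERVAL default used by karaoke-gen-remote.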

List Jobs

GET /api/jobs

List all jobs with optional status filter.

curl "https://api.example.com/api/jobs?status=complete&limit=10"

Cancel Job

POST /api/jobs/{job_id}/cancel

Cancel a running job.

curl -X POST "https://api.example.com/api/jobs/abc12345/cancel" \
  -H "Content-Type: application/json" \
  -d '{"reason": "User cancelled"}'

Delete Job

DELETE /api/jobs/{job_id}

Delete a job and its files.

curl -X DELETE "https://api.example.com/api/jobs/abc12345?delete_files=true"

Lyrics Review

GET /api/review/{job_id}/correction-data

Get correction data for lyrics review.

POST /api/review/{job_id}/complete

Submit corrected lyrics and trigger video rendering.

Instrumental Selection

GET /api/jobs/{job_id}/instrumental-options

Get available instrumental options.

POST /api/jobs/{job_id}/select-instrumental

Submit instrumental selection (clean or with_backing).

curl -X POST "https://api.example.com/api/jobs/abc12345/select-instrumental" \
  -H "Content-Type: application/json" \
  -d '{"selection": "clean"}'

Download Files

GET /api/jobs/{job_id}/download-urls

Get download URLs for all output files.

GET /api/jobs/{job_id}/download/{category}/{file_key}

Stream download a specific file.

Health Check

GET /api/health

Check backend health status.
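The curl examples above translate directly to Python's standard library. The sketch below only builds request objects for a few of the listed endpoints and does not send them; the KaraokeApi class is a hypothetical helper, and no authentication header is added because this README does not specify how the token is transmitted.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


class KaraokeApi:
    """Builds (but does not send) requests for endpoints listed above."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    def _json_post(self, path: str, payload: dict) -> urllib.request.Request:
        return urllib.request.Request(
            f"{self.base_url}{path}",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    def job_status(self, job_id: str) -> urllib.request.Request:
        return urllib.request.Request(f"{self.base_url}/api/jobs/{job_id}", method="GET")

    def list_jobs(self, status: Optional[str] = None, limit: Optional[int] = None) -> urllib.request.Request:
        # Only include the query parameters that were actually given.
        params = {k: v for k, v in {"status": status, "limit": limit}.items() if v is not None}
        query = f"?{urllib.parse.urlencode(params)}" if params else ""
        return urllib.request.Request(f"{self.base_url}/api/jobs{query}", method="GET")

    def cancel_job(self, job_id: str, reason: str) -> urllib.request.Request:
        return self._json_post(f"/api/jobs/{job_id}/cancel", {"reason": reason})

    def select_instrumental(self, job_id: str, selection: str) -> urllib.request.Request:
        if selection not in ("clean", "with_backing"):
            raise ValueError("selection must be 'clean' or 'with_backing'")
        return self._json_post(f"/api/jobs/{job_id}/select-instrumental", {"selection": selection})
```

A request built this way can be sent with urllib.request.urlopen; decoding the response as JSON is an assumption about the backend's reply format.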

🔧 Troubleshooting

"No suitable files found for processing"

This error occurs during the finalisation step when the (With Vocals).mkv file is missing. This file is created during lyrics transcription.

Most common cause: No transcription provider configured.

Quick fix:

Check if transcription providers are configured:

echo $AUDIOSHAKE_API_TOKEN
echo $RUNPOD_API_KEY

If both are empty, set up a provider (see Transcription Provider Setup)
Or use --skip-lyrics for instrumental-only karaoke:
```
karaoke-gen --skip-lyrics "Artist" "Title"
```

Other causes:

Invalid API credentials - verify your tokens are correct and active
API service unavailable - check service status pages
Network connectivity issues - ensure you can reach the API endpoints
Transcription timeout - try again or use a different provider

Transcription Fails Silently

If karaoke-gen runs without errors but produces no synchronized lyrics:

Check logs - Run with --log_level debug for detailed output:
```
karaoke-gen --log_level debug "Artist" "Title"
```

Verify environment variables - Ensure API tokens are exported in your shell:

# Check if set
printenv | grep -E "(AUDIOSHAKE|RUNPOD|WHISPER)"

# Set in current session
export AUDIOSHAKE_API_TOKEN="your_token"

Test API connectivity - Verify you can reach the transcription service

"No lyrics found from any source"

This warning means no reference lyrics were fetched from online sources (Genius, Spotify, Musixmatch). The transcription will still work, but auto-correction may be less accurate.

To fix:

Set GENIUS_API_TOKEN for Genius lyrics
Set SPOTIFY_COOKIE_SP_DC for Spotify lyrics
Set RAPIDAPI_KEY for Musixmatch lyrics
Or provide lyrics manually with --lyrics_file /path/to/lyrics.txt

Video Quality Issues

If the output video has quality problems:

Ensure FFmpeg is properly installed: ffmpeg -version
Check available codecs: ffmpeg -codecs
For 4K output, ensure sufficient disk space (10GB+ per track)
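The first and last checks above can be automated before a long render. This sketch looks for ffmpeg on PATH and compares free disk space against the 10 GB guideline; the function name is hypothetical, and the lookup functions are parameters only so that the check can be exercised without touching the real system.

```python
import shutil
from typing import Callable, List


def video_preflight(
    output_dir: str,
    min_free_gb: float = 10.0,
    which: Callable = shutil.which,
    disk_usage: Callable = shutil.disk_usage,
) -> List[str]:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # disk_usage returns an object with total, used and free, in bytes.
    free_gb = disk_usage(output_dir).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(f"only {free_gb:.1f} GB free in {output_dir}, want {min_free_gb:.0f}+ GB")
    return problems


if __name__ == "__main__":
    for problem in video_preflight("."):
        print("WARNING:", problem)
```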

Local Whisper Issues

GPU Out of Memory

If you get CUDA out of memory errors:

# Use a smaller model
export WHISPER_MODEL_SIZE="small"  # or "tiny"

# Or force CPU mode
export WHISPER_DEVICE="cpu"

Slow Transcription on CPU

CPU transcription is significantly slower than GPU. For faster processing:

Use a smaller model (tiny or base)
Consider using cloud transcription (AudioShake or RunPod)
On Apple Silicon, the small model offers good speed/quality balance

Model Download Issues

Whisper models are downloaded on first use (~1-3GB depending on size). If downloads fail:

Check your internet connection
Set a custom cache directory: export WHISPER_CACHE_DIR="/path/with/space"
Models are cached in ~/.cache/whisper/ by default

whisper-timestamped Not Found

If you get "whisper-timestamped is not installed":

pip install "karaoke-gen[local-whisper]"
# Or install directly:
pip install whisper-timestamped

Disabling Local Whisper

If you want to disable local Whisper (e.g., to force cloud transcription):

export ENABLE_LOCAL_WHISPER="false"

🧪 Development

Running Tests

# Run all tests
pytest tests/ backend/tests/ -v

# Run only unit tests
pytest tests/unit/ -v

# Run with coverage
pytest tests/unit/ -v --cov=karaoke_gen --cov-report=term-missing

Project Structure

karaoke-gen/
├── karaoke_gen/           # Core CLI package
│   ├── utils/
│   │   ├── gen_cli.py     # Local CLI (karaoke-gen)
│   │   └── remote_cli.py  # Remote CLI (karaoke-gen-remote)
│   ├── karaoke_finalise/  # Video encoding, packaging, distribution
│   └── style_loader.py    # Unified style configuration
├── backend/               # Cloud backend (FastAPI)
│   ├── api/routes/        # API endpoints
│   ├── workers/           # Background processing workers
│   └── services/          # Business logic services
├── infrastructure/        # Pulumi IaC for GCP
├── docs/                  # Documentation
└── tests/                 # Test suite

📄 License

MIT

🤝 Contributing

Contributions are welcome! Please see our contributing guidelines.

karaoke-gen 0.115.5
