Dub any YouTube video into Hindi (and 19 other languages) with a casual, engaging neural voice.

YT Dubber

Want to try it or contribute? Fork this repo; do not clone it directly.

🔒 This project is GPL-3.0 licensed. You must credit the original author and keep your modifications open source. Forking for private/commercial use without releasing your source code is not permitted. See LICENSE.

[Image: YT Dubber pipeline]

Desktop app that dubs YouTube videos into Hindi (and 19 other languages) with a casual, engaging voice: like an Indian tech YouTuber, not a flat textbook narrator.

Paste a YouTube URL, and the app plays the video while generating and playing a synced dub on top. It pulls the video's captions, translates them to natural spoken Hinglish, synthesizes neural speech, and plays everything in sync, with a live transcript and on-screen subtitles.

A second Live Dub mode dubs any system audio in real time (browser, media player, calls) by capturing the audio device directly.


What it looks like

[Image: app UI preview]

The video plays in a separate mpv window (full quality). The Electron window is the control panel: subtitles, progress, and volume sliders. Both stay in sync via mpv's IPC socket.


How It Works

[Diagram: how it works]


Two modes at a glance

| | Video URL | Live Dub |
|---|---|---|
| Works on | Linux, macOS, Windows | Linux only |
| Source | YouTube captions (any language) | Live mic / system audio |
| Lag | ~15s first run, instant on re-run | Always ~3–5s |
| Cache | ✅ Saves MP3 + JSON per segment | ❌ Real-time only |
| Best for | YouTube courses, tutorials | Streams, calls, any video |

Source & Target Languages

  • Source: Any language. The app auto-detects it via the video's declared language and fetches its native captions. Pass --source-lang <code> (e.g. ar, zh, en) to force one.
  • Target: 20 languages. Hindi gets the full casual-creator Hinglish prompt. All other 19 get an energetic, conversational prompt in their own language.

Why mpv instead of a built-in video player

Electron's <video> element and Web Audio API crash with a SIGSEGV on Linux machines with Optimus (Intel + NVIDIA) graphics: the GPU driver kills Chromium's software renderer the moment it tries to decode video or create an AudioContext.

The fix: hand all media off to mpv (a native video player) and control it via its JSON IPC socket. The Electron window becomes a pure HTML/CSS control panel that never touches the GPU; it just shows subtitles, sliders, and status.
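To make the IPC idea concrete, here is a minimal Python sketch of talking to mpv's JSON IPC. It assumes mpv was started with `--input-ipc-server` pointing at a Unix socket; the helper names `encode_command` and `send_command` are illustrative and are not taken from the project's main.js, which does the equivalent from Node.

```python
import json
import socket


def encode_command(*args, request_id=0):
    """Serialize one mpv JSON IPC command as a newline-terminated UTF-8 line."""
    payload = {"command": list(args), "request_id": request_id}
    return (json.dumps(payload) + "\n").encode("utf-8")


def send_command(sock_path, *args, request_id=1):
    """Send one command over a Unix socket and return mpv's reply as a dict.

    POSIX only: on Windows mpv exposes a named pipe instead of a socket.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        sock.sendall(encode_command(*args, request_id=request_id))
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                message = json.loads(line)
                # Skip asynchronous events; a reply echoes our request_id.
                if message.get("request_id") == request_id and "error" in message:
                    return message
    raise ConnectionError("mpv closed the socket before replying")
```

With a running mpv instance, `send_command("/tmp/mpv-dub.sock", "set_property", "pause", True)` would pause playback; the socket path here is a hypothetical example.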


Requirements

  • OS: Linux (primary). Video URL mode also runs on macOS and Windows (the mpv IPC uses a named pipe on Windows, a socket elsewhere). Live Dub is Linux-only (it shells out to PulseAudio parec/pacat).
  • mpv: sudo apt install mpv (macOS: brew install mpv; Windows: add mpv.exe to PATH)
  • yt-dlp: pip install -U yt-dlp (or distro package)
  • ffmpeg: sudo apt install ffmpeg
  • Node.js 18+ and npm (for the Electron frontend)
  • Python 3.10+
  • A free Groq API key
  • For Live Dub only: pulseaudio-utils (parec/pacat)
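A quick way to confirm the command-line requirements above are on PATH before launching is a small check like the following. This is a sketch, not part of the project; the tuple names are illustrative.

```python
import shutil

# Tools named in the requirements list above.
REQUIRED_TOOLS = ("mpv", "yt-dlp", "ffmpeg")
LIVE_DUB_TOOLS = ("parec", "pacat")  # only needed for Live Dub on Linux


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` returns an empty list when everything needed for Video URL mode is installed.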

Setup

0. Fork & clone

# 1. Click "Fork" on GitHub first, then:
git clone https://github.com/<YOUR_USERNAME>/youtube-dubber.git
cd youtube-dubber
git remote add upstream https://github.com/Ashut90/youtube-dubber.git

Cloning the original repo directly means you can't contribute back. Fork first.

1. Backend (Python)

cd backend
pip3 install --break-system-packages -r requirements.txt
pip install -U yt-dlp        # must also be on PATH

Video URL mode only needs edge-tts + groq + yt-dlp. The extra numpy/silero-vad deps are for Live Dub mode.

2. Frontend (Electron)

cd frontend
npm install

3. Groq API key

export GROQ_API_KEY=gsk_xxxxxxxxxxxxxxxxxxxx

Get a free key at console.groq.com/keys. Export it in the same shell you launch the app from, so the spawned Python process inherits it.
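The reason the key must be exported in the launching shell is ordinary environment inheritance: a child process sees the parent's variables only if they were set before it was spawned. The sketch below demonstrates this with a throwaway child Python process; `child_sees_key` is a hypothetical helper, not project code.

```python
import os
import subprocess
import sys


def child_sees_key(env=None):
    """Spawn a child Python process and report whether it sees GROQ_API_KEY."""
    probe = "import os; print('GROQ_API_KEY' in os.environ)"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,  # None means: inherit the parent's environment unchanged
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

If `child_sees_key()` returns False in the shell you plan to launch from, the spawned backend will not see the key either.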


Running: Video URL mode (main feature)

cd frontend
export GROQ_API_KEY=gsk_xxxx
npm start

  1. Paste a YouTube URL and pick a language + voice.
  2. Click Dub It. An mpv window opens with the video (it starts paused while the first few seconds of dub are generated, then plays automatically).
  3. Watch the video in the mpv window; the dubbed audio + subtitles play through the app.

First run on a new video builds the cache and is gated by Groq's rate limit (~5s between batches), so a long video takes a while to fully process. Playback still begins as soon as the opening buffer is ready, and the buffer-sync keeps audio aligned. Subsequent runs on the same video are near-instant thanks to the disk cache.

Tip: don't seek far ahead during the first pass; the dub is generated sequentially from 0:00. After a full run, the cache covers the whole video and seeking anywhere works instantly. (The app also strips any &t= start-time from the URL and forces the video to start at 0:00 to stay aligned with generation.)
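Stripping the start time from a URL can be sketched with the standard library as shown below. This is not the app's actual code; treating both `t` and `start` as start-offset keys is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the query keys that can carry a start offset.
START_PARAMS = {"t", "start"}


def strip_start_time(url):
    """Return `url` with any start-time query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in START_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

All other query parameters, including the `v=` video id, are kept in their original order.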


Running: Live Dub mode (any system audio)

This dubs whatever is playing on your machine in real time.

1. Create a virtual audio sink

pactl load-module module-null-sink sink_name=DubCapture \
  sink_properties=device.description=DubCapture

2. Route the source audio to it

Open pavucontrol → Playback tab → set the browser/player output to DubCapture.

3. Start it

Use the Live Dub tab in the app, or run the backend directly:

cd backend
export GROQ_API_KEY=gsk_xxxx
python3 live_dub_v6.py --lang hindi --gender male

The 5-agent pipeline (capture → Groq Whisper STT → Groq translate → edge-tts → playback) prints the live transcript and plays the dub through your speakers within a few seconds of each phrase. Press Ctrl+C to stop.

A lag floor of ~3–5s is inherent to live mode: a full phrase must be heard before it can be transcribed and translated.
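The staged design can be illustrated with a generic queue-and-thread pipeline. This is a sketch of the pattern only, not the code in live_dub_v6.py: the three stage functions are placeholders standing in for speech-to-text, translation, and TTS, and all names are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel that tells each stage to shut down


def stage(fn, inbox, outbox):
    """Apply `fn` to every item from `inbox` and forward the result to `outbox`."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)
            return
        outbox.put(fn(item))


def run_pipeline(phrases, stages):
    """Push `phrases` through `stages` (one thread each) and collect results in order."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]), daemon=True)
        for i, fn in enumerate(stages)
    ]
    for thread in threads:
        thread.start()
    for phrase in phrases:
        queues[0].put(phrase)
    queues[0].put(STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is STOP:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results


def transcribe(audio):
    """Placeholder for speech-to-text."""
    return audio.strip()


def translate(text):
    """Placeholder for translation."""
    return f"[hi] {text}"


def synthesize(text):
    """Placeholder for TTS."""
    return f"<audio:{text}>"
```

Because each phrase has to pass through every stage in turn, its latency is at least the phrase length plus the sum of the stage times, which is where the lag floor comes from.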


Use as a Python library

The dubbing engine ships as an installable package: use it in your own code, scripts, or server with no GUI.

pip install youtube-dubber      # plus: yt-dlp + ffmpeg on PATH, and a Groq key
export GROQ_API_KEY=gsk_xxxx

from youtube_dubber import dub

# One call → dubbed MP3 clips + a manifest.json in ./out
manifest = dub(
    "https://www.youtube.com/watch?v=VIDEO_ID",
    lang="hindi",      # any of the 20 supported languages
    gender="female",   # "male" or "female"
    out="./out",
)
print(len(manifest["segments"]), "segments dubbed")

Want live progress? Pass an on_event callback:

from youtube_dubber import Dubber

def on_event(ev):
    if ev["type"] == "progress":
        print(ev["step"], ev["pct"], ev["msg"])
    elif ev["type"] == "segment":
        print("dubbed:", ev["dubbed"])

url = "https://www.youtube.com/watch?v=VIDEO_ID"
Dubber(lang="hindi", gender="male", out_dir="./out", on_event=on_event).run(url)

Or from the command line:

youtube-dubber --url https://youtu.be/VIDEO_ID --lang hindi --gender female --out ./out

Output: out/audio/<videoId>_<lang>_<gender>/seg_NNNNN.mp3 clips + out/manifest.json. The Electron desktop app is just one consumer of this same engine.
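The output layout above can be turned into paths with a small helper. This is a sketch: it assumes NNNNN is a zero-padded five-digit segment number (whether numbering starts at 0 or 1 is not stated here), and `segment_path` and `list_clips` are hypothetical names, not part of the package API.

```python
from pathlib import Path


def segment_path(out_dir, video_id, lang, gender, index):
    """Build a clip path following out/audio/<videoId>_<lang>_<gender>/seg_NNNNN.mp3."""
    folder = Path(out_dir) / "audio" / f"{video_id}_{lang}_{gender}"
    return folder / f"seg_{index:05d}.mp3"  # assumption: five-digit zero padding


def list_clips(out_dir, video_id, lang, gender):
    """Return the existing clips for one video/language/voice, sorted by file name."""
    folder = Path(out_dir) / "audio" / f"{video_id}_{lang}_{gender}"
    return sorted(folder.glob("seg_*.mp3"))
```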


Supported languages

Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, Malayalam, Punjabi, Urdu, Spanish, French, German, Japanese, Chinese, Korean, Arabic, Portuguese, Russian, Italian.

Each has a configured male/female edge-tts neural voice. Indian languages use a Hinglish/code-switch style (technical terms stay English); others translate fully. See backend/youtube_dubber/languages.py.
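A per-language voice lookup might look like the sketch below. The two-entry `EXAMPLE_LANGUAGES` dict is hypothetical; the real registry lives in backend/youtube_dubber/languages.py and may be structured differently. The voice names are existing edge-tts voices, but whether the project uses these particular ones is an assumption.

```python
# Hypothetical registry shaped around the voice_male / voice_female keys.
EXAMPLE_LANGUAGES = {
    "hindi": {"voice_male": "hi-IN-MadhurNeural", "voice_female": "hi-IN-SwaraNeural"},
    "spanish": {"voice_male": "es-ES-AlvaroNeural", "voice_female": "es-ES-ElviraNeural"},
}


def pick_voice(lang, gender, registry=EXAMPLE_LANGUAGES):
    """Return the edge-tts voice name for a language and gender."""
    key = lang.strip().lower()
    if key not in registry:
        raise ValueError(f"unsupported language: {lang!r}")
    if gender not in ("male", "female"):
        raise ValueError(f"gender must be 'male' or 'female', got {gender!r}")
    return registry[key][f"voice_{gender}"]
```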


Project layout

youtube-dubber/
├── pyproject.toml             pip package config (youtube-dubber)
├── setup.sh / setup.bat       one-command dependency installers
│
├── frontend/                  Electron desktop app
│   ├── main.js                main process: mpv control, dub queue, spawns Python
│   ├── preload.js             contextBridge IPC API
│   ├── package.json           electron + build config
│   └── renderer/
│       ├── index.html         control-panel UI
│       ├── app.js             buffer/sync logic, subtitle + playback scheduling
│       └── styles.css
│
└── backend/
    ├── youtube_dubber/        ★ the installable dubbing-engine package
    │   ├── __init__.py        public API: Dubber, dub, LANGUAGES
    │   ├── core.py            the engine (captions → translate → TTS → cache)
    │   ├── cli.py             `youtube-dubber` command + `python -m youtube_dubber`
    │   └── languages.py       20-language registry (voices, script, style)
    ├── dub_video.py           thin Electron adapter → calls the package
    ├── live_dub_v6.py         Live mode: 5-agent real-time pipeline
    ├── natural_tts.py         edge-tts wrapper used by Live Dub
    ├── vad.py                 Silero voice-activity detection (Live Dub)
    ├── languages.py           compatibility shim → youtube_dubber.languages
    └── requirements.txt

Configuration

Translation / TTS (backend/youtube_dubber/core.py):

| Knob | Where | Effect |
|---|---|---|
| BATCH_SIZE | top of file | Segments per Groq call (fewer API round-trips vs. faster first audio) |
| model | client.chat.completions.create(...) | llama-3.1-8b-instant (reliable) vs. llama-3.3-70b-versatile (more casual, stricter rate limit) |
| stretch() cap | stretch() | Max speed-up to fit a clip to its window (default 1.4×) |
| slang map | _SLANG dict | Formal → casual phrase replacements |
| TTS rate/pitch | _synth_one() | edge-tts rate/pitch per language |
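The effect of the stretch cap can be worked through numerically. The sketch below is not the project's stretch() function; it only shows the arithmetic of clamping a speed-up factor to the documented 1.4× default, with hypothetical helper names.

```python
MAX_STRETCH = 1.4  # the documented default cap


def speed_factor(clip_seconds, window_seconds, cap=MAX_STRETCH):
    """Speed-up needed to fit a clip into its window, clamped to [1.0, cap]."""
    if clip_seconds <= 0 or window_seconds <= 0:
        raise ValueError("durations must be positive")
    needed = clip_seconds / window_seconds
    return min(max(needed, 1.0), cap)


def fits_after_stretch(clip_seconds, window_seconds, cap=MAX_STRETCH):
    """True if the clip fits its window once sped up by at most `cap`."""
    stretched = clip_seconds / speed_factor(clip_seconds, window_seconds, cap)
    return stretched <= window_seconds + 1e-9
```

A 5 s clip in a 4 s window needs 1.25× and fits; an 8 s clip in the same window would need 2×, so it is capped at 1.4× and still overruns, which is the case where a clip finishes late or gets dropped.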

Playback sync (frontend/renderer/app.js):

| Knob | Effect |
|---|---|
| START_BUFFER | Seconds of dub ready before first play (default 6) |
| PAUSE_GAP / RESUME_GAP | When to pause/resume mpv as it approaches the generation frontier |
| STALE_DROP (in main.js) | Drop a dub clip if its segment ended this far behind the playhead |
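One plausible reading of the PAUSE_GAP / RESUME_GAP pair is a hysteresis band: pause when the playhead gets too close to the generation frontier, and resume only once a larger lead has built up. The sketch below illustrates that idea in Python; the real logic is JavaScript in app.js, and both the numeric values and the exact semantics here are assumptions.

```python
PAUSE_GAP = 1.0   # hypothetical value, in seconds
RESUME_GAP = 4.0  # hypothetical value, in seconds


def should_pause(paused, playhead, frontier, pause_gap=PAUSE_GAP, resume_gap=RESUME_GAP):
    """Return the new paused state given how far generation leads playback."""
    lead = frontier - playhead
    if not paused and lead < pause_gap:
        return True    # playback caught up with generation: hold the video
    if paused and lead >= resume_gap:
        return False   # enough dub is buffered again: let the video run
    return paused      # inside the band: keep the current state
```

Using two thresholds instead of one avoids rapid pause/resume flapping when the lead hovers around a single cut-off.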

Voices: backend/youtube_dubber/languages.py (voice_male / voice_female per language); backend/languages.py is a compatibility shim that points to it.


Known limitations & trade-offs

  • mpv plays in its own window, not embedded in the app. This is a deliberate workaround for the Chromium SIGSEGV on Optimus systems.
  • 8b translations are sometimes verbose, so an occasional long sentence finishes slightly late or gets dropped to stay in sync (never cut mid-word). The cure is shorter translations / a stronger prompt.
  • Hindi is the best-tuned target. The other 19 languages get a casual prompt too, but quality depends on the 8b model's fluency in that language.
  • Non-English source relies on the video having captions in its own language (or being detectable by Whisper). Auto-detection via the declared language is reliable but not infallible; use --source-lang to force it.
  • Live Dub is Linux-only (PulseAudio); the Video URL mode is the cross-platform one.
  • Groq free-tier rate limits make the first pass on a long video slow; the cache makes re-runs fast.
  • edge-tts needs internet (it calls Microsoft's servers).
  • Live Dub lag floor of ~3–5s is inherent to listen-then-translate.
  • Mixed Devanagari + Latin script can occasionally trip TTS intonation.

Troubleshooting

| Symptom | Likely cause / fix |
|---|---|
| No dub audio at all | Groq key not exported in the launch shell; otherwise wait ~20–30s for the first batch to clear the rate limit. |
| First run with the female voice is slow | Expected: the female cache is separate from the male one. The first female run generates everything fresh; re-runs are instant. |
| Dub plays but goes silent after a while | Video ran ahead of generation. Don't seek ahead on first pass; let the buffer-sync handle pacing. |
| Live Dub plays nothing | Make sure you routed your app's audio to DubCapture in pavucontrol first. |
| Live Dub crashes immediately on macOS/Windows | Live Dub is Linux-only (PulseAudio). Use Video URL mode on other operating systems. |
| mpv window doesn't open | mpv is not installed: sudo apt install mpv. |
| Dub reads out URLs or bash commands | Normally removed by clean_text(); if it persists, delete the cache and re-run. |
| Want fresh translations for a video | rm -rf ~/.config/yt-dubber/dubout/audio/<videoId>_<lang>_<gender>/ |
| Want to force a specific source language | Add --source-lang ar (or zh, en, etc.) to the Python CLI; the UI does not expose this yet. |

Cache location: ~/.config/yt-dubber/dubout/audio/.
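Clearing the cache for one video can also be scripted. The sketch below mirrors the rm -rf command from the troubleshooting table using the documented cache path; `clear_cache` is a hypothetical helper, not part of the package.

```python
import shutil
from pathlib import Path

# The documented cache location (as given for Linux).
CACHE_ROOT = Path.home() / ".config" / "yt-dubber" / "dubout" / "audio"


def clear_cache(video_id, lang, gender, root=CACHE_ROOT):
    """Delete cached clips for one video/language/voice. Returns True if a folder was removed."""
    target = Path(root) / f"{video_id}_{lang}_{gender}"
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

Because male and female voices are cached in separate folders, clearing one leaves the other intact.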


Contributing & Discussions

Found a bug? Have an idea? Want to add a language or improve the dubbing quality?

Don't just fork silently; come talk.

If you've built something on top of this project, share it in Discussions. I want to see what people are creating.


Credits & Attribution

Original project: YT Dubber
Author: ahsutosh
License: GPL-3.0 (see LICENSE)

If you fork this project, you must:

  1. Keep this credits section or link back to this repo
  2. State clearly what you changed
  3. License your fork under GPL-3.0

Built with: Groq API · edge-tts · mpv · yt-dlp · Electron
