AI dubbing system which uses machine learning models to automatically translate and synchronize audio dialogue into different languages.

These details have not been verified by PyPI

Project links

Homepage

Project description

Introduction

Open dubbing is an AI dubbing system that uses machine learning models to automatically translate and synchronize audio dialogue into different languages. It is designed as a command line tool.

At the moment, it is purely experimental and an excuse to help me better understand how STT, TTS and translation systems work together.

If you want to see a live system running, you can try it at https://www.softcatala.org/doblatge/ (it accepts only English and Spanish and dubs only to Catalan). It combines this project, https://github.com/Softcatala/subdub-editor (an editor) and https://github.com/Softcatala/dubbing-service (web service).

Features

Built on top of open source models and able to run locally
Automatically dubs a video from a source language to a target language
Supports multiple Text To Speech (TTS) engines: Coqui, MMS, Edge, OpenAI TTS
Allows using any unsupported TTS engine by configuring an API or CLI
Voice gender detection, so that a suitable synthetic voice can be assigned
Support for multiple translation engines (Meta's NLLB, Apertium API, etc.)
Automatic detection of the source language of the video (using Whisper)

Roadmap

Areas that we would like to explore:

Better control of voice used for dubbing
Optimize it for long videos and less resource usage
Support for multiple video input formats

Demo

This video purposely shows the strengths and limitations of the system.

Original English video

https://github.com/user-attachments/assets/54c0d37f-0cc8-4ea2-8f8d-fd2d2f4eeccc

Automatically dubbed video in Catalan

https://github.com/user-attachments/assets/99936655-5851-4d0c-827b-f36f79f56190

Limitations

This is an experimental project
Automatic video dubbing includes speech recognition, translation, vocal recognition, etc. Errors can be introduced at each of these steps

Supported languages

The supported languages depend on the combination of speech to text, translation and text to speech systems used. With Coqui TTS, these are the supported languages (I have tested only a few of them):

Supported source languages: Afrikaans, Amharic, Armenian, Assamese, Bashkir, Basque, Belarusian, Bengali, Bosnian, Bulgarian, Burmese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Gujarati, Haitian, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Lingala, Lithuanian, Luxembourgish, Macedonian, Malayalam, Maltese, Maori, Marathi, Modern Greek (1453-), Norwegian Nynorsk, Occitan (post 1500), Panjabi, Polish, Portuguese, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Vietnamese, Welsh, Yoruba, Yue Chinese

Supported target languages: Achinese, Akan, Amharic, Assamese, Awadhi, Ayacucho Quechua, Balinese, Bambara, Bashkir, Basque, Bemba (Zambia), Bengali, Bulgarian, Burmese, Catalan, Cebuano, Central Aymara, Chhattisgarhi, Crimean Tatar, Dutch, Dyula, Dzongkha, English, Ewe, Faroese, Fijian, Finnish, Fon, French, Ganda, German, Guarani, Gujarati, Haitian, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Iloko, Indonesian, Javanese, Kabiyè, Kabyle, Kachin, Kannada, Kazakh, Khmer, Kikuyu, Kinyarwanda, Kirghiz, Korean, Lao, Magahi, Maithili, Malayalam, Marathi, Minangkabau, Modern Greek (1453-), Mossi, North Azerbaijani, Northern Kurdish, Nuer, Nyanja, Odia, Pangasinan, Panjabi, Papiamento, Polish, Portuguese, Romanian, Rundi, Russian, Samoan, Sango, Shan, Shona, Somali, South Azerbaijani, Southwestern Dinka, Spanish, Sundanese, Swahili (individual language), Swedish, Tagalog, Tajik, Tamasheq, Tamil, Tatar, Telugu, Thai, Tibetan, Tigrinya, Tok Pisin, Tsonga, Turkish, Turkmen, Uighur, Ukrainian, Urdu, Vietnamese, Waray (Philippines), Welsh, Yoruba

Installation

To install open_dubbing on any platform:

pip install open_dubbing

If you also want to install Coqui-tts, do:

pip install open_dubbing[coqui]

If you also want to install OpenAI support, do:

pip install open_dubbing[openai]

Linux additional dependencies

On Linux you also need to install ffmpeg:

sudo apt install ffmpeg

If you are going to use Coqui-tts you also need to install espeak-ng:

sudo apt install espeak-ng

macOS additional dependencies

On macOS you also need to install ffmpeg:

brew install ffmpeg

If you are going to use Coqui-tts you also need to install espeak-ng:

brew install espeak-ng

Windows additional dependencies

Windows currently works, but it has not been tested extensively.

You also need to install ffmpeg for Windows. Make sure that it is in the system path.
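Whatever the platform, a quick way to confirm that the external tools are reachable is to look them up on the system path. The following is a minimal sketch using only the Python standard library; the helper name missing_tools is hypothetical and is not part of open-dubbing.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the system path."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # ffmpeg is required on every platform; espeak-ng is listed above
    # only for Coqui-tts on Linux and macOS.
    missing = missing_tools(["ffmpeg", "espeak-ng"])
    if missing:
        print("Not found on the system path:", ", ".join(missing))
    else:
        print("All external tools were found.")
```

If ffmpeg is reported as missing on Windows, the most likely cause is that its install directory has not been added to the system path.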

Accept pyannote license

Accept the pyannote/segmentation-3.0 user conditions
Accept the pyannote/speaker-diarization-3.1 user conditions
Create an access token at hf.co/settings/tokens.

Quick start

 open-dubbing --input_file video.mp4 --target_language=cat --hugging_face_token=TOKEN

Where:

TOKEN is the HuggingFace token that allows access to the models
cat in this case is the target language, given as an ISO 639-3 language code

By default, the source language is predicted using the first 30 seconds of the video. If this does not work (e.g. there is only music at the beginning), use the parameter source_language to specify the source language using ISO 639-3 language codes (e.g. 'eng' for English).

To get a list of available options:

open-dubbing --help
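When open-dubbing is called from a script, it can help to assemble the arguments in one place. The sketch below builds the quick start command as a list. The helper build_command is hypothetical, and the spelling --source_language is an assumption based on the style of the other options; confirm it with open-dubbing --help.

```python
def build_command(input_file, target_language, token, source_language=None):
    """Build the open-dubbing argument list shown in the quick start."""
    cmd = [
        "open-dubbing",
        "--input_file", input_file,
        f"--target_language={target_language}",
        f"--hugging_face_token={token}",
    ]
    if source_language is not None:
        # Assumption: the source_language parameter is passed as
        # --source_language, in the same style as the other options.
        cmd.append(f"--source_language={source_language}")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True). For example, build_command("video.mp4", "cat", "TOKEN", source_language="eng") skips automatic language detection for an English video.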

Post-editing automatically generated dubbed files

There are cases where you want to manually adjust the automatically generated text for dubbing, the voice used, or the timings.

After running open-dubbing, the intermediate files and the final dubbed file are in the selected output directory.

You can edit the file utterance_metadata_XXX.json (where XXX is the target language code), make manual adjustments, and generate the video again.

See an example JSON:

    "utterances": [
        {
            "start": 7.607843750000001,
            "end": 8.687843750000003,
            "speaker_id": "SPEAKER_00",
            "path": "short/chunk_7.607843750000001_8.687843750000003.mp3",
            "text": "And I love this city.",
            "for_dubbing": true,
            "gender": "Male",
            "translated_text": "I m'encanta aquesta ciutat.",
            "assigned_voice": "ca-ES-EnricNeural",
            "speed": 1.3,
            "dubbed_path": "short/dubbed_chunk_7.607843750000001_8.687843750000003.mp3",
            "hash": "b11d7f0e2aa5475e652937469d89ef0a178fecea726f076095942d552944089f"
        },

Imagine that you have changed the translated_text. To generate the post-edited video:

 open-dubbing --input_file video.mp4 --target_language=cat --hugging_face_token=TOKEN --update

The update parameter changes the behavior of open-dubbing: instead of producing a full dubbing, it rebuilds the existing dubbing, incorporating any changes made to the JSON file.

Fields that are useful to modify are: translated_text, gender (of the voice) or speed.
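For repeated corrections, the edit can also be scripted. The sketch below changes translated_text for the utterance that starts at a given time. It assumes that "utterances" is a top-level key of the JSON file, as the excerpt above suggests; the helper name set_translation is hypothetical and is not part of open-dubbing.

```python
import json
from pathlib import Path


def set_translation(metadata_path, start, new_text):
    """Replace translated_text for the utterance that begins at `start`.

    Returns the number of utterances that were changed.
    """
    path = Path(metadata_path)
    data = json.loads(path.read_text(encoding="utf-8"))
    changed = 0
    # Assumption: "utterances" is a top-level key of the metadata file.
    for utterance in data["utterances"]:
        # Compare with a small tolerance because start times are floats.
        if abs(utterance["start"] - start) < 1e-6:
            utterance["translated_text"] = new_text
            changed += 1
    # ensure_ascii=False keeps accented characters readable in the file.
    path.write_text(
        json.dumps(data, ensure_ascii=False, indent=4), encoding="utf-8"
    )
    return changed
```

After saving the file, run open-dubbing again with --update, as shown above, to rebuild the video. The same pattern applies to the gender and speed fields.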

Documentation

For more detailed documentation on how the tool works and how to use it, see our documentation page.

Appreciation

Core libraries used:

demucs to separate vocals from the audio
pyannote-audio to diarize speakers
faster-whisper for speech to text
NLLB-200 for machine translation
TTS
- coqui-tts
- Meta mms
- Microsoft Edge TTS
- OpenAI TTS

And very special thanks to ariel, whose code base we partly reused.

License

See license

Contact

Email address: Jordi Mas: jmas@softcatala.org

Project details

Release history

0.2.3 (this version): Jun 26, 2025
0.2.2: Jun 20, 2025
0.2.1: Feb 10, 2025
0.2.0: Jan 21, 2025
0.1.9: Jan 13, 2025
0.1.8: Jan 6, 2025
0.1.7: Dec 31, 2024
0.1.6: Dec 22, 2024
0.1.5: Dec 17, 2024
0.1.4: Dec 6, 2024
0.1.3: Nov 23, 2024
0.1.2: Nov 14, 2024
0.1.1: Nov 2, 2024
0.1.0: Oct 28, 2024
0.0.9: Oct 25, 2024
0.0.8: Oct 18, 2024
0.0.7: Oct 11, 2024
0.0.6: Oct 7, 2024
0.0.5: Oct 4, 2024
0.0.4: Sep 29, 2024
0.0.3: Sep 27, 2024
0.0.2: Sep 22, 2024
0.0.1: Sep 16, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

open_dubbing-0.2.3.tar.gz (60.3 kB)

Uploaded Jun 26, 2025 Source

Built Distribution

open_dubbing-0.2.3-py3-none-any.whl (75.5 kB)

Uploaded Jun 26, 2025 Python 3

File details

Details for the file open_dubbing-0.2.3.tar.gz.

File metadata

Download URL: open_dubbing-0.2.3.tar.gz
Upload date: Jun 26, 2025
Size: 60.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for open_dubbing-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`a2a5f24713224c74b9eb9964d03351274222df48c9835da155a2153acd84a6e6`
MD5	`fbcdd7883d09099240aeb85c73d3ac90`
BLAKE2b-256	`9236325e107116420dff7dd0c5acdaa5fcdd82b53082801d06c8833c08a3bc0c`

File details

Details for the file open_dubbing-0.2.3-py3-none-any.whl.

File metadata

Download URL: open_dubbing-0.2.3-py3-none-any.whl
Upload date: Jun 26, 2025
Size: 75.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for open_dubbing-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`70b2ad178ecff74217ceeaacd6bb535ff03423f7e7175f544eaf52a375a95b4f`
MD5	`67c605955d91eca12b83303e15caf729`
BLAKE2b-256	`db11bcfbc5e881d8a5acad025b779aae73cf8a234aeb7f03e7f8cc5767098312`
