voicebox
Python text-to-speech library with built-in voice effects and support for multiple TTS engines.
| GitHub | Documentation 📘 | Audio Samples 🔉 |
# Example: Use gTTS with a vocoder effect to speak in a robotic voice
from voicebox import SimpleVoicebox
from voicebox.tts import gTTS
from voicebox.effects import Vocoder, Normalize
voicebox = SimpleVoicebox(
    tts=gTTS(),
    effects=[Vocoder.build(), Normalize()],
)
voicebox.say('Hello, world! How are you today?')
Setup
- pip install voicebox-tts
- Install the PortAudio library for audio playback.
  - On Debian/Ubuntu: sudo apt install libportaudio2
- Install dependencies for whichever TTS engine(s) you want to use (see section below).
Supported Text-to-Speech Engines
Classes for supported TTS engines are located in the voicebox.tts package.
Amazon Polly 🌐
Online TTS engine from AWS.
- Class: voicebox.tts.AmazonPolly
- Setup: pip install "voicebox-tts[amazon-polly]"
ElevenLabs 🌐
Online TTS engine with very realistic voices and support for voice cloning.
- Class: voicebox.tts.ElevenLabsTTS
- Setup:
eSpeak NG 🌐
Offline TTS engine with a good number of options.
- Class: voicebox.tts.ESpeakNG
- Setup:
  - On Debian/Ubuntu: sudo apt install espeak-ng
Google Cloud Text-to-Speech 🌐
Powerful online TTS engine offered by Google Cloud.
- Class: voicebox.tts.GoogleCloudTTS
- Setup: pip install "voicebox-tts[google-cloud-tts]"
gTTS 🌐
Online TTS engine used by Google Translate.
- Class: voicebox.tts.gTTS
- Setup:
  - pip install "voicebox-tts[gtts]"
  - Install ffmpeg or libav for pydub (see the pydub docs)
🤗 Parler TTS 🌐
Offline TTS engine released by Hugging Face that uses a promptable deep learning model to generate speech.
- Class: voicebox.tts.ParlerTTS
- Setup: pip install git+https://github.com/huggingface/parler-tts.git
Pico TTS
Very basic offline TTS engine.
- Class: voicebox.tts.PicoTTS
- Setup:
  - On Debian/Ubuntu: sudo apt install libttspico-utils
pyttsx3 🌐
Offline TTS engine wrapper with support for the built-in TTS engines on Windows (SAPI5) and macOS (NSSpeechSynthesizer), as well as espeak on Linux. By default, it will use the most appropriate engine for your platform.
- Class: voicebox.tts.Pyttsx3TTS
- Setup:
  - pip install "voicebox-tts[pyttsx3]"
  - On Debian/Ubuntu: sudo apt install espeak
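Since each engine has its own setup step, it can help to check what is already installed before picking one. The sketch below is not part of voicebox. The executable names (espeak-ng, pico2wave, ffmpeg) and module names (gtts, pyttsx3) are assumptions about what the setup steps above typically install, and the function name check_tts_dependencies is hypothetical.

```python
import importlib.util
import shutil

# Assumed names: what the setup steps above typically put on the system.
# They are not taken from voicebox's own code.
EXECUTABLES = {
    "eSpeak NG": "espeak-ng",
    "Pico TTS": "pico2wave",
    "ffmpeg (for pydub)": "ffmpeg",
}
MODULES = {
    "gTTS": "gtts",
    "pyttsx3": "pyttsx3",
}


def check_tts_dependencies() -> dict[str, bool]:
    """Map each dependency label to True if it appears to be installed."""
    found = {}
    for label, exe in EXECUTABLES.items():
        # shutil.which returns None when the executable is not on PATH
        found[label] = shutil.which(exe) is not None
    for label, module in MODULES.items():
        # find_spec returns None when a top-level module cannot be imported
        found[label] = importlib.util.find_spec(module) is not None
    return found


if __name__ == "__main__":
    for label, ok in check_tts_dependencies().items():
        print(f"{label}: {'found' if ok else 'missing'}")
```

A missing entry only means that the corresponding setup step above still needs to be run for that engine.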
Effects
Built-in effect classes are located in the voicebox.effects package, and can be imported like:
from voicebox.effects import CoolEffect
Here is a non-exhaustive list of fun effects:
- Glitch creates a glitchy sound by randomly repeating small chunks of audio.
- RingMod can be used to create choppy, Doctor Who Dalek-like effects.
- Vocoder is useful for making monotone, robotic voices.
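To give a feel for what the first two effects in the list do to a signal, here is a rough standalone sketch of the underlying ideas. It is not voicebox's implementation: the function names, parameters, and the 30 Hz carrier are illustrative assumptions, and it works on plain Python lists rather than the library's audio type.

```python
import math
import random


def ring_modulate(samples, sample_rate, carrier_hz=30.0):
    """Multiply each sample by a sine carrier (classic ring modulation)."""
    return [
        s * math.sin(2.0 * math.pi * carrier_hz * i / sample_rate)
        for i, s in enumerate(samples)
    ]


def glitch(samples, chunk_size=4, repeat_prob=0.5, max_repeats=3, seed=None):
    """Split audio into chunks and randomly emit some chunks more than once."""
    rng = random.Random(seed)
    out = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        repeats = 1
        if rng.random() < repeat_prob:
            repeats += rng.randint(1, max_repeats)
        out.extend(chunk * repeats)
    return out


# One second of a 440 Hz tone at a low sample rate, as toy input
rate = 8000
tone = [math.sin(2.0 * math.pi * 440.0 * i / rate) for i in range(rate)]

modulated = ring_modulate(tone, rate)           # same length, choppy amplitude
glitched = glitch(tone, chunk_size=400, seed=1)  # longer, with repeated chunks
```

Ring modulation keeps the length of the audio and changes its timbre, while the glitch sketch lengthens the audio because repeated chunks are inserted in place.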
There is also support for all the awesome audio plugins in Spotify's pedalboard library using the special PedalboardEffect wrapper, e.g.:
from voicebox import SimpleVoicebox
from voicebox.effects import PedalboardEffect
import pedalboard
voicebox = SimpleVoicebox(
    effects=[
        PedalboardEffect(pedalboard.Reverb()),
        ...,
    ]
)
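The effects argument is a list, which suggests that effects run one after another on the audio. Assuming that ordering (the README does not state it explicitly), a chain can be sketched in plain Python as below. The helper names remove_dc, normalize_peak, and apply_chain are hypothetical, and the normalization shown (subtract the mean, then scale to a target peak) is only one plausible reading of what a normalizing effect does.

```python
def remove_dc(samples):
    """Subtract the mean so the signal is centered on zero."""
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def normalize_peak(samples, target_peak=0.9):
    """Scale so the largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence stays silence
    gain = target_peak / peak
    return [s * gain for s in samples]


def apply_chain(samples, effects):
    """Run each effect in list order, feeding its output to the next one."""
    for effect in effects:
        samples = effect(samples)
    return samples


# A small signal riding on a constant 0.5 offset
quiet_with_offset = [0.5, 0.6, 0.5, 0.4]
processed = apply_chain(quiet_with_offset, [remove_dc, normalize_peak])
```

Under this assumption order matters: placing a normalizing step last means it sees the output of every earlier effect, which matches how the examples in this README list Normalize at the end.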
Examples
Minimal
# PicoTTS is used to say "Hello, world!"
from voicebox import SimpleVoicebox
voicebox = SimpleVoicebox()
voicebox.say('Hello, world!')
Pre-built
Some pre-built voiceboxes are available in the voicebox.examples package. They can be imported into your own code, and you can run them to demo:
# Voice of GLaDOS from the Portal video game series
python -m voicebox.examples.glados "optional message"
# Voice of the OOM-9 command battle droid from Star Wars: Episode I
python -m voicebox.examples.battle_droid "optional message"
Advanced
# Use eSpeak NG at 120 WPM and en-us voice as the TTS engine
from voicebox import reliable_tts
from voicebox.tts import ESpeakConfig, ESpeakNG, gTTS
# Wrap multiple TTSs in retries and caches
tts = reliable_tts(
    ttss=[
        # Prefer using online TTS first
        gTTS(),
        # Fall back to offline TTS if online TTS fails
        ESpeakNG(ESpeakConfig(speed=120, voice='en-us')),
    ],
)
# Add some voice effects
from voicebox.effects import Vocoder, Glitch, Normalize
effects = [
    Vocoder.build(),  # Make a robotic, monotone voice
    Glitch(),         # Randomly repeat small sections of audio
    Normalize(),      # Remove DC and make volume consistent
]
# Build audio sink
from voicebox.sinks import Distributor, SoundDevice, WaveFile
sink = Distributor([
    SoundDevice(),            # Send audio to playback device
    WaveFile('speech.wav'),   # Save audio to speech.wav file
])
# Build the voicebox
from voicebox import ParallelVoicebox
from voicebox.voiceboxes.splitter import SimpleSentenceSplitter
# Parallel voicebox doesn't block the main thread
voicebox = ParallelVoicebox(
    tts,
    effects,
    sink,
    # Split text into sentences to reduce time to first speech
    text_splitter=SimpleSentenceSplitter(),
)
# Speak!
voicebox.say('Hello, world!')
# Wait for all audio to finish playing before exiting
voicebox.wait_until_done()
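The advanced example combines three ideas: splitting text into sentences, falling back from one engine to the next, and sending the same audio to several sinks. The following standalone sketch mimics that flow with plain functions so the control flow is easy to see. None of it is voicebox code: every function name here is hypothetical, the "audio" is just a string, and caching is left out.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def first_successful(engines, text, retries=1):
    """Try each engine in order, retrying before falling back to the next."""
    last_error = None
    for engine in engines:
        for _ in range(retries + 1):
            try:
                return engine(text)
            except Exception as error:  # broad on purpose for this sketch
                last_error = error
    raise RuntimeError("all engines failed") from last_error


def distribute(audio, sinks):
    """Send the same audio to every sink."""
    for sink in sinks:
        sink(audio)


# Hypothetical stand-ins for an online engine, an offline engine, and two sinks
def flaky_online_engine(text):
    raise ConnectionError("no network")


def offline_engine(text):
    return f"<audio:{text}>"


played, saved = [], []
for sentence in split_sentences("Hello, world! How are you today?"):
    audio = first_successful([flaky_online_engine, offline_engine], sentence)
    distribute(audio, [played.append, saved.append])
```

Processing one sentence at a time is what allows the first sentence to reach the sinks before later sentences have been synthesized, which is the motivation the example gives for using a text splitter.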
Command Line Demo
python -m voicebox -h # Print command help
python -m voicebox "Hello, world!" # Basic usage