A command-line tool for text-to-speech generation using Chatterbox TTS
Voice Forge
A powerful command-line interface for Chatterbox TTS - Resemble AI's state-of-the-art open-source Text-to-Speech model.
Features
- 🎯 Simple CLI interface - Generate speech from text with a single command
- 🔊 Automatic audio playback - Hear your generated speech immediately
- 💾 Audio file export - Save generated speech to WAV files
- 🎭 Voice cloning - Use reference audio files for voice conversion
- ⚙️ Customizable parameters - Control exaggeration and CFG weight
- 📄 File input support - Read text from files
- 🖥️ Cross-platform - Works on macOS, Linux, and Windows
- 🎮 Multiple audio backends - Supports pygame, playsound, and system audio players
Installation
Prerequisites
- Python 3.8 or higher
- CUDA (optional, for GPU acceleration)
Install Dependencies
# Install core dependencies
pip install chatterbox-tts torch torchaudio
# Install optional audio playback libraries
pip install pygame playsound
# Or install all dependencies at once
pip install -r requirements.txt
Install Voice Forge
# Install from PyPI
pip install voice-forge
# Or install from source
pip install -e .
# Or run directly from the package
python -m voice_forge --help
Usage
Basic Usage
# Generate and play speech from text
voice-forge "Hello, world! This is Voice Forge with Chatterbox TTS."
# Save audio to file
voice-forge "Hello, world!" --save output.wav
# Read text from file
voice-forge --file input.txt --save output.wav
Voice Cloning
# Use a reference voice
voice-forge "Hello, world!" --voice reference.wav
# Combine voice cloning with file output
voice-forge "Hello, world!" --voice reference.wav --save cloned_output.wav
Advanced Parameters
# Adjust exaggeration and CFG weight
voice-forge "Hello, world!" --exaggeration 0.7 --cfg-weight 0.3
# Use CPU instead of GPU
voice-forge "Hello, world!" --device cpu
# Save without playing
voice-forge "Hello, world!" --save output.wav --no-play
Audio Playback Options
# Use specific audio backend
voice-forge "Hello, world!" --audio-method pygame
voice-forge "Hello, world!" --audio-method playsound
voice-forge "Hello, world!" --audio-method system
Command Line Options
| Option | Short | Description | Default |
|---|---|---|---|
| `text` | - | Text to convert to speech | - |
| `--file` | `-f` | Read text from file | - |
| `--voice` | `-v` | Path to reference audio file for voice cloning | - |
| `--exaggeration` | `-e` | Exaggeration/intensity control (0.0-1.0) | 0.5 |
| `--cfg-weight` | `-c` | CFG weight for generation control (0.0-1.0) | 0.5 |
| `--save` | `-s` | Save generated audio to file | - |
| `--no-play` | - | Don't play audio, only save to file | False |
| `--audio-method` | - | Audio playback method (auto/pygame/playsound/system) | auto |
| `--device` | - | Device to run model on (auto/cpu/cuda) | auto |
| `--verbose` | `-V` | Enable verbose output | False |
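To make the option table concrete, here is a minimal `argparse` sketch that mirrors it. This is an illustration written from the table, not Voice Forge's actual source: the function name `build_parser` is hypothetical, and the real tool may validate or name things differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real source)."""
    parser = argparse.ArgumentParser(prog="voice-forge")
    parser.add_argument("text", nargs="?", help="Text to convert to speech")
    parser.add_argument("-f", "--file", help="Read text from file")
    parser.add_argument("-v", "--voice", help="Reference audio file for voice cloning")
    parser.add_argument("-e", "--exaggeration", type=float, default=0.5)
    parser.add_argument("-c", "--cfg-weight", type=float, default=0.5)
    parser.add_argument("-s", "--save", help="Save generated audio to file")
    parser.add_argument("--no-play", action="store_true")
    parser.add_argument(
        "--audio-method",
        choices=["auto", "pygame", "playsound", "system"],
        default="auto",
    )
    parser.add_argument("--device", choices=["auto", "cpu", "cuda"], default="auto")
    parser.add_argument("-V", "--verbose", action="store_true")
    return parser


# Parse the same arguments as the "Save without playing" example.
args = build_parser().parse_args(["Hello, world!", "--save", "output.wav", "--no-play"])
print(args.text, args.save, args.no_play, args.exaggeration, args.cfg_weight)
```

Options that are not passed fall back to the defaults listed in the table, which is why `exaggeration` and `cfg_weight` both come out as `0.5` here.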
Examples
Basic Text-to-Speech
voice-forge "Welcome to Voice Forge with Chatterbox TTS!"
Gaming Voice Lines
voice-forge "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."
Expressive Speech
voice-forge "This is amazing!" --exaggeration 0.8 --cfg-weight 0.2
Voice Conversion
voice-forge "Hello, this is my cloned voice!" --voice my_voice_sample.wav
Batch Processing
# Create a text file with your content
echo "This is a longer text that I want to convert to speech." > input.txt
voice-forge --file input.txt --save batch_output.wav
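For more than one input file, the same command can be driven from a short script. The sketch below is one possible approach, using only the `--file`, `--save`, and `--no-play` flags documented above. The helper names `build_commands` and `run_batch` and the directory names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_commands(input_dir: str, output_dir: str) -> List[List[str]]:
    """Return one voice-forge command per .txt file in input_dir, sorted by name."""
    commands = []
    for text_file in sorted(Path(input_dir).glob("*.txt")):
        wav_path = Path(output_dir) / (text_file.stem + ".wav")
        commands.append(
            ["voice-forge", "--file", str(text_file), "--save", str(wav_path), "--no-play"]
        )
    return commands


def run_batch(input_dir: str, output_dir: str) -> None:
    """Create output_dir if needed and run each command, stopping on the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(input_dir, output_dir):
        subprocess.run(command, check=True)
```

Calling `run_batch("texts", "audio")` would convert every `.txt` file in `texts/` to a WAV file of the same name in `audio/`. Each call starts a new process, so the model is presumably reloaded for every file.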
Tips for Best Results
General Use (TTS and Voice Agents)
- The default settings (`exaggeration=0.5`, `cfg_weight=0.5`) work well for most prompts
- If the reference speaker has a fast speaking style, try lowering `cfg_weight` to around `0.3`
Expressive or Dramatic Speech
- Use lower `cfg_weight` values (e.g., `~0.3`) and increase `exaggeration` to around `0.7` or higher
- Higher `exaggeration` tends to speed up speech; reducing `cfg_weight` helps compensate with slower, more deliberate pacing
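The tips above can be captured as a small table of presets. In this sketch the numeric values come from the tips, while the preset names and the `preset_flags` helper are made up for illustration.

```python
from typing import Dict, List

# Values come from the tips above; the preset names are hypothetical.
PRESETS: Dict[str, Dict[str, float]] = {
    "default": {"exaggeration": 0.5, "cfg_weight": 0.5},
    "fast_reference_speaker": {"exaggeration": 0.5, "cfg_weight": 0.3},
    "expressive": {"exaggeration": 0.7, "cfg_weight": 0.3},
}


def preset_flags(name: str) -> List[str]:
    """Translate a preset name into the matching voice-forge flags."""
    settings = PRESETS[name]
    return [
        "--exaggeration", str(settings["exaggeration"]),
        "--cfg-weight", str(settings["cfg_weight"]),
    ]


print(" ".join(["voice-forge", '"This is amazing!"'] + preset_flags("expressive")))
# prints: voice-forge "This is amazing!" --exaggeration 0.7 --cfg-weight 0.3
```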
Voice Cloning
- Use high-quality reference audio (clear speech, minimal background noise)
- Reference audio should be at least 3 seconds long, and ideally around 10 seconds
- WAV format is preferred for reference files
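A reference clip's length can be checked before it is passed to `--voice`. This is a minimal sketch using Python's standard `wave` module, which reads only uncompressed PCM WAV files. The 3-second threshold is the lower bound from the tip above, and the function names are hypothetical.

```python
import wave

MIN_SECONDS = 3.0  # lower bound taken from the tip above


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / float(wav_file.getframerate())


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """Check whether the clip meets the minimum reference length."""
    return wav_duration_seconds(path) >= min_seconds
```

For example, `is_long_enough("reference.wav")` returns `False` for a 2-second clip. The check says nothing about audio quality or background noise.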
Troubleshooting
Installation Issues
If you encounter import errors:
# Make sure all dependencies are installed
pip install chatterbox-tts torch torchaudio pygame playsound
# On macOS, you might need:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
Audio Playback Issues
If audio doesn't play:
# Try different audio methods
voice-forge "test" --audio-method system
voice-forge "test" --audio-method pygame
voice-forge "test" --audio-method playsound
# Or just save to file and play manually
voice-forge "test" --save test.wav --no-play
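To see which backends are likely to work on a machine, check which playback libraries and system players are present. The sketch below shows one way an `auto` choice could be made. It is an assumption about the general approach, not Voice Forge's actual logic: the preference order and the list of system player commands (`afplay`, `aplay`, `paplay`) are guesses.

```python
import importlib.util
import shutil
from typing import Callable, Optional

# Assumed system players: afplay (macOS), aplay and paplay (Linux).
SYSTEM_PLAYERS = ("afplay", "aplay", "paplay")


def _module_installed(name: str) -> bool:
    return importlib.util.find_spec(name) is not None


def _command_available(name: str) -> bool:
    return shutil.which(name) is not None


def pick_audio_method(
    has_module: Callable[[str], bool] = _module_installed,
    has_command: Callable[[str], bool] = _command_available,
) -> Optional[str]:
    """Return the first usable --audio-method value, or None if nothing is found."""
    for module_name in ("pygame", "playsound"):
        if has_module(module_name):
            return module_name
    if any(has_command(player) for player in SYSTEM_PLAYERS):
        return "system"
    return None


print(pick_audio_method())
```

If this prints `None`, saving with `--save` and `--no-play` and opening the file manually is the remaining option.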
CUDA Issues
If you have CUDA issues:
# Force CPU mode
voice-forge "test" --device cpu
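Before forcing CPU mode, it can help to check whether PyTorch sees a CUDA device at all. The sketch below shows how an `auto` device setting is commonly resolved with `torch.cuda.is_available()`. The function `resolve_device` is hypothetical, and the real tool may resolve `auto` differently.

```python
def resolve_device(requested: str = "auto") -> str:
    """Map a --device value to a concrete device name."""
    if requested != "auto":
        return requested  # explicit "cpu" or "cuda" is passed through unchanged
    try:
        import torch
    except ImportError:
        return "cpu"  # without PyTorch installed there is nothing to probe
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("auto"))
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build is probably CPU-only or the driver is not visible to it.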
License
This project is based on Chatterbox TTS by Resemble AI, which is licensed under the MIT License.
Acknowledgments
- Resemble AI for creating Chatterbox TTS
- Chatterbox TTS Repository
- The original Chatterbox research and development team
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Disclaimer
This tool is for educational and research purposes. Please use responsibly and follow all applicable laws and ethics guidelines when generating synthetic speech.
File details
Details for the file voice_forge-1.0.0.tar.gz.
File metadata
- Download URL: voice_forge-1.0.0.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c33af43270fd8c80b2add18535af36b8867228f857ef26a417bd00f4c1ccd82 |
| MD5 | 77ed5b910384914050184f8bf7bcb5a1 |
| BLAKE2b-256 | 894f4c73dee29d80fc078e2e73daa16ed48fa4fa1faa7bcb0613e9936eaec62d |
File details
Details for the file voice_forge-1.0.0-py3-none-any.whl.
File metadata
- Download URL: voice_forge-1.0.0-py3-none-any.whl
- Upload date:
- Size: 9.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 24513d19fbc144ed9323bbc34a801a724e5fe333e458189325c497e4e2f587c4 |
| MD5 | 5cc7b5483b2d741f1f2082ef4b8e7690 |
| BLAKE2b-256 | 62e79ab003b39cafa0b1891f4c532e3b3a36bd6e96764c89710c47b12a61cef6 |