A command-line tool for text-to-speech generation using Chatterbox TTS

Voice Forge

A powerful command-line interface for Chatterbox TTS - Resemble AI's state-of-the-art open-source Text-to-Speech model.

Features

  • 🎯 Simple CLI interface - Generate speech from text with a single command
  • 🔊 Automatic audio playback - Hear your generated speech immediately
  • 💾 Audio file export - Save generated speech to WAV files
  • 🎭 Voice cloning - Use reference audio files for voice conversion
  • ⚙️ Customizable parameters - Control exaggeration and CFG weight
  • 📄 File input support - Read text from files
  • 🖥️ Cross-platform - Works on macOS, Linux, and Windows
  • 🎮 Multiple audio backends - Supports pygame, playsound, and system audio players

Installation

Prerequisites

  • Python 3.8 or higher
  • CUDA (optional, for GPU acceleration)

Install Dependencies

# Install core dependencies
pip install chatterbox-tts torch torchaudio

# Install optional audio playback libraries
pip install pygame playsound

# Or install all dependencies at once
pip install -r requirements.txt

Install Voice Forge

# Install from PyPI
pip install voice-forge

# Or install from source (run from the repository root)
pip install -e .

# Or run directly from the package
python -m voice_forge --help

Usage

Basic Usage

# Generate and play speech from text
voice-forge "Hello, world! This is Voice Forge with Chatterbox TTS."

# Save audio to file
voice-forge "Hello, world!" --save output.wav

# Read text from file
voice-forge --file input.txt --save output.wav

Voice Cloning

# Use a reference voice
voice-forge "Hello, world!" --voice reference.wav

# Combine voice cloning with file output
voice-forge "Hello, world!" --voice reference.wav --save cloned_output.wav

Advanced Parameters

# Adjust exaggeration and CFG weight
voice-forge "Hello, world!" --exaggeration 0.7 --cfg-weight 0.3

# Use CPU instead of GPU
voice-forge "Hello, world!" --device cpu

# Save without playing
voice-forge "Hello, world!" --save output.wav --no-play

Audio Playback Options

# Use specific audio backend
voice-forge "Hello, world!" --audio-method pygame
voice-forge "Hello, world!" --audio-method playsound
voice-forge "Hello, world!" --audio-method system

Command Line Options

Option          Short  Description                                           Default
text            -      Text to convert to speech                             -
--file          -f     Read text from file                                   -
--voice         -v     Path to reference audio file for voice cloning        -
--exaggeration  -e     Exaggeration/intensity control (0.0-1.0)              0.5
--cfg-weight    -c     CFG weight for generation control (0.0-1.0)           0.5
--save          -s     Save generated audio to file                          -
--no-play       -      Don't play audio, only save to file                   False
--audio-method  -      Audio playback method (auto/pygame/playsound/system)  auto
--device        -      Device to run model on (auto/cpu/cuda)                auto
--verbose       -V     Enable verbose output                                 False
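
When voice-forge is driven from a script rather than typed by hand, the options in the table above can be assembled into an argument list. The sketch below shows one way to do that in Python. `build_command` is a hypothetical helper, not part of Voice Forge, and the rule that exactly one of the positional text or `--file` is passed is an assumption.

```python
from typing import List, Optional


def build_command(
    text: Optional[str] = None,
    *,
    file: Optional[str] = None,
    voice: Optional[str] = None,
    exaggeration: Optional[float] = None,
    cfg_weight: Optional[float] = None,
    save: Optional[str] = None,
    no_play: bool = False,
    device: Optional[str] = None,
) -> List[str]:
    """Build an argv list for voice-forge using only the documented flags."""
    # Assumption: the CLI expects either inline text or --file, not both.
    if (text is None) == (file is None):
        raise ValueError("pass exactly one of text or file")
    # The options table documents a 0.0-1.0 range for both parameters.
    for name, value in (("exaggeration", exaggeration), ("cfg_weight", cfg_weight)):
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")

    cmd = ["voice-forge"]
    cmd += [text] if text is not None else ["--file", file]
    if voice:
        cmd += ["--voice", voice]
    if exaggeration is not None:
        cmd += ["--exaggeration", str(exaggeration)]
    if cfg_weight is not None:
        cmd += ["--cfg-weight", str(cfg_weight)]
    if save:
        cmd += ["--save", save]
    if no_play:
        cmd.append("--no-play")
    if device:
        cmd += ["--device", device]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems when the text contains quotes or punctuation.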

Examples

Basic Text-to-Speech

voice-forge "Welcome to Voice Forge with Chatterbox TTS!"

Gaming Voice Lines

voice-forge "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."

Expressive Speech

voice-forge "This is amazing!" --exaggeration 0.8 --cfg-weight 0.2

Voice Conversion

voice-forge "Hello, this is my cloned voice!" --voice my_voice_sample.wav

Batch Processing

# Create a text file with your content
echo "This is a longer text that I want to convert to speech." > input.txt
voice-forge --file input.txt --save batch_output.wav
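
To convert several text files in one go, the same `--file`, `--save`, and `--no-play` options can be applied to each file in turn. The following is a minimal sketch, assuming one `.txt` file per utterance in a single directory. `batch_commands` and `run_batch` are hypothetical helper names, not part of Voice Forge.

```python
import subprocess
from pathlib import Path
from typing import List


def batch_commands(input_dir: str, output_dir: str) -> List[List[str]]:
    """Return one voice-forge command per .txt file, sorted by file name."""
    out = Path(output_dir)
    commands = []
    for txt in sorted(Path(input_dir).glob("*.txt")):
        wav = out / (txt.stem + ".wav")
        # --no-play keeps the batch from pausing to play each result.
        commands.append(
            ["voice-forge", "--file", str(txt), "--save", str(wav), "--no-play"]
        )
    return commands


def run_batch(input_dir: str, output_dir: str) -> None:
    """Run the commands one after another, stopping at the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for cmd in batch_commands(input_dir, output_dir):
        subprocess.run(cmd, check=True)
```

Calling `run_batch("scripts", "audio")` would then write `audio/<name>.wav` for every `scripts/<name>.txt`. Each file is handled by a separate voice-forge process.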

Tips for Best Results

General Use (TTS and Voice Agents)

  • The default settings (exaggeration=0.5, cfg_weight=0.5, set with --exaggeration and --cfg-weight) work well for most prompts
  • If the reference speaker has a fast speaking style, try lowering cfg_weight to around 0.3

Expressive or Dramatic Speech

  • Use lower cfg_weight values (e.g., ~0.3) and increase exaggeration to around 0.7 or higher
  • Higher exaggeration tends to speed up speech; reducing cfg_weight helps compensate with slower, more deliberate pacing
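
Because exaggeration and cfg_weight interact, it can help to render the same line at a few settings and compare the results by ear. The sketch below generates one command per combination. The `sweep_commands` helper, the grid values, and the output file naming are all illustrative assumptions.

```python
import itertools
from typing import Iterable, List


def sweep_commands(
    text: str,
    exaggerations: Iterable[float] = (0.5, 0.7, 0.9),
    cfg_weights: Iterable[float] = (0.3, 0.5),
) -> List[List[str]]:
    """Return one voice-forge command per (exaggeration, cfg_weight) pair."""
    commands = []
    for ex, cfg in itertools.product(exaggerations, cfg_weights):
        # Encode the settings in the file name so takes are easy to compare.
        out = f"sweep_ex{ex}_cfg{cfg}.wav"
        commands.append([
            "voice-forge", text,
            "--exaggeration", str(ex),
            "--cfg-weight", str(cfg),
            "--save", out,
            "--no-play",
        ])
    return commands
```

Each command can be run with `subprocess.run(cmd, check=True)`. With the default grid this produces six WAV files, for example `sweep_ex0.7_cfg0.3.wav`.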

Voice Cloning

  • Use high-quality reference audio (clear speech, minimal background noise)
  • Reference audio should be at least 3 seconds long, and ideally around 10 seconds
  • WAV format is preferred for reference files
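
The length of a reference clip can be checked before it is passed to `--voice`. This sketch uses Python's standard `wave` module, which reads uncompressed PCM WAV files. The helper names and the 3-second default threshold (taken from the tip above) are assumptions, not checks that Voice Forge itself performs.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, min_seconds: float = 3.0) -> bool:
    """Return True if the clip is at least min_seconds long."""
    return wav_duration_seconds(path) >= min_seconds
```

For example, `is_long_enough("reference.wav")` returns False for a one-second clip, which is a hint to record a longer sample.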

Troubleshooting

Installation Issues

If you encounter import errors:

# Make sure all dependencies are installed
pip install chatterbox-tts torch torchaudio pygame playsound

# On macOS, you might need:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu

Audio Playback Issues

If audio doesn't play:

# Try different audio methods
voice-forge "test" --audio-method system
voice-forge "test" --audio-method pygame
voice-forge "test" --audio-method playsound

# Or just save to file and play manually
voice-forge "test" --save test.wav --no-play
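
Since pygame and playsound are optional dependencies, a playback failure may simply mean that the library is not installed in the current environment. The sketch below checks whether each module can be found without importing it. `backend_status` is a hypothetical helper, and how voice-forge itself picks a backend in `auto` mode is not shown here.

```python
import importlib.util
from typing import Dict


def backend_status() -> Dict[str, bool]:
    """Report whether each optional playback library is installed."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in ("pygame", "playsound")
    }
```

If `backend_status()` reports False for a library, `pip install pygame playsound` or switching to `--audio-method system` are the options to try.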

CUDA Issues

If you have CUDA issues:

# Force CPU mode
voice-forge "test" --device cpu
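
To see whether PyTorch can use CUDA in the current environment, the check below follows a common PyTorch pattern. It is a sketch only: `resolve_device` is a hypothetical name, and voice-forge's own handling of `--device auto` may differ.

```python
def resolve_device(requested: str = "auto") -> str:
    """Return the requested device, or pick one when it is "auto"."""
    if requested != "auto":
        return requested
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

If `resolve_device()` returns "cpu" on a machine with an NVIDIA GPU, the installed PyTorch build probably lacks CUDA support, and `--device cpu` is the safe choice until that is fixed.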

License

This project is based on Chatterbox TTS by Resemble AI, which is licensed under the MIT License.

Acknowledgments

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Disclaimer

This tool is for educational and research purposes. Please use responsibly and follow all applicable laws and ethics guidelines when generating synthetic speech.
