A command-line tool for text-to-speech generation using Chatterbox TTS

Voice Forge

A powerful command-line interface for Chatterbox TTS - Resemble AI's state-of-the-art open-source Text-to-Speech model.

Features

🎯 Simple CLI interface - Generate speech from text with a single command
🔊 Automatic audio playback - Hear your generated speech immediately
💾 Audio file export - Save generated speech to WAV files
🎭 Voice cloning - Use reference audio files for voice conversion
⚙️ Customizable parameters - Control exaggeration and CFG weight
📄 File input support - Read text from files
🖥️ Cross-platform - Works on macOS, Linux, and Windows
🎮 Multiple audio backends - Supports pygame, playsound, and system audio players

Installation

Prerequisites

Python 3.8 or higher
CUDA (optional, for GPU acceleration)

Install Dependencies

# Install core dependencies
pip install chatterbox-tts torch torchaudio

# Install optional audio playback libraries
pip install pygame playsound

# Or install all dependencies at once
pip install -r requirements.txt

Install Voice Forge

# Install from PyPI
pip install voice-forge

# Or install from source
pip install -e .

# Or run directly from the package
python -m voice_forge --help

Usage

Basic Usage

# Generate and play speech from text
voice-forge "Hello, world! This is Voice Forge with Chatterbox TTS."

# Save audio to file
voice-forge "Hello, world!" --save output.wav

# Read text from file
voice-forge --file input.txt --save output.wav

Voice Cloning

# Use a reference voice
voice-forge "Hello, world!" --voice reference.wav

# Combine voice cloning with file output
voice-forge "Hello, world!" --voice reference.wav --save cloned_output.wav

Advanced Parameters

# Adjust exaggeration and CFG weight
voice-forge "Hello, world!" --exaggeration 0.7 --cfg-weight 0.3

# Use CPU instead of GPU
voice-forge "Hello, world!" --device cpu

# Save without playing
voice-forge "Hello, world!" --save output.wav --no-play

Audio Playback Options

# Use specific audio backend
voice-forge "Hello, world!" --audio-method pygame
voice-forge "Hello, world!" --audio-method playsound
voice-forge "Hello, world!" --audio-method system

Command Line Options

Option	Short	Description	Default
`text`	-	Text to convert to speech	-
`--file`	`-f`	Read text from file	-
`--voice`	`-v`	Path to reference audio file for voice cloning	-
`--exaggeration`	`-e`	Exaggeration/intensity control (0.0-1.0)	0.5
`--cfg-weight`	`-c`	CFG weight for generation control (0.0-1.0)	0.5
`--save`	`-s`	Save generated audio to file	-
`--no-play`	-	Don't play audio, only save to file	False
`--audio-method`	-	Audio playback method (auto/pygame/playsound/system)	auto
`--device`	-	Device to run model on (auto/cpu/cuda)	auto
`--verbose`	`-V`	Enable verbose output	False
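
To make the table concrete, here is a minimal sketch of how these options could be declared with Python's standard `argparse` module. It is a reconstruction from the table only, not the actual Voice Forge source; `build_parser` is a hypothetical name, and the real tool may validate or name things differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented option table."""
    parser = argparse.ArgumentParser(prog="voice-forge")
    # Positional text is optional because --file can supply the text instead.
    parser.add_argument("text", nargs="?", help="Text to convert to speech")
    parser.add_argument("--file", "-f", help="Read text from file")
    parser.add_argument("--voice", "-v", help="Reference audio file for voice cloning")
    parser.add_argument("--exaggeration", "-e", type=float, default=0.5)
    parser.add_argument("--cfg-weight", "-c", type=float, default=0.5)
    parser.add_argument("--save", "-s", help="Save generated audio to file")
    parser.add_argument("--no-play", action="store_true")
    parser.add_argument(
        "--audio-method",
        choices=["auto", "pygame", "playsound", "system"],
        default="auto",
    )
    parser.add_argument("--device", choices=["auto", "cpu", "cuda"], default="auto")
    parser.add_argument("--verbose", "-V", action="store_true")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["Hello, world!", "-e", "0.7", "--no-play"]))
```

Note that `-v` (voice) and `-V` (verbose) are distinct because short options are case-sensitive.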

Examples

Basic Text-to-Speech

voice-forge "Welcome to Voice Forge with Chatterbox TTS!"

Gaming Voice Lines

voice-forge "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."

Expressive Speech

voice-forge "This is amazing!" --exaggeration 0.8 --cfg-weight 0.2

Voice Conversion

voice-forge "Hello, this is my cloned voice!" --voice my_voice_sample.wav

Batch Processing

# Create a text file with your content
echo "This is a longer text that I want to convert to speech." > input.txt
voice-forge --file input.txt --save batch_output.wav
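
The command above handles a single file. To convert a whole directory of text files, a small wrapper script is one option. The sketch below uses only the documented `--file`, `--save`, and `--no-play` flags; the function names and directory layout are assumptions made for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def batch_commands(text_dir: str, out_dir: str) -> List[List[str]]:
    """Build one voice-forge command per .txt file found in text_dir."""
    commands = []
    for txt in sorted(Path(text_dir).glob("*.txt")):
        wav = Path(out_dir) / (txt.stem + ".wav")
        commands.append(
            ["voice-forge", "--file", str(txt), "--save", str(wav), "--no-play"]
        )
    return commands


def run_batch(text_dir: str, out_dir: str) -> None:
    """Run each command in turn; stops at the first failing file."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in batch_commands(text_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Calling `run_batch("texts", "audio")` would write one WAV per text file into `audio/`, assuming `voice-forge` is on your `PATH`.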

Tips for Best Results

General Use (TTS and Voice Agents)

The default settings (exaggeration=0.5, cfg_weight=0.5) work well for most prompts
If the reference speaker has a fast speaking style, try lowering cfg_weight to around 0.3

Expressive or Dramatic Speech

Use lower cfg_weight values (e.g., ~0.3) and increase exaggeration to around 0.7 or higher
Higher exaggeration tends to speed up speech; reducing cfg_weight helps compensate with slower, more deliberate pacing
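
If you switch between these settings often, it may help to keep them as named presets. The sketch below encodes the values suggested in the tips above; the preset names and the `preset_args` helper are invented for this example and are not part of Voice Forge.

```python
from typing import Dict, List

# Values come from the tips above; the preset names are made up for this sketch.
PRESETS: Dict[str, Dict[str, float]] = {
    "default": {"exaggeration": 0.5, "cfg_weight": 0.5},
    "fast_speaker": {"exaggeration": 0.5, "cfg_weight": 0.3},
    "expressive": {"exaggeration": 0.7, "cfg_weight": 0.3},
}


def preset_args(name: str) -> List[str]:
    """Turn a preset into voice-forge flags, checking the documented 0.0-1.0 range."""
    values = PRESETS[name]
    for key, value in values.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key}={value} is outside the range 0.0-1.0")
    return [
        "--exaggeration", str(values["exaggeration"]),
        "--cfg-weight", str(values["cfg_weight"]),
    ]


if __name__ == "__main__":
    print(["voice-forge", "This is amazing!"] + preset_args("expressive"))
```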

Voice Cloning

Use high-quality reference audio (clear speech, minimal background noise)
Use at least 3 seconds of reference audio; around 10 seconds is better
WAV format is preferred for reference files
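
A quick way to check a reference clip against the length guideline is to read its duration with Python's standard `wave` module. This is a sketch under the assumption that the file is an uncompressed PCM WAV (the `wave` module cannot read other encodings); the helper names and the 3-second threshold default are illustrative.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path: str, minimum_seconds: float = 3.0) -> bool:
    """Check a reference clip against a minimum length."""
    return wav_duration_seconds(path) >= minimum_seconds
```

For example, `is_long_enough("reference.wav")` returns `False` for a clip shorter than 3 seconds, which is a hint to record a longer sample before passing it to `--voice`.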

Troubleshooting

Installation Issues

If you encounter import errors:

# Make sure all dependencies are installed
pip install chatterbox-tts torch torchaudio pygame playsound

# On macOS, or on other machines without CUDA, you might need the CPU-only PyTorch wheels:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu

Audio Playback Issues

If audio doesn't play:

# Try different audio methods
voice-forge "test" --audio-method system
voice-forge "test" --audio-method pygame
voice-forge "test" --audio-method playsound

# Or just save to file and play manually
voice-forge "test" --save test.wav --no-play
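
Trying each method by hand is the same idea as a fallback chain: attempt one backend, and move to the next if it fails. The sketch below shows that pattern in general terms. It is not taken from the Voice Forge source, and how `--audio-method auto` actually picks a backend is not documented here; the function and backend names are placeholders.

```python
from typing import Callable, Iterable, Optional, Tuple

Backend = Tuple[str, Callable[[str], None]]


def play_with_fallback(path: str, backends: Iterable[Backend]) -> Optional[str]:
    """Try each backend in order; return the name of the first one that works."""
    for name, play in backends:
        try:
            play(path)
            return name
        except Exception:
            # This backend is missing or failed; try the next one.
            continue
    return None


if __name__ == "__main__":
    def broken(path: str) -> None:
        raise RuntimeError("no audio device")

    def working(path: str) -> None:
        print(f"playing {path}")

    print(play_with_fallback("test.wav", [("first", broken), ("second", working)]))
```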

CUDA Issues

If you have CUDA issues:

# Force CPU mode
voice-forge "test" --device cpu
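
For context, an `auto` device setting is commonly resolved by asking PyTorch whether CUDA is usable. The sketch below shows that common pattern; it is an assumption about how such a flag could work, not the confirmed Voice Forge implementation, and `resolve_device` is a hypothetical name.

```python
def resolve_device(requested: str = "auto") -> str:
    """Map auto/cpu/cuda to a concrete device string."""
    if requested != "auto":
        # An explicit choice such as "cpu" is passed through unchanged.
        return requested
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU path, so fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(resolve_device("auto"))
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build is probably CPU-only or the driver is not visible to it.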

License

This project is based on Chatterbox TTS by Resemble AI, which is licensed under the MIT License.

Acknowledgments

Resemble AI for creating Chatterbox TTS
Chatterbox TTS Repository
The original Chatterbox research and development team

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Disclaimer

This tool is for educational and research purposes. Please use it responsibly and follow all applicable laws and ethical guidelines when generating synthetic speech.

Project details

This version

1.0.0

Jun 8, 2025

Download files

Source Distribution

voice_forge-1.0.0.tar.gz (9.0 kB view details)

Uploaded Jun 8, 2025 Source

Built Distribution

voice_forge-1.0.0-py3-none-any.whl (9.7 kB view details)

Uploaded Jun 8, 2025 Python 3

File details

Details for the file voice_forge-1.0.0.tar.gz.

File metadata

Download URL: voice_forge-1.0.0.tar.gz
Upload date: Jun 8, 2025
Size: 9.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for voice_forge-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`9c33af43270fd8c80b2add18535af36b8867228f857ef26a417bd00f4c1ccd82`
MD5	`77ed5b910384914050184f8bf7bcb5a1`
BLAKE2b-256	`894f4c73dee29d80fc078e2e73daa16ed48fa4fa1faa7bcb0613e9936eaec62d`
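
To check a downloaded archive against the SHA256 digest listed above, you can hash the file locally with Python's standard `hashlib` module. This is a minimal sketch; the helper names are illustrative, and the file path is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value, ignoring letter case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `matches("voice_forge-1.0.0.tar.gz", "9c33af43270fd8c80b2add18535af36b8867228f857ef26a417bd00f4c1ccd82")` should return `True` if the download is intact.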

File details

Details for the file voice_forge-1.0.0-py3-none-any.whl.

File metadata

Download URL: voice_forge-1.0.0-py3-none-any.whl
Upload date: Jun 8, 2025
Size: 9.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for voice_forge-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`24513d19fbc144ed9323bbc34a801a724e5fe333e458189325c497e4e2f587c4`
MD5	`5cc7b5483b2d741f1f2082ef4b8e7690`
BLAKE2b-256	`62e79ab003b39cafa0b1891f4c532e3b3a36bd6e96764c89710c47b12a61cef6`
