# Chatterbox TTS API

A Flask-based REST API for ChatterboxTTS, providing OpenAI-compatible text-to-speech endpoints with voice cloning capabilities.
## Features

- 🚀 **OpenAI-Compatible API** - Drop-in replacement for OpenAI's TTS API
- 🎭 **Voice Cloning** - Use your own voice samples for personalized speech
- 📝 **Smart Text Processing** - Automatic chunking for long texts
- 🐳 **Docker Ready** - Full containerization support
- ⚙️ **Configurable** - Extensive environment variable configuration
- 🎛️ **Parameter Control** - Real-time adjustment of speech characteristics
## Quick Start

### 1. Local Installation

```bash
# Clone the repository
git clone https://github.com/travisvn/chatterbox-tts-api
cd chatterbox-tts-api

# Set up the environment (Python 3.11)
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Copy and customize environment variables
cp .env.example .env

# Add your voice sample (or use the provided one)
# cp your-voice.mp3 voice-sample.mp3

# Start the API
python api.py
```
### 2. Docker (Recommended)

```bash
# Clone and start with Docker Compose
git clone https://github.com/travisvn/chatterbox-tts-api
cd chatterbox-tts-api
cp .env.example .env  # Customize as needed
docker compose up -d

# If you have an NVIDIA GPU with CUDA, this variant may work better
docker compose -f docker-compose.gpu.yml up -d --build

# Watch the logs as it initializes (the first TTS request takes the longest)
docker logs chatterbox-tts-api -f

# Test the API
curl -X POST http://localhost:5123/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello from ChatterboxTTS!"}' \
  --output test.wav
```
## API Usage

### Basic Text-to-Speech

```bash
curl -X POST http://localhost:5123/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"input": "Your text here"}' \
  --output speech.wav
```
### With Custom Parameters

```bash
curl -X POST http://localhost:5123/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Dramatic speech!",
    "exaggeration": 1.2,
    "cfg_weight": 0.3,
    "temperature": 0.9
  }' \
  --output dramatic.wav
```
### Python Example

```python
import requests

response = requests.post(
    "http://localhost:5123/v1/audio/speech",
    json={
        "input": "Hello world!",
        "exaggeration": 0.8,  # More expressive
    },
)
response.raise_for_status()  # Surface HTTP errors instead of writing them to the file

with open("output.wav", "wb") as f:
    f.write(response.content)
```
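A slightly more defensive client can validate the documented parameter ranges before sending anything over the wire. This is a stdlib-only sketch, not part of the project: the helper names are illustrative, the range limits come from this README's parameter descriptions, and the default port is assumed.

```python
import json
import urllib.request

API_URL = "http://localhost:5123/v1/audio/speech"  # default port from .env

def build_payload(text, exaggeration=None, cfg_weight=None, temperature=None):
    """Build a request body, validating the documented parameter ranges."""
    payload = {"input": text}
    for name, value, lo, hi in (
        ("exaggeration", exaggeration, 0.25, 2.0),
        ("cfg_weight", cfg_weight, 0.0, 1.0),
        ("temperature", temperature, 0.05, 5.0),
    ):
        if value is not None:
            if not lo <= value <= hi:
                raise ValueError(f"{name} must be in [{lo}, {hi}], got {value}")
            payload[name] = value
    return payload

def synthesize(text, out_path="output.wav", **params):
    """POST the payload and write the returned audio to disk."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(text, **params)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())

# With the server running:
#   synthesize("Hello world!", "hello.wav", exaggeration=0.8)
```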
## Configuration

Key environment variables (see `.env.example` for the full list):

| Variable | Default | Description |
|---|---|---|
| `PORT` | `5123` | API server port |
| `EXAGGERATION` | `0.5` | Emotion intensity (0.25-2.0) |
| `CFG_WEIGHT` | `0.5` | Pace control (0.0-1.0) |
| `TEMPERATURE` | `0.8` | Sampling randomness (0.05-5.0) |
| `VOICE_SAMPLE_PATH` | `./voice-sample.mp3` | Voice sample for cloning |
| `DEVICE` | `auto` | Device (`auto`/`cuda`/`mps`/`cpu`) |
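Put together, a minimal `.env` that simply pins the documented defaults might look like this (values taken from the table above):

```bash
PORT=5123
EXAGGERATION=0.5
CFG_WEIGHT=0.5
TEMPERATURE=0.8
VOICE_SAMPLE_PATH=./voice-sample.mp3
DEVICE=auto
```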
## Voice Cloning

Replace the default voice sample:

```bash
# Replace the default voice sample
cp your-voice.mp3 voice-sample.mp3

# Or set a custom path
echo "VOICE_SAMPLE_PATH=/path/to/your/voice.mp3" >> .env
```
For best results:
- Use 10-30 seconds of clear speech
- Avoid background noise
- Prefer WAV or high-quality MP3
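The duration guideline is easy to check for WAV samples with the standard library alone (MP3s would need an external decoder, which is not shown here; the helper names are illustrative):

```python
import wave

def sample_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()

def within_guideline(path, lo=10.0, hi=30.0):
    """True if the sample falls in the recommended 10-30 second window."""
    return lo <= sample_duration_seconds(path) <= hi
```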
## Docker Deployment

### Development

```bash
docker compose up
```

### Production

```bash
# Create the production environment
cp .env.example .env
nano .env  # Set FLASK_DEBUG=false, etc.

# Deploy
docker compose -f docker-compose.yml up -d
```

### With GPU Support

```bash
# Uncomment the GPU section in docker-compose.yml
# Ensure the NVIDIA Container Toolkit is installed
docker compose up -d
```
## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/v1/audio/speech` | POST | Generate speech from text |
| `/health` | GET | Health check and status |
| `/config` | GET | Current configuration |
| `/v1/models` | GET | Available models (OpenAI compatibility) |
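The two GET endpoints are convenient for scripted smoke checks. A stdlib-only sketch (the helper names are illustrative and the default port is assumed):

```python
import json
import urllib.request

def endpoint(base, path):
    """Join the API base URL with an endpoint path, tolerating stray slashes."""
    return base.rstrip("/") + "/" + path.lstrip("/")

def fetch_json(url):
    """GET a JSON endpoint and decode the body."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())

# With the server running:
#   print(fetch_json(endpoint("http://localhost:5123", "/health")))
#   print(fetch_json(endpoint("http://localhost:5123", "/config")))
```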
## Parameters Reference

### Speech Generation Parameters

**Exaggeration (0.25-2.0)**

- `0.3-0.4`: Professional, neutral
- `0.5`: Default, balanced
- `0.7-0.8`: More expressive
- `1.0+`: Very dramatic

**CFG Weight (0.0-1.0)**

- `0.2-0.3`: Faster speech
- `0.5`: Default pace
- `0.7-0.8`: Slower, deliberate

**Temperature (0.05-5.0)**

- `0.4-0.6`: More consistent
- `0.8`: Default balance
- `1.0+`: More creative/random
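One way to get a feel for these settings is to render the same sentence under a few presets and compare the results. The preset names below are illustrative, not part of the API; the JSON fields match the request parameters documented above.

```python
import json
import urllib.request

API_URL = "http://localhost:5123/v1/audio/speech"

# Presets drawn from the ranges above (names are illustrative only)
PRESETS = {
    "neutral":  {"exaggeration": 0.35, "cfg_weight": 0.5, "temperature": 0.5},
    "default":  {"exaggeration": 0.5,  "cfg_weight": 0.5, "temperature": 0.8},
    "dramatic": {"exaggeration": 1.0,  "cfg_weight": 0.3, "temperature": 1.0},
}

def render(text, preset):
    """Synthesize text with one of the presets and return raw WAV bytes."""
    body = json.dumps({"input": text, **PRESETS[preset]}).encode()
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# With the server running:
#   for name in PRESETS:
#       with open(f"{name}.wav", "wb") as f:
#           f.write(render("Same sentence, three moods.", name))
```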
## Testing

Run the test suite:

```bash
python test_api.py
```
## Performance
- CPU: Works but slower, reduce chunk size for better memory usage
- GPU: Recommended for production, significantly faster
- Memory: 4GB minimum, 8GB+ recommended
- Concurrency: Single request processing for stability
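Because the server processes one request at a time, a client that fans work out across threads should still serialize its TTS calls. A small client-side sketch (the wrapper name is illustrative):

```python
import threading

_tts_lock = threading.Lock()  # one in-flight TTS request at a time

def serialized(send_fn):
    """Wrap a request function so concurrent callers queue up."""
    def wrapper(*args, **kwargs):
        with _tts_lock:
            return send_fn(*args, **kwargs)
    return wrapper
```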
## Troubleshooting

### Common Issues

**CUDA/CPU compatibility error**

```
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False
```

This happens because the chatterbox-tts models require PyTorch with CUDA support, even when running on CPU. Solutions:

```bash
# Option 1: Use the default setup (should now include CUDA-enabled PyTorch)
docker compose up -d

# Option 2: Use the explicit CUDA setup
docker compose -f docker-compose.gpu.yml up -d

# Option 3: Use the CPU-only setup (may have compatibility issues)
docker compose -f docker-compose.cpu.yml up -d

# Option 4: Clear the model cache and retry with the CUDA-enabled setup
docker volume rm chatterbox-tts-api_chatterbox-models
docker compose up -d --build
```

For local development, install PyTorch with CUDA support:

```bash
pip uninstall torch torchvision torchaudio
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install chatterbox-tts
```
**Port conflicts**

```bash
# Change the port
echo "PORT=5002" >> .env
```
**GPU not detected**

```bash
# Force CPU mode
echo "DEVICE=cpu" >> .env
```
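For reference, `DEVICE=auto` implies a fallback order of cuda, then mps, then cpu. A sketch of that selection logic (the actual implementation in api.py may differ):

```python
def pick_device(requested="auto", cuda_ok=False, mps_ok=False):
    """Resolve the DEVICE setting: explicit values win, auto falls back
    cuda -> mps -> cpu based on what the host reports available."""
    if requested != "auto":
        return requested
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"
```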
**Out of memory**

```bash
# Reduce chunk size
echo "MAX_CHUNK_LENGTH=200" >> .env
```
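`MAX_CHUNK_LENGTH` bounds how much text is synthesized per pass, which in turn bounds peak memory. The idea behind sentence-aware chunking can be sketched as follows (an illustration of the concept, not the server's actual splitter):

```python
import re

def chunk_text(text, max_len=200):
    """Split text into chunks of at most max_len characters,
    preferring to break at sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + 1 + len(s) > max_len:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```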
**Model download fails**

```bash
# Clear the cache and retry
rm -rf models/
python api.py
```
## Development

### Local Development

```bash
# Install in development mode
pip install -e .

# Enable debug mode
export FLASK_DEBUG=true
python api.py
```

### Testing

```bash
# Run the API tests
python test_api.py

# Test a specific endpoint
curl http://localhost:5123/health
```
## License
This API wrapper is provided under the same license terms as the underlying ChatterboxTTS model. See the ChatterboxTTS repository for details.
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## Related Projects
- ChatterboxTTS - The core TTS model
- Resemble AI - Production TTS services
## Support
- 📖 Documentation: See API_README.md and DOCKER_README.md
- 🐛 Issues: Report bugs and feature requests via GitHub issues
- 💬 Discord: Join the ChatterboxTTS Discord or the Discord for this project
Made with ♥️ for the open source community
## File details

Details for the file `chatterbox_api-1.0.0.tar.gz`.

**File metadata**

- Download URL: chatterbox_api-1.0.0.tar.gz
- Upload date:
- Size: 16.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3

**File hashes**

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ccc6812cbe0922e6e9ea6b34bb9931e55d55283f2236e301d16a26598610b228` |
| MD5 | `b45db00d36069198c1933215b08e0ff0` |
| BLAKE2b-256 | `054b0dd5cad6e2723fa5157680041a3d159af5cc5cb02d9f6910c72ba6a33b12` |
## File details

Details for the file `chatterbox_api-1.0.0-py3-none-any.whl`.

**File metadata**

- Download URL: chatterbox_api-1.0.0-py3-none-any.whl
- Upload date:
- Size: 16.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3

**File hashes**

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3302cd01423a0eafeab08681a62f69ca9d4a7616b11ccaa566687d05eaa2f7d5` |
| MD5 | `fe9924904e2eeb8418822086e064f789` |
| BLAKE2b-256 | `6fb26ff71d7ea6d8df121c3ec4442de2df071d5a73be088b6d4f5876d418afe1` |