
REST API for ChatterboxTTS with OpenAI compatibility


Chatterbox TTS API

A Flask-based REST API for ChatterboxTTS, providing OpenAI-compatible text-to-speech endpoints with voice cloning capabilities.

Features

🚀 OpenAI-Compatible API - Drop-in replacement for OpenAI's TTS API
🎭 Voice Cloning - Use your own voice samples for personalized speech
📝 Smart Text Processing - Automatic chunking for long texts
🐳 Docker Ready - Full containerization support
⚙️ Configurable - Extensive environment variable configuration
🎛️ Parameter Control - Real-time adjustment of speech characteristics
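The "smart text processing" feature splits long inputs into chunks before synthesis. The actual algorithm isn't documented here; as a rough sketch of the idea, assuming sentence-boundary splitting up to a length limit (the server's real logic may differ):

```python
import re

def chunk_text(text: str, max_len: int = 280) -> list[str]:
    """Split text into chunks of at most max_len characters,
    preferring sentence boundaries. A sketch only; the API's
    actual chunking logic may differ."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would overflow.
        # (A single sentence longer than max_len is kept whole here.)
        if current and len(current) + 1 + len(sentence) > max_len:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk is then synthesized separately and the audio concatenated, which keeps memory usage bounded for long texts.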

Quick Start

1. Local Installation

# Clone the repository
git clone https://github.com/travisvn/chatterbox-tts-api
cd chatterbox-tts-api

# Setup environment — using Python 3.11
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Copy and customize environment variables
cp .env.example .env

# Add your voice sample (or use the provided one)
# cp your-voice.mp3 voice-sample.mp3

# Start the API
python api.py

2. Docker (Recommended)

# Clone and start with Docker Compose
git clone https://github.com/travisvn/chatterbox-tts-api
cd chatterbox-tts-api
cp .env.example .env  # Customize as needed
docker compose up -d

# If you have an NVIDIA GPU with CUDA, the GPU compose file usually works better
docker compose -f docker-compose.gpu.yml up -d --build

# Watch the logs as it initializes (the first use of TTS takes the longest)
docker logs chatterbox-tts-api -f

# Test the API
curl -X POST http://localhost:5123/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello from ChatterboxTTS!"}' \
  --output test.wav

API Usage

Basic Text-to-Speech

curl -X POST http://localhost:5123/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"input": "Your text here"}' \
  --output speech.wav

With Custom Parameters

curl -X POST http://localhost:5123/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Dramatic speech!",
    "exaggeration": 1.2,
    "cfg_weight": 0.3,
    "temperature": 0.9
  }' \
  --output dramatic.wav

Python Example

import requests

response = requests.post(
    "http://localhost:5123/v1/audio/speech",
    json={
        "input": "Hello world!",
        "exaggeration": 0.8  # More expressive
    }
)

with open("output.wav", "wb") as f:
    f.write(response.content)
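The example above assumes success; if the server is still loading the model or rejects a parameter, `response.content` will be an error body rather than audio. A defensive variant (the error body's shape is an assumption here, so it just surfaces the raw text):

```python
def build_request(text: str, base_url: str = "http://localhost:5123", **params) -> dict:
    """Assemble the URL and JSON body for a speech request."""
    return {"url": f"{base_url}/v1/audio/speech", "json": {"input": text, **params}}

def tts(text: str, **params) -> bytes:
    """POST the request and return the audio bytes, raising a readable error on failure."""
    import requests  # imported here so build_request stays stdlib-only

    req = build_request(text, **params)
    # Generous timeout: the first generation can be slow while the model warms up.
    resp = requests.post(req["url"], json=req["json"], timeout=120)
    if resp.status_code != 200:
        # The error body's shape isn't documented here, so surface the raw text.
        raise RuntimeError(f"TTS failed ({resp.status_code}): {resp.text[:200]}")
    return resp.content
```

Usage mirrors the snippet above: `open("output.wav", "wb").write(tts("Hello world!", exaggeration=0.8))`.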

Configuration

Key environment variables (see .env.example for full list):

Variable            Default              Description
PORT                5123                 API server port
EXAGGERATION        0.5                  Emotion intensity (0.25-2.0)
CFG_WEIGHT          0.5                  Pace control (0.0-1.0)
TEMPERATURE         0.8                  Sampling randomness (0.05-5.0)
VOICE_SAMPLE_PATH   ./voice-sample.mp3   Voice sample for cloning
DEVICE              auto                 Device (auto/cuda/mps/cpu)
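In code, these settings reduce to environment lookups with the documented defaults; a minimal sketch of how a loader might read them (variable names from the table above, parsing logic assumed — the API's real loader may validate ranges or support more variables):

```python
import os

def load_settings(env=os.environ) -> dict:
    """Read the key settings from the environment, falling back
    to the documented defaults."""
    return {
        "port": int(env.get("PORT", "5123")),
        "exaggeration": float(env.get("EXAGGERATION", "0.5")),
        "cfg_weight": float(env.get("CFG_WEIGHT", "0.5")),
        "temperature": float(env.get("TEMPERATURE", "0.8")),
        "voice_sample_path": env.get("VOICE_SAMPLE_PATH", "./voice-sample.mp3"),
        "device": env.get("DEVICE", "auto"),
    }
```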

Voice Cloning

Replace the default voice sample:

# Replace the default voice sample
cp your-voice.mp3 voice-sample.mp3

# Or set a custom path
echo "VOICE_SAMPLE_PATH=/path/to/your/voice.mp3" >> .env

For best results:

  • Use 10-30 seconds of clear speech
  • Avoid background noise
  • Prefer WAV or high-quality MP3
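If you are using a WAV sample, the 10-30 second recommendation is easy to check before pointing the server at the file; a stdlib-only sketch (MP3 samples would need a third-party library such as mutagen, which this does not cover):

```python
import wave

def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()

def check_voice_sample(path: str, min_s: float = 10.0, max_s: float = 30.0) -> bool:
    """True if the sample length falls in the recommended 10-30 s window."""
    return min_s <= wav_duration_seconds(path) <= max_s
```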

Docker Deployment

Development

docker compose up

Production

# Create production environment
cp .env.example .env
nano .env  # Set FLASK_DEBUG=false, etc.

# Deploy
docker compose -f docker-compose.yml up -d

With GPU Support

# Uncomment GPU section in docker-compose.yml
# Ensure NVIDIA Container Toolkit is installed
docker compose up -d

API Endpoints

Endpoint           Method   Description
/v1/audio/speech   POST     Generate speech from text
/health            GET      Health check and status
/config            GET      Current configuration
/v1/models         GET      Available models (OpenAI compat)

Parameters Reference

Speech Generation Parameters

Exaggeration (0.25-2.0)

  • 0.3-0.4: Professional, neutral
  • 0.5: Default balanced
  • 0.7-0.8: More expressive
  • 1.0+: Very dramatic

CFG Weight (0.0-1.0)

  • 0.2-0.3: Faster speech
  • 0.5: Default pace
  • 0.7-0.8: Slower, deliberate

Temperature (0.05-5.0)

  • 0.4-0.6: More consistent
  • 0.8: Default balance
  • 1.0+: More creative/random
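The documented ranges can also be enforced client-side so out-of-range values never reach the server; a small helper sketch (the server may validate as well, so this is belt and braces):

```python
def clamp(value: float, lo: float, hi: float) -> float:
    """Constrain value to the closed interval [lo, hi]."""
    return max(lo, min(hi, value))

def speech_params(exaggeration: float = 0.5,
                  cfg_weight: float = 0.5,
                  temperature: float = 0.8) -> dict:
    """Clamp each parameter to its documented range before sending."""
    return {
        "exaggeration": clamp(exaggeration, 0.25, 2.0),
        "cfg_weight": clamp(cfg_weight, 0.0, 1.0),
        "temperature": clamp(temperature, 0.05, 5.0),
    }
```

Merge the result into the request body, e.g. `{"input": text, **speech_params(exaggeration=1.2)}`.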

Testing

Run the test suite:

python test_api.py

Performance

  • CPU: works, but is slow; reduce the chunk size to lower memory usage
  • GPU: recommended for production; significantly faster
  • Memory: 4 GB minimum, 8 GB+ recommended
  • Concurrency: requests are processed one at a time for stability

Troubleshooting

Common Issues

CUDA/CPU Compatibility Error

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False

This happens because chatterbox-tts models require PyTorch with CUDA support, even when running on CPU. Solutions:

# Option 1: Use the default setup (which should now include CUDA-enabled PyTorch)
docker compose up -d

# Option 2: Use explicit CUDA setup
docker compose -f docker-compose.gpu.yml up -d

# Option 3: Use CPU-only setup (may have compatibility issues)
docker compose -f docker-compose.cpu.yml up -d

# Option 4: Clear model cache and retry with CUDA-enabled setup
docker volume rm chatterbox-tts-api_chatterbox-models
docker compose up -d --build

For local development, install PyTorch with CUDA support:

pip uninstall torch torchvision torchaudio
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install chatterbox-tts

Port conflicts

# Change port
echo "PORT=5002" >> .env

GPU not detected

# Force CPU mode
echo "DEVICE=cpu" >> .env

Out of memory

# Reduce chunk size
echo "MAX_CHUNK_LENGTH=200" >> .env

Model download fails

# Clear cache and retry
rm -rf models/
python api.py

Development

Local Development

# Install in development mode
pip install -e .

# Enable debug mode
export FLASK_DEBUG=true
python api.py

Testing

# Run API tests
python test_api.py

# Test specific endpoint
curl http://localhost:5123/health

License

This API wrapper is provided under the same license terms as the underlying ChatterboxTTS model. See the ChatterboxTTS repository for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request


Made with ♥️ for the open source community
