
A unified interface for multiple TTS providers requiring minimal code changes for hotswapping.


PolyTTS

A unified Python interface for multiple Text-to-Speech providers with seamless hotswapping.

PolyTTS wraps various TTS providers behind a unified API:

  • Hotswapping Providers: Change providers by changing a single line
  • Audio Conversions: Automatic conversion between bytes and numpy arrays
  • Cloud & Local Support: Use cloud APIs or run models locally

All providers return the same AudioData object with consistent conversion methods, so your downstream code stays the same regardless of which TTS you're using.

Installation

# Basic installation
pip install polytts

# With cloud/API providers (quotes keep shells like zsh from expanding the brackets)
pip install "polytts[openai]"
pip install "polytts[elevenlabs]"
pip install "polytts[fishaudio]"

# With local providers
pip install "polytts[kokoro]"

# With all providers
pip install "polytts[all]"

# Note: URL dependencies (like GPT-SoVITS) must be installed separately:
pip install git+https://github.com/spava002/GPT-SoVITS-Streaming.git
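
After installing extras, it can be useful to confirm which optional dependencies are actually importable in the current environment. The following is a minimal sketch using only the standard library. The mapping from provider label to module name is hypothetical (the module each extra installs is assumed here, not taken from the PolyTTS source), so adjust it to match your environment.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical mapping: provider label -> top-level module assumed to be
# installed by the matching extra. These names are illustrative guesses.
OPTIONAL_MODULES = {
    "openai": "openai",
    "elevenlabs": "elevenlabs",
    "kokoro": "kokoro",
}


def available_providers(modules=OPTIONAL_MODULES):
    """List the provider labels whose backing module is importable."""
    return sorted(name for name, mod in modules.items() if is_installed(mod))
```

Calling `available_providers()` returns the labels whose module was found, which makes a missing extra visible before the first synthesis call.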

Development Setup

For contributors and developers:

# Install in editable mode with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

Provider Requirements

Cloud Providers

  • OpenAI: Requires OPENAI_API_KEY environment variable or api_key parameter
  • ElevenLabs: Requires ELEVENLABS_API_KEY environment variable or api_key parameter
  • Fish Audio: Requires FISHAUDIO_API_KEY environment variable or api_key parameter
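
The "environment variable or api_key parameter" pattern above can be sketched as a small helper. This is an illustration of the general idea, not PolyTTS's own code: the function name `resolve_api_key` is hypothetical, and the precedence (an explicit argument wins over the environment) is an assumption.

```python
import os
from typing import Optional


def resolve_api_key(env_var: str, api_key: Optional[str] = None) -> str:
    """Pick an API key from an explicit argument or an environment variable.

    Assumption: an explicitly passed key takes precedence over the environment.
    """
    if api_key:
        return api_key
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"Set {env_var} or pass api_key explicitly")
    return value
```

For example, `resolve_api_key("OPENAI_API_KEY")` reads the key from the environment and raises a clear error when it is missing, so a misconfigured deployment fails before any request is made.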

Local Providers

  • Kokoro:
    • Requires Python 3.9-3.12 (Python 3.13 not yet supported by kokoro)
    • Models download automatically on first use
  • GPT-SoVITS:
    • The original implementation is not published as an installable package, so a custom package is used instead
    • Must be installed manually: pip install git+https://github.com/spava002/GPT-SoVITS-Streaming.git
    • Requires a reference audio file (default samples included in package)
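
Because Kokoro is limited to Python 3.9-3.12, an application that offers it as an option may want to check the interpreter version before selecting it. A minimal sketch follows. The helper name `kokoro_supported` is hypothetical, and the bounds simply mirror the range stated above.

```python
import sys

# Version range taken from the requirement listed above.
KOKORO_MIN = (3, 9)
KOKORO_MAX = (3, 12)


def kokoro_supported(version=None) -> bool:
    """Return True if the given (or current) Python version is in range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return KOKORO_MIN <= major_minor <= KOKORO_MAX
```

A caller could use `kokoro_supported()` to fall back to a cloud provider on Python 3.13 instead of failing at import time.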

Quick Example

import soundfile as sf
from polytts import ElevenLabsTTS, KokoroTTS

# Start with a cloud provider
tts = ElevenLabsTTS(api_key="your-api-key")
audio = tts.run("Hello, world!")
sf.write("output.wav", audio.as_numpy(), audio.sample_rate)

# Switch to a local model - just change one line!
tts = KokoroTTS()
audio = tts.run("Hello, world!")
sf.write("output.wav", audio.as_numpy(), audio.sample_rate)
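
Since every provider exposes the same run() call, the provider can also be chosen by name at runtime, for example from a config file. The sketch below shows one way to do that with a small registry. It uses stub classes so that it runs without PolyTTS installed: `StubCloudTTS`, `StubLocalTTS`, `make_tts`, and `speak` are all hypothetical names, and the stubs return plain strings rather than AudioData. In real use the registry would map names to classes such as ElevenLabsTTS and KokoroTTS.

```python
from typing import Callable, Dict


class StubCloudTTS:
    """Stand-in for a cloud provider class (illustrative only)."""

    def run(self, text: str) -> str:
        return f"cloud:{text}"


class StubLocalTTS:
    """Stand-in for a local provider class (illustrative only)."""

    def run(self, text: str) -> str:
        return f"local:{text}"


# Registry of provider name -> zero-argument factory.
PROVIDERS: Dict[str, Callable[[], object]] = {
    "cloud": StubCloudTTS,
    "local": StubLocalTTS,
}


def make_tts(name: str, providers=PROVIDERS):
    """Instantiate the provider registered under the given name."""
    try:
        factory = providers[name]
    except KeyError:
        raise ValueError(
            f"Unknown provider {name!r}; choose from {sorted(providers)}"
        ) from None
    return factory()


def speak(tts, text: str):
    """Downstream code: depends only on .run(), not on the provider class."""
    return tts.run(text)
```

With this layout, swapping providers means changing the string passed to `make_tts`, and `speak` never needs to know which backend produced the result.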

Supported Providers

Provider      Type     Streaming   Voice Cloning
OpenAI        Cloud
ElevenLabs    Cloud
Fish Audio    Cloud
Kokoro        Local
GPT-SoVITS    Local

Examples

Check out the examples/ directory for complete working examples.

AudioData API

All providers return an AudioData object that makes format conversion trivial:

audio = tts.run("Hello!")

# Access metadata
print(audio.data)
print(audio.sample_rate)
print(audio.encoded_format)
print(audio.dtype)
print(audio.duration)

# Convert formats
numpy_array = audio.as_numpy("float32")
pcm_bytes = audio.as_bytes("pcm")
wav_bytes = audio.as_bytes("wav")
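
To make the difference between the "pcm" and "wav" byte formats concrete: WAV is essentially raw PCM samples preceded by a header that records the sample rate, channel count, and sample width. The sketch below shows that relationship using Python's standard wave module. It is not PolyTTS's implementation. The function names are hypothetical, and 16-bit mono PCM is assumed as the default.

```python
import io
import wave


def pcm_to_wav_bytes(
    pcm: bytes, sample_rate: int, channels: int = 1, sample_width: int = 2
) -> bytes:
    """Wrap raw PCM samples in a WAV container (assumes 16-bit mono by default)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buf.getvalue()


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Compute duration as frame count divided by frame rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

Raw PCM carries no metadata of its own, which is why the sample rate has to travel alongside it (as AudioData does with sample_rate), while WAV bytes are self-describing.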

AudioData supports conversion between:

  • Common byte formats: pcm, wav, mp3
  • Common numpy array dtypes: int16, int32, float16, float32
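
Converting between integer and floating-point sample types usually comes down to a scale factor. The sketch below shows a common convention for 16-bit audio, where int16 samples map to floats in the range [-1.0, 1.0) by dividing by 32768. It is written in plain Python for clarity, and NumPy performs the same arithmetic in vectorized form. The function names are hypothetical, and the scaling convention is an assumption that may differ from what PolyTTS does internally.

```python
import array
import sys

INT16_SCALE = 32768.0  # 2**15, the magnitude of the most negative int16


def pcm16_to_floats(pcm: bytes) -> list:
    """Decode little-endian 16-bit PCM into floats in [-1.0, 1.0)."""
    samples = array.array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # PCM is assumed little-endian
    return [s / INT16_SCALE for s in samples]


def floats_to_pcm16(values) -> bytes:
    """Encode floats as little-endian 16-bit PCM, clamping out-of-range values."""
    ints = (
        max(-32768, min(32767, round(v * INT16_SCALE)))
        for v in values
    )
    samples = array.array("h", ints)
    if sys.byteorder == "big":
        samples.byteswap()
    return samples.tobytes()
```

The clamp matters on the way back to integers: a float of exactly 1.0 would scale to 32768, which does not fit in int16, so it is limited to 32767.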

Contributing

Contributions are welcome, whether it's:

  • Bug fixes
  • Documentation improvements
  • New TTS providers

To add a new provider, check out the existing implementations in polytts/. Feel free to open an issue or submit a pull request.

License

MIT License - see LICENSE for details.
