
A TTS API wrapper server compatible with the OpenAI API


Read this in other languages: English, 中文.


OddTTS - Multi-Engine TTS Voice Synthesis API Wrapper (with OpenAI TTS API compatibility)

OddTTS is a multi-engine text-to-speech service. It exposes one unified API and a web interface for several mainstream TTS engines (EdgeTTS, ChatTTS, Bert-VITS2, GptSovits, and others), and it is also compatible with the OpenAI TTS API.

I. Preface

1. About OddTTS

I needed TTS functionality for my project XiaoLuo Tongxue (Little Luo Classmate). Because of hardware constraints (an Alibaba Cloud ECS server costing 99 yuan/year), I could initially use only EdgeTTS. My personal computer has better specifications, so I tried several other TTS engines on it. I then needed a unified wrapper around these engines so that XiaoLuo Tongxue could switch between them at any time, and that is how OddTTS was born.

Because TTS is useful in many applications, I split it out into an independent project and open-sourced it. I hope it helps anyone who needs TTS.

Note: If you want to use TTS engines other than EdgeTTS, you need to install the corresponding TTS engines yourself before installing and using OddTTS.

2. Why Choose OddTTS?

  • Multi-engine support: Integrates EdgeTTS, ChatTTS, Bert-VITS2, OddGptSovits, and other TTS engines
  • Multiple calling methods: Supports file path return, Base64 encoding return, streaming response, and other output methods
  • User-friendly web interface: Provides a visual operation interface based on Gradio
  • RESTful API: Offers a complete REST API for easy integration into other systems
  • Strong configurability: Supports GPU acceleration, concurrent thread adjustment, model preloading, and other configuration options
  • Cross-platform compatibility: Developed based on Python, supporting Windows, Linux, macOS, and other operating systems

3. Recommended Hardware

| Model Name | Original Minimum VRAM | Original Smooth VRAM | Original Full VRAM | INT8 Quantized Minimum VRAM | INT4 Quantized Minimum VRAM | Can Run on Pure CPU | CPU Running Speed |
|---|---|---|---|---|---|---|---|
| EdgeTTS | 0GB | 0GB | 0GB | 0GB | 0GB | ✅ Yes | Depends on your network speed |
| ChatTTS | 2.5GB | 4GB | 6GB+ | 1.5GB | 1GB | ✅ Yes | Fast |
| Bert-VITS2 | 5GB | 6GB | 8GB+ | 3GB | 2GB | ✅ Yes | Moderate |
| GPT-SoVITS v2 | 8GB | 10GB | 12GB+ | 4GB | 2.5GB | ❌ Not recommended | Slow |

XiaoLuo Tongxue uses an Alibaba Cloud ECS server costing 99 yuan/year with only 2 cores and 2GB of memory, which can't run any TTS models, so it uses EdgeTTS.
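
The table can also be read programmatically. The following is a minimal sketch, not part of OddTTS: the dictionary layout and the function name `engines_that_fit` are made up for illustration, and the numbers are copied from the minimum-VRAM columns above.

```python
# Minimum VRAM in GB, copied from the table above.
# The dict layout and function name are illustrative, not part of OddTTS.
MIN_VRAM_GB = {
    "EdgeTTS": {"original": 0.0, "int8": 0.0, "int4": 0.0},
    "ChatTTS": {"original": 2.5, "int8": 1.5, "int4": 1.0},
    "Bert-VITS2": {"original": 5.0, "int8": 3.0, "int4": 2.0},
    "GPT-SoVITS v2": {"original": 8.0, "int8": 4.0, "int4": 2.5},
}


def engines_that_fit(vram_gb, precision="original"):
    """Return the engines whose minimum VRAM requirement fits in vram_gb."""
    return [
        name
        for name, required in MIN_VRAM_GB.items()
        if required[precision] <= vram_gb
    ]


print(engines_that_fit(4))          # ['EdgeTTS', 'ChatTTS']
print(engines_that_fit(4, "int8"))  # all four engines
```

Note that this only checks the minimum figure; the "smooth" and "full" columns are higher, so an engine that fits may still run slowly.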

II. Quick Start

1. Install OddTTS

pip install -i https://pypi.org/simple/ oddtts

2. Start OddTTS

1. Default Configuration

Simply execute the following command in the installed virtual environment to start:

oddtts

After starting, OddTTS will bind to 127.0.0.1 (local access only) on port 9001 by default. Access it through your browser at: http://localhost:9001

2. Custom Configuration

To allow access from other machines, start the service with host set to 0.0.0.0. You can also choose a different port:

oddtts --host 0.0.0.0 --port 8080

III. OddTTS API Documentation

1. API Interface List

1) OpenAI TTS API Compatibility

POST /v1/audio/speech
  • Function: Compatible with the OpenAI TTS API; see the OpenAI TTS API documentation for the request format.
  • Return: mp3 audio data.

2) Get Voice List

GET /v1/audio/voice/list
  • Function: Get all voices supported by the current TTS engine
  • Return: Voice list, each voice contains name, language, gender, etc.

3) Get Specific Voice Details

GET /v1/audio/voice/list/{voice_name}
  • Function: Get detailed information about a specific voice
  • Parameter: voice_name - Voice name
  • Return: Detailed voice information
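
As a rough sketch of calling this endpoint, the helper below builds the detail URL and fetches it with the standard library. The function names are hypothetical, and percent-encoding the voice name is a precaution on my part rather than a documented requirement. The shape of the returned JSON is not specified here, so the sketch returns it unmodified.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9001"  # default address from Quick Start


def voice_detail_url(voice_name, base_url=BASE_URL):
    """Build the detail URL, percent-encoding the voice name for use in a path."""
    encoded = urllib.parse.quote(voice_name, safe="")
    return f"{base_url}/v1/audio/voice/list/{encoded}"


def get_voice_details(voice_name, base_url=BASE_URL):
    """Fetch the details of one voice; needs a running OddTTS server."""
    url = voice_detail_url(voice_name, base_url)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


print(voice_detail_url("zf_xiaobei"))
```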

4) Generate TTS Audio (Return File Path)

POST /api/oddtts/file
  • Function: Generate TTS audio and return the file path
  • Request Body:
    {
      "text": "Text to be converted to speech",
      "voice": "Voice name",
      "rate": Speed adjustment (-50 to 50),
      "volume": Volume adjustment (-50 to 50),
      "pitch": Pitch adjustment (-50 to 50)
    }
    
  • Return: {"status": "success", "file_path": "Audio file path", "format": "mp3"}
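
Since rate, volume, and pitch are all limited to the range -50 to 50, it can help to validate them before sending a request. This is a small client-side sketch; `build_tts_payload` is a hypothetical helper, and how the server itself reacts to out-of-range values is not documented here.

```python
def build_tts_payload(text, voice, rate=0, volume=0, pitch=0):
    """Build the request body, rejecting adjustments outside -50..50."""
    for name, value in (("rate", rate), ("volume", volume), ("pitch", pitch)):
        if not -50 <= value <= 50:
            raise ValueError(f"{name} must be between -50 and 50, got {value}")
    return {
        "text": text,
        "voice": voice,
        "rate": rate,
        "volume": volume,
        "pitch": pitch,
    }


print(build_tts_payload("Hello", "zf_xiaobei", rate=10))
```

The resulting dictionary can be passed as the JSON body to any of the three /api/oddtts endpoints, which share the same request format.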

5) Generate TTS Audio (Return Base64)

POST /api/oddtts/base64
  • Function: Generate TTS audio and return Base64 encoding
  • Request Body: Same as the file path API
  • Return: {"status": "success", "base64": "Base64 encoded audio data", "format": "mp3"}
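
To turn the Base64 response back into a playable file, decode the `base64` field and write the bytes to disk. The sketch below assumes the response shape shown above; `save_base64_audio` and the output file name are illustrative, and the demo uses a fabricated response rather than real audio.

```python
import base64
import tempfile
from pathlib import Path


def save_base64_audio(result, out_dir="."):
    """Decode the 'base64' field of a response dict and write it to a file."""
    if result.get("status") != "success":
        raise RuntimeError(f"TTS request failed: {result}")
    audio = base64.b64decode(result["base64"])
    out_path = Path(out_dir) / f"output.{result.get('format', 'mp3')}"
    out_path.write_bytes(audio)
    return out_path


# Offline demo with a fabricated response (placeholder bytes, not real audio).
fake_result = {
    "status": "success",
    "base64": base64.b64encode(b"placeholder bytes").decode("ascii"),
    "format": "mp3",
}
with tempfile.TemporaryDirectory() as tmp_dir:
    saved = save_base64_audio(fake_result, tmp_dir)
    print(saved.name, saved.stat().st_size)
```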

6) Generate TTS Audio (Streaming Response)

POST /api/oddtts/stream
  • Function: Generate TTS audio and return it as a streaming response
  • Request Body: Same as the file path API
  • Return: Streaming audio data (audio/mpeg format)
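
A streaming response is best consumed in chunks so the whole audio never has to sit in memory. The following sketch uses only the standard library; the helper names and the chunk size are my own choices, not something OddTTS prescribes.

```python
import io
import json
import urllib.request


def copy_in_chunks(src, dst, chunk_size=8192):
    """Copy a file-like object to another in fixed-size chunks; return bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def stream_tts_to_file(payload, out_path, base_url="http://localhost:9001"):
    """POST the payload to the streaming endpoint and save the audio as it arrives.

    Needs a running OddTTS server.
    """
    request = urllib.request.Request(
        f"{base_url}/api/oddtts/stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as resp, open(out_path, "wb") as out:
        return copy_in_chunks(resp, out)


# Offline check of the chunked copy using in-memory buffers.
buffer = io.BytesIO()
print(copy_in_chunks(io.BytesIO(b"x" * 20000), buffer))
```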

7) Health Check

GET /oddtts/health
  • Function: Check if the service is running normally
  • Return: {"status": "healthy", "message": "API service is running normally"}
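
The health endpoint is handy for checking that the server is up before sending synthesis requests. Below is a minimal sketch, assuming the response shape shown above; the function names and the choice to treat any connection error as "not healthy" are mine.

```python
import json
import urllib.error
import urllib.request


def is_healthy(body):
    """Return True if a parsed health response reports status 'healthy'."""
    return isinstance(body, dict) and body.get("status") == "healthy"


def check_health(base_url="http://localhost:9001", timeout=3):
    """Query /oddtts/health; return False if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/oddtts/health", timeout=timeout) as resp:
            return is_healthy(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return False


print(is_healthy({"status": "healthy", "message": "API service is running normally"}))
```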

2. API Call Example

Here's an example of calling the OddTTS API:

1) OpenAI TTS API Compatibility

from openai import OpenAI

base_url = "http://localhost:9001/v1"
model = "oddtts-1"
api_key = "dummy"
voice = "zf_xiaobei"

text = "欢迎关注我的公众号: 奥德元。一起学习AI,一起追赶时代!Good good study, day day up!"

def test_openai_tts_api(voice_id):
    client = OpenAI(
        api_key=api_key,
        base_url=base_url
    )

    response = client.audio.speech.create(
        model=model,
        input=text,
        voice=voice_id,
        response_format="mp3"
    )
    response.write_to_file("output.mp3")

if __name__ == "__main__":
    test_openai_tts_api(voice)

2) OddTTS REST API

import requests

# Configure API base URL
API_BASE_URL = "http://localhost:9001"

# Test text
TEST_TEXT = "Hello! This is an API test. 这是一个API测试。"

# Get voice list
def test_api_voices():
    response = requests.get(f"{API_BASE_URL}/v1/audio/voice/list")
    response.raise_for_status()  # fail early on HTTP errors
    voices = response.json()
    print(f"Successfully obtained {len(voices)} voice options")
    return voices

# Test generating TTS audio
def test_api_tts_file(voice_name):
    payload = {
        "text": TEST_TEXT,
        "voice": voice_name,
        "rate": 0,
        "volume": 0,
        "pitch": 0
    }
    response = requests.post(f"{API_BASE_URL}/api/oddtts/file", json=payload)
    response.raise_for_status()  # fail early on HTTP errors
    result = response.json()
    print(f"Audio file path: {result.get('file_path')}")

if __name__ == "__main__":
    test_api_voices()
    # "zf_xiaobei" is the voice name used in the OpenAI-compatible example above
    test_api_tts_file("zf_xiaobei")

IV. Web Interface Usage

After starting the service, you can access http://localhost:9001/ through your browser to open the Gradio Web interface, which supports the following functions:

  • Text input area: Enter text to be converted to speech
  • Voice selection: Choose different voices and languages
  • Parameter adjustment: Adjust speed, volume, pitch, and other parameters
  • Audio generation: Click the button to generate and play speech
  • Audio download: Download the generated speech file

V. Common Issues

  1. Service startup failure

    • Check if the port is occupied
    • Confirm all dependency packages are correctly installed
    • View the log file for detailed error information
  2. Speech synthesis failure

    • Check if the TTS engine configuration is correct
    • Confirm that the selected voice exists in the current TTS engine
    • For engines that require internet access, confirm that the network connection is normal
  3. How to switch TTS engines

    • Modify the tts_type configuration item in the oddtts_config.py file
    • Restart the service for the configuration to take effect
  4. Output format

    • Default output format: mp3
    • To get a different format, such as wav, set the response_format parameter
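
As an illustration of setting response_format, the sketch below builds a request against the OpenAI-compatible endpoint using only the standard library. The JSON fields (`model`, `input`, `voice`, `response_format`) follow the OpenAI speech API, which is sent as a POST; the model name, voice, and dummy API key are taken from the call example earlier in this document. The helper names are hypothetical, and which formats a given engine can actually produce is something to verify against your own setup.

```python
import json
import urllib.request


def build_speech_request(text, voice, response_format="mp3",
                         base_url="http://localhost:9001/v1", model="oddtts-1"):
    """Build an OpenAI-style speech request with the chosen output format."""
    body = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        f"{base_url}/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": "Bearer dummy"},
        method="POST",
    )


def synthesize(text, voice, response_format="wav"):
    """Send the request and save the audio; needs a running OddTTS server."""
    request = build_speech_request(text, voice, response_format)
    with urllib.request.urlopen(request, timeout=60) as resp:
        audio = resp.read()
    out_path = f"output.{response_format}"
    with open(out_path, "wb") as out:
        out.write(audio)
    return out_path


request = build_speech_request("Hello", "zf_xiaobei", "wav")
print(request.get_method(), request.full_url)
```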

VI. License

The OddTTS project has no license. Feel free to copy without any conditions! Just code happily! Contributions and improvement suggestions are also welcome!
