Pipecat Deepdub TTS

Official Deepdub AI Text-to-Speech integration for Pipecat -- a framework for building voice and multimodal conversational AI applications.

Note: This integration is maintained by Deepdub AI. As the official provider of the TTS service, we are committed to actively maintaining and updating this integration.

Pipecat Compatibility

This integration has been tested with Pipecat version 0.0.97. For compatibility with other versions, refer to the Pipecat changelog.

Features

  • Real-time Streaming: WebSocket-based streaming for low-latency audio generation
  • Multiple Models: Support for Deepdub TTS models (dd-etts-2.5, dd-etts-3.0)
  • Voice Customization: Configurable temperature, variance, tempo, and prompt boost
  • Accent Control: Blend accents between locales with fine-grained ratio control
  • Flexible Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
  • Interruption Handling: Clean disconnect/reconnect on user interruption
  • Metrics Support: Built-in performance tracking and monitoring

Installation

Using pip

pip install pipecat-deepdub-tts

Using uv

uv add pipecat-deepdub-tts

From source

git clone https://github.com/deepdub-ai/pipecat-deepdub-tts.git
cd pipecat-deepdub-tts
pip install -e .

Quick Start

1. Get Your Deepdub API Key

Sign up at Deepdub AI and obtain your API key.

2. Basic Usage

from pipecat_deepdub_tts import DeepdubTTSService

tts = DeepdubTTSService(
    api_key="your-deepdub-api-key",
    voice_id="your-voice-prompt-id",
    model="dd-etts-2.5",
)

3. Complete Example with Pipeline

import asyncio
import os
from dotenv import load_dotenv
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask, PipelineParams
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat_deepdub_tts import DeepdubTTSService

load_dotenv()

async def main():
    tts = DeepdubTTSService(
        api_key=os.getenv("DEEPDUB_API_KEY"),
        voice_id=os.getenv("DEEPDUB_VOICE_ID"),
        model=os.getenv("DEEPDUB_MODEL", "dd-etts-2.5"),
    )

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
    ]
    context = OpenAILLMContext(messages)
    context_aggregator = llm.create_context_aggregator(context)

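    # Minimal LLM -> TTS chain. A real application would also include a
    # transport (audio in/out), an STT service, and context_aggregator.user();
    # they are omitted here to keep the focus on the TTS service.
    pipeline = Pipeline([
        llm,                             # generates response text
        tts,                             # streams Deepdub audio for that text
        context_aggregator.assistant(),  # records the assistant turn in context
    ])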

    task = PipelineTask(pipeline, params=PipelineParams(enable_metrics=True))
    runner = PipelineRunner()
    await runner.run(task)

if __name__ == "__main__":
    asyncio.run(main())

Configuration

DeepdubTTSService Constructor

Parameter    Type         Default        Description
api_key      str          required       Deepdub API key for authentication
voice_id     str          required       Voice prompt ID for TTS synthesis
model        str          "dd-etts-2.5"  TTS model name
sample_rate  int          16000          Audio sample rate in Hz
params       InputParams  None           Optional voice customization parameters
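
The sample rates listed under Features can be checked before constructing the service. The following is a minimal sketch: check_sample_rate is a hypothetical helper, not part of this package, and whether the service performs its own validation is not stated here.

```python
# Rates listed under Features in this README.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 22050, 24000, 44100, 48000)


def check_sample_rate(rate: int) -> int:
    """Return the rate unchanged if it is listed as supported, else raise ValueError.

    Hypothetical helper for illustration; not part of pipecat-deepdub-tts.
    """
    if rate not in SUPPORTED_SAMPLE_RATES:
        allowed = ", ".join(str(r) for r in SUPPORTED_SAMPLE_RATES)
        raise ValueError(f"Unsupported sample rate {rate}; expected one of: {allowed}")
    return rate
```

The result can be passed straight through, for example sample_rate=check_sample_rate(24000).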

InputParams

Parameter           Type   Default  Description
locale              str    "en-US"  Language locale for synthesis
temperature         float  None     Controls output variability
variance            float  None     Controls variance in generated speech
tempo               float  None     Speech tempo multiplier
prompt_boost        bool   None     Enable prompt boosting for improved quality
accent_base_locale  str    None     Base locale for accent control
accent_locale       str    None     Target accent locale
accent_ratio        float  None     Accent blending ratio (0.0 to 1.0)

Example with Custom Configuration

tts = DeepdubTTSService(
    api_key="your-api-key",
    voice_id="your-voice-prompt-id",
    model="dd-etts-3.0",
    sample_rate=24000,
    params=DeepdubTTSService.InputParams(
        locale="en-US",
        temperature=0.7,
        tempo=1.1,
        prompt_boost=True,
        accent_base_locale="en-US",
        accent_locale="fr-FR",
        accent_ratio=0.3,
    ),
)
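
Because accent_ratio is documented as a value between 0.0 and 1.0, it can be useful to validate accent settings before building InputParams. The sketch below is illustrative only: accent_kwargs is a hypothetical helper, and the rule that all three accent fields must be set together is an assumption, not documented behavior.

```python
from typing import Optional


def accent_kwargs(
    base_locale: Optional[str],
    accent_locale: Optional[str],
    ratio: Optional[float],
) -> dict:
    """Build the accent keyword arguments, or an empty dict if none are set.

    Hypothetical helper. Requiring all three fields together is an
    assumption made for this sketch.
    """
    fields = (base_locale, accent_locale, ratio)
    if all(f is None for f in fields):
        return {}
    if any(f is None for f in fields):
        raise ValueError(
            "accent_base_locale, accent_locale and accent_ratio should be set together"
        )
    # The 0.0 to 1.0 range comes from the InputParams table.
    if not 0.0 <= ratio <= 1.0:
        raise ValueError(f"accent_ratio must be between 0.0 and 1.0, got {ratio}")
    return {
        "accent_base_locale": base_locale,
        "accent_locale": accent_locale,
        "accent_ratio": ratio,
    }
```

The returned dict can be unpacked into the constructor, for example DeepdubTTSService.InputParams(locale="en-US", **accent_kwargs("en-US", "fr-FR", 0.3)).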

Environment Variables

Create a .env file in your project root:

# Deepdub TTS
DEEPDUB_API_KEY=your_deepdub_api_key_here
DEEPDUB_VOICE_ID=your_voice_prompt_id_here
DEEPDUB_MODEL=dd-etts-2.5

# OpenAI (for LLM in examples)
OPENAI_API_KEY=your_openai_api_key_here

# AssemblyAI (for STT in examples)
ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
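
A missing variable otherwise only shows up when the service first tries to authenticate, so it can help to fail early. The sketch below reads the Deepdub variables above and reports any that are absent; load_deepdub_settings is a hypothetical helper, not part of this package.

```python
import os

# Variables the examples in this README read for the TTS service.
REQUIRED_VARS = ("DEEPDUB_API_KEY", "DEEPDUB_VOICE_ID")


def load_deepdub_settings(env=None) -> dict:
    """Collect constructor arguments from the environment, failing early if any are missing.

    Hypothetical helper for illustration. `env` defaults to os.environ and
    can be replaced by a plain dict in tests.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["DEEPDUB_API_KEY"],
        "voice_id": env["DEEPDUB_VOICE_ID"],
        # Same fallback model as the pipeline example.
        "model": env.get("DEEPDUB_MODEL", "dd-etts-2.5"),
    }
```

After load_dotenv() has run, the result can be unpacked as DeepdubTTSService(**load_deepdub_settings()).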

Examples

See the examples directory for complete working examples.

To run the basic example:

# Install dependencies
pip install pipecat-deepdub-tts "pipecat-ai[assemblyai,openai,silero]" python-dotenv

# Set up your .env file with API keys (see .env.example)
# Then run
python examples/foundational/deepdub_tts_basic.py

Testing

Unit tests (no API key needed)

pytest tests/test_tts.py -k "not Integration"

Integration tests (saves audio to disk)

Requires DEEPDUB_API_KEY and DEEPDUB_VOICE_ID environment variables:

# Make sure the variables above are set (for example via .env), then run the integration tests
pytest tests/test_tts.py -k "Integration" -s

Generated audio files are saved to tests/output/ for manual listening verification.
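
If you capture raw audio from the service yourself, it needs a container before most players will open it. The sketch below wraps raw s16le PCM (the format named in the Architecture section) in a WAV file using the standard library. write_wav is a hypothetical helper, mono output is an assumption, and the format the integration tests write to tests/output/ is not specified here.

```python
import wave


def write_wav(path: str, pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> float:
    """Wrap raw s16le PCM bytes in a WAV container and return the duration in seconds.

    Hypothetical helper; mono (channels=1) is an assumption of this sketch.
    """
    sample_width = 2  # s16le: 16-bit signed little-endian, 2 bytes per sample
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    # Duration follows from byte count / (rate * bytes per sample * channels).
    return len(pcm) / (sample_rate * sample_width * channels)
```

The sample_rate passed here should match the one given to DeepdubTTSService, otherwise playback will be too fast or too slow.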

Requirements

  • Python >= 3.10, < 3.13
  • deepdub >= 0.1.20
  • pipecat-ai >= 0.0.97, < 0.1.0
  • websockets >= 15.0.1, < 16.0
  • loguru >= 0.7.3

Architecture

DeepdubTTSService extends Pipecat's InterruptibleTTSService base class for WebSocket-based TTS without word-level timestamp alignment. It uses the DeepdubClient from the deepdub package for WebSocket protocol communication.

The service:

  1. Opens a WebSocket to Deepdub's streaming endpoint on pipeline start
  2. Sends text chunks as they arrive from the LLM via async_stream_text()
  3. Receives raw PCM audio (s16le format) via a background task
  4. Pushes TTSAudioRawFrame frames into the pipeline
  5. Handles interruptions by disconnecting and reconnecting
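
The steps above can be sketched with plain asyncio. This is a toy model of the flow, not the actual DeepdubTTSService code: FakeConnection is a hypothetical in-memory stand-in for the WebSocket, and appending to a list stands in for pushing TTSAudioRawFrame frames.

```python
import asyncio


class FakeConnection:
    """Hypothetical in-memory stand-in for a streaming TTS WebSocket."""

    def __init__(self) -> None:
        self.connected = False
        self._audio: asyncio.Queue = asyncio.Queue()

    async def connect(self) -> None:
        self.connected = True

    async def disconnect(self) -> None:
        self.connected = False
        # Discard audio still queued for the interrupted utterance.
        while not self._audio.empty():
            self._audio.get_nowait()

    async def send_text(self, text: str) -> None:
        # Fake synthesis: one silent 16-bit sample per character.
        await self._audio.put(b"\x00\x00" * len(text))

    async def receive_audio(self) -> bytes:
        return await self._audio.get()


async def receive_loop(conn: FakeConnection, frames: list) -> None:
    """Background task (steps 3 and 4): collect each audio chunk as it arrives."""
    while True:
        frames.append(await conn.receive_audio())


async def interrupt(conn: FakeConnection, receiver: asyncio.Task) -> None:
    """Step 5: stop the receiver, then disconnect and reconnect."""
    receiver.cancel()
    try:
        await receiver
    except asyncio.CancelledError:
        pass
    await conn.disconnect()
    await conn.connect()


async def demo() -> list:
    conn = FakeConnection()
    frames: list = []
    await conn.connect()                                        # step 1
    receiver = asyncio.create_task(receive_loop(conn, frames))  # step 3
    for chunk in ("Hello, ", "world."):                         # step 2
        await conn.send_text(chunk)
    await asyncio.sleep(0.01)  # give the receiver time to drain the queue
    await interrupt(conn, receiver)                             # step 5
    assert conn.connected  # ready for the next utterance
    return frames
```

Running asyncio.run(demo()) returns the two fake audio chunks. Dropping the connection on interruption is a simple way to make sure no audio from the interrupted utterance arrives afterwards, at the cost of a reconnect.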

License

This project is licensed under the MIT License -- see the LICENSE file for details.
