Official Deepdub AI TTS integration for Pipecat
Pipecat Deepdub TTS
Official Deepdub AI Text-to-Speech integration for Pipecat -- a framework for building voice and multimodal conversational AI applications.
Note: This integration is maintained by Deepdub AI. As the official provider of the TTS service, we are committed to actively maintaining and updating this integration.
Pipecat Compatibility
This integration has been tested with Pipecat version 0.0.97. For compatibility with other versions, refer to the Pipecat changelog.
Features
- Real-time Streaming: WebSocket-based streaming for low-latency audio generation
- Multiple Models: Support for Deepdub TTS models (dd-etts-2.5, dd-etts-3.0)
- Voice Customization: Configurable temperature, variance, tempo, and prompt boost
- Accent Control: Blend accents between locales with fine-grained ratio control
- Flexible Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
- Interruption Handling: Clean disconnect/reconnect on user interruption
- Metrics Support: Built-in performance tracking and monitoring
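The sample rates listed above can be checked before the service is constructed, so that a typo fails early. The following is a minimal sketch; `SUPPORTED_SAMPLE_RATES` and `check_sample_rate` are hypothetical names used for illustration and are not part of the pipecat-deepdub-tts package.

```python
# Hypothetical helper, not part of pipecat-deepdub-tts.
# The rate list is copied from the Features section of this README.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 22050, 24000, 44100, 48000)


def check_sample_rate(rate: int) -> int:
    """Return `rate` unchanged if it is one of the listed rates, else raise."""
    if rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(
            f"Unsupported sample rate {rate}; expected one of {SUPPORTED_SAMPLE_RATES}"
        )
    return rate
```

The returned value can be passed straight to the `sample_rate` argument of the constructor.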
Installation
Using pip
pip install pipecat-deepdub-tts
Using uv
uv add pipecat-deepdub-tts
From source
git clone https://github.com/deepdub-ai/pipecat-deepdub-tts.git
cd pipecat-deepdub-tts
pip install -e .
Quick Start
1. Get Your Deepdub API Key
Sign up at Deepdub AI and obtain your API key.
2. Basic Usage
from pipecat_deepdub_tts import DeepdubTTSService

tts = DeepdubTTSService(
    api_key="your-deepdub-api-key",
    voice_id="your-voice-prompt-id",
    model="dd-etts-2.5",
)
3. Complete Example with Pipeline
import asyncio
import os

from dotenv import load_dotenv
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask, PipelineParams
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat_deepdub_tts import DeepdubTTSService

# Read API keys and settings from a local .env file
load_dotenv()


async def main():
    # Deepdub TTS service configured from environment variables
    tts = DeepdubTTSService(
        api_key=os.getenv("DEEPDUB_API_KEY"),
        voice_id=os.getenv("DEEPDUB_VOICE_ID"),
        model=os.getenv("DEEPDUB_MODEL", "dd-etts-2.5"),
    )

    # LLM that produces the text to be synthesized
    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

    # Conversation context, seeded with a system prompt
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
    ]
    context = OpenAILLMContext(messages)
    context_aggregator = llm.create_context_aggregator(context)

    # LLM output flows into TTS, then into the assistant context aggregator
    pipeline = Pipeline([
        llm,
        tts,
        context_aggregator.assistant(),
    ])

    task = PipelineTask(pipeline, params=PipelineParams(enable_metrics=True))
    runner = PipelineRunner()
    await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())
Configuration
DeepdubTTSService Constructor
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | required | Deepdub API key for authentication |
| voice_id | str | required | Voice prompt ID for TTS synthesis |
| model | str | "dd-etts-2.5" | TTS model name |
| sample_rate | int | 16000 | Audio sample rate in Hz |
| params | InputParams | None | Optional voice customization parameters |
InputParams
| Parameter | Type | Default | Description |
|---|---|---|---|
| locale | str | "en-US" | Language locale for synthesis |
| temperature | float | None | Controls output variability |
| variance | float | None | Controls variance in generated speech |
| tempo | float | None | Speech tempo multiplier |
| prompt_boost | bool | None | Enable prompt boosting for improved quality |
| accent_base_locale | str | None | Base locale for accent control |
| accent_locale | str | None | Target accent locale |
| accent_ratio | float | None | Accent blending ratio (0.0 to 1.0) |
Example with Custom Configuration
tts = DeepdubTTSService(
    api_key="your-api-key",
    voice_id="your-voice-prompt-id",
    model="dd-etts-3.0",
    sample_rate=24000,
    params=DeepdubTTSService.InputParams(
        locale="en-US",
        temperature=0.7,
        tempo=1.1,
        prompt_boost=True,
        accent_base_locale="en-US",
        accent_locale="fr-FR",
        accent_ratio=0.3,
    ),
)
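Since accent_ratio is documented as a value between 0.0 and 1.0, it can be useful to validate it before building the parameters. The sketch below is one way to do that; `accent_kwargs` is a hypothetical helper, not part of the package, and it only enforces the documented range.

```python
from typing import Any, Dict


def accent_kwargs(base_locale: str, accent_locale: str, ratio: float) -> Dict[str, Any]:
    """Hypothetical helper: build the three accent fields after a range check.

    The 0.0 to 1.0 range comes from the InputParams table; nothing else
    about how the service interprets the ratio is assumed here.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError(f"accent_ratio must be between 0.0 and 1.0, got {ratio}")
    return {
        "accent_base_locale": base_locale,
        "accent_locale": accent_locale,
        "accent_ratio": ratio,
    }
```

The result can be unpacked into the parameters object, for example `DeepdubTTSService.InputParams(locale="en-US", **accent_kwargs("en-US", "fr-FR", 0.3))`.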
Environment Variables
Create a .env file in your project root:
# Deepdub TTS
DEEPDUB_API_KEY=your_deepdub_api_key_here
DEEPDUB_VOICE_ID=your_voice_prompt_id_here
DEEPDUB_MODEL=dd-etts-2.5
# OpenAI (for LLM in examples)
OPENAI_API_KEY=your_openai_api_key_here
# AssemblyAI (for STT in examples)
ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
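Because `os.getenv` silently returns None for an unset variable, a missing key may only surface later as an authentication error. The sketch below fails fast instead; `load_deepdub_settings` and `REQUIRED_VARS` are hypothetical names, and the default model mirrors the one used in the examples above.

```python
import os
from typing import Dict, Mapping, Optional

# Variables this README marks as required for the TTS service.
REQUIRED_VARS = ("DEEPDUB_API_KEY", "DEEPDUB_VOICE_ID")


def load_deepdub_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Hypothetical helper: collect constructor arguments from the environment."""
    env = os.environ if env is None else env
    # Treat empty strings the same as unset variables.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {
        "api_key": env["DEEPDUB_API_KEY"],
        "voice_id": env["DEEPDUB_VOICE_ID"],
        "model": env.get("DEEPDUB_MODEL") or "dd-etts-2.5",
    }
```

After `load_dotenv()` has run, the returned dictionary matches the constructor's keyword arguments, so it can be used as `DeepdubTTSService(**load_deepdub_settings())`.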
Examples
See the examples directory for complete working examples:
- deepdub_tts_basic.py -- Full pipeline with AssemblyAI STT, OpenAI LLM, and Deepdub TTS
To run the example:
# Install dependencies
pip install pipecat-deepdub-tts "pipecat-ai[assemblyai,openai,silero]" python-dotenv
# Set up your .env file with API keys (see .env.example)
# Then run
python examples/foundational/deepdub_tts_basic.py
Testing
Unit tests (no API key needed)
pytest tests/test_tts.py -k "not Integration"
Integration tests (saves audio to disk)
Requires DEEPDUB_API_KEY and DEEPDUB_VOICE_ID environment variables:
# Load .env and run integration tests
pytest tests/test_tts.py -k "Integration" -s
Generated audio files are saved to tests/output/ for manual listening verification.
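If you capture raw PCM output yourself and want to listen to it, wrapping it in a WAV container is usually enough. The sketch below uses only the standard library; `write_pcm_as_wav` is a hypothetical helper, it is not necessarily how the test suite writes its files, and it assumes mono 16-bit audio at the service's default 16000 Hz unless told otherwise.

```python
import wave


def write_pcm_as_wav(pcm: bytes, path: str, sample_rate: int = 16000, channels: int = 1) -> None:
    """Hypothetical helper: wrap raw signed 16-bit little-endian PCM in a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)   # assumption: mono unless specified
        wav_file.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
```

The sample rate passed here should match the `sample_rate` given to the service, otherwise playback will sound too fast or too slow.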
Requirements
- Python >= 3.10, < 3.13
- deepdub >= 0.1.20
- pipecat-ai >= 0.0.97, < 0.1.0
- websockets >= 15.0.1, < 16.0
- loguru >= 0.7.3
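The Python constraint above excludes both older interpreters and 3.13 or newer. A minimal sketch of a startup guard follows; `python_is_supported` is a hypothetical helper, and the bounds are copied from the list above.

```python
import sys
from typing import Tuple


def python_is_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Hypothetical helper: True when (major, minor) satisfies >= 3.10, < 3.13."""
    return (3, 10) <= version < (3, 13)


if __name__ == "__main__":
    if not python_is_supported():
        print("Warning: this interpreter is outside the range Python >= 3.10, < 3.13")
```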
Architecture
DeepdubTTSService extends Pipecat's InterruptibleTTSService base class for WebSocket-based TTS without word-level timestamp alignment. It uses the DeepdubClient from the deepdub package for WebSocket protocol communication.
The service:
- Opens a WebSocket to Deepdub's streaming endpoint on pipeline start
- Sends text chunks as they arrive from the LLM via async_stream_text()
- Receives raw PCM audio (s16le format) via a background task
- Pushes TTSAudioRawFrame frames into the pipeline
- Handles interruptions by disconnecting and reconnecting
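To make the s16le format concrete: each sample is a signed 16-bit little-endian integer, so every sample takes two bytes per channel. The sketch below decodes such a buffer and computes its playback duration. Both helpers are hypothetical illustrations and not part of the service, and mono audio is an assumption.

```python
import array
import sys

BYTES_PER_SAMPLE = 2  # s16le: signed 16-bit samples


def pcm_duration_seconds(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> float:
    """Hypothetical helper: playback length of a raw s16le buffer in seconds."""
    frame_size = BYTES_PER_SAMPLE * channels
    if len(pcm) % frame_size:
        raise ValueError("PCM buffer does not contain a whole number of frames")
    return len(pcm) / frame_size / sample_rate


def decode_s16le(pcm: bytes) -> list[int]:
    """Hypothetical helper: decode s16le bytes into a list of integer samples."""
    samples = array.array("h")  # "h" is a signed short
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # the data is little-endian regardless of host order
    return samples.tolist()
```

For example, 32000 bytes of mono audio at 16000 Hz correspond to one second of speech.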
License
This project is licensed under the MIT License -- see the LICENSE file for details.
Support
- GitHub Issues: pipecat-deepdub-tts/issues
- Deepdub: deepdub.ai
- Pipecat Discord: discord.gg/pipecat