LiveKit plugin for Knowlithic AI text-to-speech (TTS)

LiveKit TTS Integration

A LiveKit agent implementation with custom Text-to-Speech (TTS) using Knowlithic AI's WebSocket API.

Overview

This project demonstrates how to integrate a custom TTS service with LiveKit agents for real-time voice conversations. The TTS implementation uses a persistent WebSocket connection to Knowlithic AI for low-latency audio synthesis.

Features

  • Real-time streaming TTS synthesis via WebSocket
  • Persistent connection management for optimal performance
  • Integration with LiveKit Agents framework
  • Configurable voice and sample rate
  • Automatic connection recovery and error handling

Quick Start

1. Environment Setup

Create a .env file with your configuration:

# Knowlithic TTS WebSocket URL
KNOWLITHIC_TTS_SERVER=ws://localhost:8000

# System prompt for the agent
SYSTEM_PROMPT="You are Avery, a warm, empathetic AI nurse..."

# Optional: Override default TTS settings
TTS_SERVER_URL=ws://your-tts-server:8000
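
As an illustration of how these two variables might be combined, here is a minimal sketch using only the standard library. The helper name resolve_tts_url is hypothetical, and the precedence (TTS_SERVER_URL overriding KNOWLITHIC_TTS_SERVER) is an assumption based on the "override" comment above, not confirmed plugin behavior.

```python
import os
from typing import Mapping, Optional


def resolve_tts_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the WebSocket URL for the TTS service (hypothetical helper).

    Assumed precedence: TTS_SERVER_URL, when set, overrides
    KNOWLITHIC_TTS_SERVER. The real plugin may resolve these differently.
    """
    env = os.environ if env is None else env
    url = (env.get("TTS_SERVER_URL") or env.get("KNOWLITHIC_TTS_SERVER") or "").strip()
    if not url:
        raise RuntimeError("KNOWLITHIC_TTS_SERVER is not set")
    if not url.startswith(("ws://", "wss://")):
        raise ValueError(f"expected a ws:// or wss:// URL, got {url!r}")
    return url
```

Failing early on a missing or malformed URL tends to give a clearer error than a connection timeout at the first synthesis request.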

2. Install Dependencies

pip install -r requirements.txt

3. Run the Agent

# Download the model files used by the plugins
python agent.py download-files
# Start the agent worker in development mode
python agent.py dev

TTS Integration Pattern

The TTS is integrated into the LiveKit agent session as follows:

from livekit.agents import AgentSession, JobContext, RoomInputOptions
from livekit.plugins import deepgram, groq, noise_cancellation, silero

from tts_custom.knowlithictts import TTS

async def entrypoint(ctx: JobContext):
    # Initialize TTS with persistent connection
    custom_tts = TTS()

    try:
        # Create agent session with TTS
        session = AgentSession(
            stt=deepgram.STT(model="nova-3", language="multi"),
            llm=groq.LLM(model="llama3-8b-8192"),
            tts=custom_tts,  # Custom TTS integration
            vad=silero.VAD.load(
                min_silence_duration=3.0,
                min_speech_duration=1.0,
                activation_threshold=0.5,  # speech probability, between 0 and 1
            ),
        )

        # Start the session (Assistant is your Agent subclass)
        await session.start(
            room=ctx.room,
            agent=Assistant(),
            room_input_options=RoomInputOptions(
                noise_cancellation=noise_cancellation.BVC(),
            ),
        )
    finally:
        # Clean up TTS connection
        await custom_tts.aclose()

TTS Configuration

Basic Configuration

from tts_custom.knowlithictts import TTS

# Default configuration
tts = TTS()  # Uses "elise" voice, 16kHz sample rate

# Custom configuration
tts = TTS(
    voice="elise",
    sample_rate=16000,
    use_streaming=True
)

Environment Variables

Variable                 Description                      Default
KNOWLITHIC_TTS_SERVER    WebSocket URL for TTS service    Required
TTS_SERVER_URL           Alternative TTS server URL       Optional

TTS Options

  • voice: Voice identifier (default: "elise")
  • sample_rate: Audio sample rate in Hz (default: 16000)
  • use_streaming: Enable streaming mode (default: True)
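
The three options above could be modeled as a small validated container. The following is a minimal sketch, not the plugin's actual code: the class name TTSOptions and its validation rules are assumptions, and only the field names and defaults come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTSOptions:
    """Hypothetical options container mirroring the documented defaults."""

    voice: str = "elise"
    sample_rate: int = 16000
    use_streaming: bool = True

    def __post_init__(self) -> None:
        # Reject values that could never produce valid audio (assumed rules).
        if not self.voice:
            raise ValueError("voice must be a non-empty string")
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be a positive number of Hz")
```

Freezing the dataclass keeps the options from changing while a synthesis request is in flight.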

Architecture

Connection Management

The TTS implementation uses a persistent WebSocket connection:

class TTS(tts.TTS):
    def __init__(self, ...):  # constructor arguments elided
        # The single shared WebSocket, created lazily on first use
        self._ws_connection = None
        # Serializes connection setup so concurrent callers share one socket
        self._ws_lock = asyncio.Lock()

    async def _get_ws_connection(self, timeout: float):
        # Get or create persistent connection
        async with self._ws_lock:
            if self._ws_connection is None or self._ws_connection.closed:
                self._ws_connection = await self._ensure_session().ws_connect(
                    KNOWLITHIC_WEBSOCKET_URL
                )
            return self._ws_connection
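
To see why the lock matters, here is a self-contained sketch of the same get-or-create pattern using only asyncio. FakeConnection and ConnectionHolder are hypothetical stand-ins for the real WebSocket and TTS class. The point is that concurrent callers end up sharing one connection, and a closed connection is replaced on the next request.

```python
import asyncio


class FakeConnection:
    """Stand-in for a WebSocket; it only tracks whether it is closed."""

    def __init__(self) -> None:
        self.closed = False


class ConnectionHolder:
    """Hypothetical holder showing a lock-guarded, lazily created connection."""

    def __init__(self) -> None:
        self._conn = None
        self._lock = asyncio.Lock()
        self.connect_count = 0

    async def _connect(self) -> FakeConnection:
        self.connect_count += 1
        await asyncio.sleep(0)  # yield control, as a real network connect would
        return FakeConnection()

    async def get(self) -> FakeConnection:
        # Without the lock, two callers could both see None and both connect.
        async with self._lock:
            if self._conn is None or self._conn.closed:
                self._conn = await self._connect()
            return self._conn


async def demo() -> int:
    holder = ConnectionHolder()
    conns = await asyncio.gather(*(holder.get() for _ in range(5)))
    assert all(c is conns[0] for c in conns)
    return holder.connect_count
```

Running asyncio.run(demo()) with five concurrent callers opens the connection once.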

Streaming Synthesis

Text is processed in real-time through the WebSocket connection:

async def _run_ws(self, input_stream, output_emitter):
    # Excerpt: ws is the persistent connection, data is the current item
    # from input_stream, and last_index is the running request counter.

    # Send text chunks to TTS service
    payload = {
        "type": "text_to_speech",
        "text": data.token,
        "voice": self._opts.voice,
        "request_id": last_index,
        "sample_rate": self._opts.sample_rate,
        "output_format": "wav",
    }
    await ws.send_str(json.dumps(payload))

    # Receive and emit audio data (resp is a JSON message parsed from ws)
    if resp.get("type") == "audio":
        audio_bytes = base64.b64decode(resp["audio_data"])
        output_emitter.push(audio_bytes)
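
The message format in the excerpt above can be exercised without a server. This sketch builds a request and decodes an audio response with the standard library. The function names build_request and extract_audio are hypothetical, and the field names are taken from the excerpt. Treating any message whose type is not "audio" as carrying no audio is an assumption.

```python
import base64
import json
from typing import Optional


def build_request(text: str, voice: str, request_id: int, sample_rate: int) -> str:
    """Serialize one text chunk using the field names from the excerpt."""
    return json.dumps(
        {
            "type": "text_to_speech",
            "text": text,
            "voice": voice,
            "request_id": request_id,
            "sample_rate": sample_rate,
            "output_format": "wav",
        }
    )


def extract_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded audio bytes, or None for non-audio messages (assumed)."""
    msg = json.loads(raw_message)
    if msg.get("type") != "audio":
        return None
    return base64.b64decode(msg["audio_data"])
```

Keeping encoding and decoding in plain functions like these makes the wire format easy to unit test separately from the WebSocket handling.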

Performance Optimization

Connection Prewarming

# Prewarm the TTS connection for faster first response
tts.prewarm()

Error Handling

The implementation includes automatic connection recovery:

async def _ensure_ws_connection(self, timeout: float):
    try:
        ws = await self._get_ws_connection(timeout)
        if ws.closed:
            # Drop the stale connection and open a fresh one
            async with self._ws_lock:
                self._ws_connection = None
            ws = await self._get_ws_connection(timeout)
        return ws
    except Exception:
        # Reset connection on error so the next call reconnects
        async with self._ws_lock:
            self._ws_connection = None
        raise
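
The method above resets the connection and re-raises, which leaves any retrying to the caller. One possible caller-side approach is retry with exponential backoff, sketched below. This technique is not part of the plugin as documented. The helper name connect_with_retry, its parameters, and the choice of which exceptions to retry are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def connect_with_retry(
    connect: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Call connect() up to `attempts` times, doubling the delay each time."""
    last_error = None
    for attempt in range(attempts):
        try:
            return await connect()
        except (OSError, asyncio.TimeoutError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise ConnectionError(
        f"TTS connection failed after {attempts} attempts"
    ) from last_error
```

A caller could pass a small wrapper around _ensure_ws_connection as the connect argument. Only network-style errors are retried here, so programming errors still surface immediately.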

Requirements

  • Python 3.9+
  • LiveKit Agents 1.2.2+
  • aiohttp 3.8.0+
  • python-dotenv 1.0.0+
  • Knowlithic TTS WebSocket service

Development

Running Locally

  1. Start your Knowlithic TTS WebSocket service
  2. Set environment variables
  3. Run the agent: python agent.py dev

Testing TTS

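import asyncio
from tts_custom.knowlithictts import TTS

async def test_tts():
    tts = TTS()
    stream = tts.stream()

    # push_text, flush, and end_input are synchronous in LiveKit Agents
    stream.push_text("Hello, this is a test.")
    stream.flush()
    stream.end_input()

    # Drain the synthesized audio before closing the stream
    async for audio in stream:
        print(f"received {audio.frame.samples_per_channel} samples")

    await stream.aclose()
    await tts.aclose()

asyncio.run(test_tts())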

Troubleshooting

Common Issues

  1. Connection Timeout: Ensure KNOWLITHIC_TTS_SERVER is accessible
  2. Audio Quality: Verify sample rate matches TTS service (16000 Hz)
  3. Latency: Use connection prewarming for faster first response
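
For the connection timeout case, a quick TCP check can rule out basic network problems before debugging the plugin itself. This is a minimal sketch with hypothetical helper names (ws_endpoint, is_reachable). A successful TCP connect only shows that the port is open, not that the WebSocket handshake will succeed.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def ws_endpoint(url: str) -> Tuple[str, int]:
    """Extract (host, port) from a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"expected a ws:// or wss:// URL, got {url!r}")
    if not parts.hostname:
        raise ValueError(f"no host in URL {url!r}")
    # Default ports follow the WebSocket scheme: 80 for ws, 443 for wss.
    port = parts.port or (443 if parts.scheme == "wss" else 80)
    return parts.hostname, port


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = ws_endpoint(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_reachable(os.environ["KNOWLITHIC_TTS_SERVER"]) returning False points at the server or network rather than at the agent configuration.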

Logging

Enable debug logging to troubleshoot TTS issues:

import logging
logging.basicConfig(level=logging.DEBUG)

License

MIT License - see LICENSE file for details.
