
LiveKit plugin for Knowlithic TTS, using Knowlithic AI


LiveKit TTS Integration

A LiveKit agent implementation with custom Text-to-Speech (TTS) using Knowlithic AI's WebSocket API.

Overview

This project demonstrates how to integrate a custom TTS service with LiveKit agents for real-time voice conversations. The TTS implementation uses a persistent WebSocket connection to Knowlithic AI for low-latency audio synthesis.

Features

  • Real-time streaming TTS synthesis via WebSocket
  • Persistent connection reuse, so each request avoids a new WebSocket handshake
  • Integration with LiveKit Agents framework
  • Configurable voice and sample rate
  • Automatic connection recovery and error handling

Quick Start

1. Environment Setup

Create a .env file with your configuration:

# Knowlithic TTS WebSocket URL
KNOWLITHIC_TTS_SERVER=ws://localhost:8000

# System prompt for the agent
SYSTEM_PROMPT="You are Avery, a warm, empathetic AI nurse..."

# Optional: Override default TTS settings
TTS_SERVER_URL=ws://your-tts-server:8000

2. Install Dependencies

pip install -r requirements.txt

3. Run the Agent

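# Download model files used by the plugins (for example the VAD model)
python agent.py download-files
# Start the agent worker in development mode
python agent.py dev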

TTS Integration Pattern

The TTS is integrated into the LiveKit agent session as follows:

from livekit.agents import AgentSession, RoomInputOptions
from livekit.plugins import deepgram, groq, noise_cancellation, silero

from tts_custom.knowlithictts import TTS

# The code below runs inside the agent entrypoint, where `ctx` is the
# JobContext and `Assistant` is your Agent subclass.

# Initialize TTS with persistent connection
custom_tts = TTS()

try:
    # Create agent session with TTS
    session = AgentSession(
        stt=deepgram.STT(model="nova-3", language="multi"),
        llm=groq.LLM(model="llama3-8b-8192"),
        tts=custom_tts,  # Custom TTS integration
        vad=silero.VAD.load(
            min_silence_duration=3.0,
            min_speech_duration=1.0,
            activation_threshold=0.5,  # speech probability, must lie in [0, 1]
        ),
    )

    # Start the session
    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )
finally:
    # Clean up TTS connection
    await custom_tts.aclose()

TTS Configuration

Basic Configuration

from tts_custom.knowlithictts import TTS

# Default configuration
tts = TTS()  # Uses "elise" voice, 16kHz sample rate

# Custom configuration
tts = TTS(
    voice="elise",
    sample_rate=16000,
    use_streaming=True
)

Environment Variables

Variable                Description                      Default
KNOWLITHIC_TTS_SERVER   WebSocket URL for TTS service    None (required)
TTS_SERVER_URL          Alternative TTS server URL       None (optional)
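
The lookup of these two variables could be sketched as below. This is an illustration only, not the plugin's code: the helper name resolve_tts_url is hypothetical, and the precedence (TTS_SERVER_URL overriding KNOWLITHIC_TTS_SERVER when both are set) is an assumption based on the "override" comment in the .env example.

```python
import os
from typing import Mapping, Optional


def resolve_tts_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the WebSocket URL to use for the TTS service.

    Assumption: TTS_SERVER_URL, when set, takes precedence over
    KNOWLITHIC_TTS_SERVER. The plugin itself may resolve them differently.
    """
    env = os.environ if env is None else env
    url = env.get("TTS_SERVER_URL") or env.get("KNOWLITHIC_TTS_SERVER")
    if not url:
        raise RuntimeError("KNOWLITHIC_TTS_SERVER is not set")
    # A WebSocket endpoint must use the ws:// or wss:// scheme.
    if not url.startswith(("ws://", "wss://")):
        raise ValueError(f"expected a ws:// or wss:// URL, got {url!r}")
    return url
```

Failing early on a missing or malformed URL tends to give a clearer error than a connection timeout later in the session.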

TTS Options

  • voice: Voice identifier (default: "elise")
  • sample_rate: Audio sample rate in Hz (default: 16000)
  • use_streaming: Enable streaming mode (default: True)

Architecture

Connection Management

The TTS implementation uses a persistent WebSocket connection:

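import asyncio

from livekit.agents import tts

class TTS(tts.TTS):
    def __init__(self, ...):  # constructor arguments elided in this excerpt
        self._ws_connection = None      # created lazily on first use
        self._ws_lock = asyncio.Lock()  # serializes connect and reset

    async def _get_ws_connection(self, timeout: float):
        # Get or create persistent connection
        async with self._ws_lock:
            if self._ws_connection is None or self._ws_connection.closed:
                # Bound the handshake by the caller's timeout
                self._ws_connection = await asyncio.wait_for(
                    self._ensure_session().ws_connect(KNOWLITHIC_WEBSOCKET_URL),
                    timeout,
                )
            return self._ws_connection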
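
To show why the lock matters, here is a minimal, self-contained sketch of the same lazy-connection pattern. It uses a stand-in connection object instead of a real WebSocket. FakeConnection, ConnectionHolder, and demo are hypothetical names introduced for illustration and are not part of the plugin.

```python
import asyncio


class FakeConnection:
    """Stand-in for a WebSocket; it only tracks whether it is closed."""

    def __init__(self) -> None:
        self.closed = False


class ConnectionHolder:
    """Lazily creates one shared connection and reuses it while it is open."""

    def __init__(self) -> None:
        self._conn = None
        self._lock = asyncio.Lock()
        self.connect_count = 0  # how many times a connection was opened

    async def _connect(self) -> FakeConnection:
        self.connect_count += 1
        await asyncio.sleep(0)  # yield control, as a real handshake would
        return FakeConnection()

    async def get(self) -> FakeConnection:
        # The lock ensures concurrent callers do not each open a connection.
        async with self._lock:
            if self._conn is None or self._conn.closed:
                self._conn = await self._connect()
            return self._conn


async def demo() -> int:
    holder = ConnectionHolder()
    # Five concurrent callers should all receive the same connection.
    conns = await asyncio.gather(*(holder.get() for _ in range(5)))
    assert all(c is conns[0] for c in conns)
    return holder.connect_count


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the lock, callers that arrive while the first handshake is still in flight would each see no connection and open their own.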

Streaming Synthesis

Text is processed in real-time through the WebSocket connection:

async def _run_ws(self, input_stream, output_emitter):
    # Abridged excerpt: `ws` is the persistent connection, `data` is the
    # current input token, and `last_index` is the running request counter.

    # Send text chunks to TTS service
    payload = {
        "type": "text_to_speech",
        "text": data.token,
        "voice": self._opts.voice,
        "request_id": last_index,
        "sample_rate": self._opts.sample_rate,
        "output_format": "wav",
    }
    await ws.send_str(json.dumps(payload))

    # Receive and emit audio data (`msg` is a JSON message parsed from the service)
    if msg.get("type") == "audio":
        audio_bytes = base64.b64decode(msg["audio_data"])  # base64 -> raw bytes
        output_emitter.push(audio_bytes)
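
The message shapes above can be exercised without a server. The following sketch builds a request with the same fields and decodes an "audio" response. The helper names build_request and extract_audio are hypothetical, and only the fields shown in the excerpt are assumed. The actual service may send other message types or fields.

```python
import base64
import json
from typing import Optional


def build_request(text: str, voice: str = "elise", request_id: int = 0,
                  sample_rate: int = 16000) -> str:
    """Serialize a synthesis request using the fields shown in the excerpt."""
    return json.dumps({
        "type": "text_to_speech",
        "text": text,
        "voice": voice,
        "request_id": request_id,
        "sample_rate": sample_rate,
        "output_format": "wav",
    })


def extract_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded audio bytes, or None if the message is not audio."""
    msg = json.loads(raw_message)
    if msg.get("type") != "audio":
        return None
    # The audio payload travels as base64 text inside the JSON message.
    return base64.b64decode(msg["audio_data"])
```

For example, extract_audio applied to a message whose "audio_data" is the base64 encoding of some bytes returns those bytes unchanged.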

Performance Optimization

Connection Prewarming

# Prewarm the TTS connection for faster first response
tts.prewarm()

Error Handling

The implementation includes automatic connection recovery:

async def _ensure_ws_connection(self, timeout: float):
    try:
        ws = await self._get_ws_connection(timeout)
        if ws.closed:
            # The connection dropped after it was handed out; discard and reconnect
            async with self._ws_lock:
                self._ws_connection = None
            ws = await self._get_ws_connection(timeout)
        return ws
    except Exception:
        # Reset connection on error so the next call starts from a clean state
        async with self._ws_lock:
            self._ws_connection = None
        raise
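
The method above resets the connection and re-raises, leaving any retry to the caller. One way a caller might layer a bounded retry on top is sketched below. This is not something the plugin is stated to do: call_with_reconnect, its parameters, and the exponential backoff are all assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_reconnect(
    operation: Callable[[], Awaitable[T]],
    reset: Callable[[], Awaitable[None]],
    attempts: int = 3,
    base_delay: float = 0.1,
) -> T:
    """Run `operation`; on failure, call `reset` and retry with backoff.

    The last failure is re-raised once `attempts` tries are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception:
            # Drop the broken connection before deciding whether to retry.
            await reset()
            if attempt == attempts - 1:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
```

In practice it may be preferable to catch only connection-related exceptions, so that programming errors are not retried.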

Requirements

  • Python 3.9+
  • LiveKit Agents 1.2.2+
  • aiohttp 3.8.0+
  • python-dotenv 1.0.0+
  • Knowlithic TTS WebSocket service

Development

Running Locally

  1. Start your Knowlithic TTS WebSocket service
  2. Set environment variables
  3. Run the agent: python agent.py dev

Testing TTS

import asyncio
from tts_custom.knowlithictts import TTS

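async def test_tts():
    tts = TTS()
    stream = tts.stream()

    # push_text(), flush(), and end_input() are synchronous in LiveKit Agents
    stream.push_text("Hello, this is a test.")
    stream.flush()
    stream.end_input()

    # Drain the synthesized audio so the request runs to completion
    frames = 0
    async for _ in stream:
        frames += 1
    print(f"received {frames} audio frames")

    await stream.aclose()
    await tts.aclose()

asyncio.run(test_tts())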

Troubleshooting

Common Issues

  1. Connection Timeout: Ensure KNOWLITHIC_TTS_SERVER is set and the server is reachable from the agent host
  2. Audio Quality: Verify sample rate matches TTS service (16000 Hz)
  3. Latency: Use connection prewarming for faster first response
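
For the connection timeout case, a quick TCP reachability check can help separate network problems from TTS problems. The sketch below uses only the standard library. The helper names ws_host_port and can_connect are hypothetical, and a successful TCP connection only shows that the port is open, not that the WebSocket handshake or synthesis will succeed.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def ws_host_port(url: str) -> Tuple[str, int]:
    """Extract (host, port) from a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"not a WebSocket URL: {url!r}")
    if not parts.hostname:
        raise ValueError(f"no host in URL: {url!r}")
    # Default ports follow the WebSocket scheme when none is given.
    default_port = 443 if parts.scheme == "wss" else 80
    return parts.hostname, parts.port or default_port


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = ws_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A typical use would be to call can_connect with the value of KNOWLITHIC_TTS_SERVER before starting the agent.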

Logging

Enable debug logging to troubleshoot TTS issues:

import logging
logging.basicConfig(level=logging.DEBUG)

License

MIT License - see LICENSE file for details.
