LiveKit plugin for Knowlithic TTS using Knowlithic AI

LiveKit TTS Integration

A LiveKit agent implementation with custom Text-to-Speech (TTS) using Knowlithic AI's WebSocket API.

Overview

This project demonstrates how to integrate a custom TTS service with LiveKit agents for real-time voice conversations. The TTS implementation uses a persistent WebSocket connection to Knowlithic AI for low-latency audio synthesis.

Features

  • Real-time streaming TTS synthesis via WebSocket
  • Persistent WebSocket connection, reused across requests to avoid repeated handshakes
  • Integration with LiveKit Agents framework
  • Configurable voice and sample rate
  • Automatic connection recovery and error handling

Quick Start

1. Environment Setup

Create a .env file with your configuration:

# Knowlithic TTS WebSocket URL
KNOWLITHIC_TTS_SERVER=ws://localhost:8000

# System prompt for the agent
SYSTEM_PROMPT="You are Avery, a warm, empathetic AI nurse..."

# Optional: Override default TTS settings
TTS_SERVER_URL=ws://your-tts-server:8000
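
The two URL variables above could be resolved in code roughly as follows. This is a minimal sketch, not the plugin's actual logic: the helper name resolve_tts_url is hypothetical, and the precedence (TTS_SERVER_URL overriding KNOWLITHIC_TTS_SERVER) is an assumption based on the "override" comment in the .env example.

```python
import os
from typing import Mapping, Optional


def resolve_tts_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the WebSocket URL the TTS client should connect to.

    Assumed precedence (not confirmed by the plugin): TTS_SERVER_URL,
    when set, overrides KNOWLITHIC_TTS_SERVER.
    """
    env = os.environ if env is None else env
    url = env.get("TTS_SERVER_URL") or env.get("KNOWLITHIC_TTS_SERVER")
    if not url:
        raise RuntimeError("Set KNOWLITHIC_TTS_SERVER (or TTS_SERVER_URL) in .env")
    if not url.startswith(("ws://", "wss://")):
        raise ValueError(f"Expected a ws:// or wss:// URL, got {url!r}")
    return url
```

If the values live in a .env file, load it first (for example with python-dotenv's load_dotenv()) so that os.environ contains them before the helper is called.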

2. Install Dependencies

pip install -r requirements.txt

3. Run the Agent

python agent.py download-files
python agent.py

TTS Integration Pattern

The TTS is integrated into the LiveKit agent session as follows:

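from livekit.agents import AgentSession, JobContext, RoomInputOptions
from livekit.plugins import deepgram, groq, noise_cancellation, silero

from tts_custom.knowlithictts import TTS


async def entrypoint(ctx: JobContext):
    # Initialize TTS with persistent connection
    custom_tts = TTS()

    # Create agent session with TTS
    session = AgentSession(
        stt=deepgram.STT(model="nova-3", language="multi"),
        llm=groq.LLM(model="llama3-8b-8192"),
        tts=custom_tts,  # Custom TTS integration
        vad=silero.VAD.load(
            min_silence_duration=3.0,
            min_speech_duration=1.0,
            # Note: the Silero plugin treats this as a speech probability,
            # so a value above 1.0 will likely never trigger; double-check it.
            activation_threshold=20.0,
        ),
    )

    try:
        # Start the session (Assistant is the Agent subclass defined in agent.py)
        await session.start(
            room=ctx.room,
            agent=Assistant(),
            room_input_options=RoomInputOptions(
                noise_cancellation=noise_cancellation.BVC(),
            ),
        )
    finally:
        # Clean up TTS connection
        await custom_tts.aclose()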

TTS Configuration

Basic Configuration

from tts_custom.knowlithictts import TTS

# Default configuration
tts = TTS()  # Uses "elise" voice, 16kHz sample rate

# Custom configuration
tts = TTS(
    voice="elise",
    sample_rate=16000,
    use_streaming=True
)

Environment Variables

Variable                 Description                      Default
KNOWLITHIC_TTS_SERVER    WebSocket URL for TTS service    Required
TTS_SERVER_URL           Alternative TTS server URL       Optional

TTS Options

  • voice: Voice identifier (default: "elise")
  • sample_rate: Audio sample rate in Hz (default: 16000)
  • use_streaming: Enable streaming mode (default: True)
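
The options listed above can be pictured as a small validated container. The class below is only an illustrative sketch: TTSOptions and its validation rules are hypothetical, and only the three field names and their defaults come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTSOptions:
    """Hypothetical container mirroring the documented TTS options."""

    voice: str = "elise"        # voice identifier
    sample_rate: int = 16000    # audio sample rate in Hz
    use_streaming: bool = True  # enable streaming mode

    def __post_init__(self) -> None:
        # Reject values that could never produce valid audio.
        if not self.voice:
            raise ValueError("voice must be a non-empty identifier")
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be a positive number of Hz")


default_opts = TTSOptions()
custom_opts = TTSOptions(voice="elise", sample_rate=16000, use_streaming=True)
```

Freezing the dataclass keeps the options immutable once a session has started, which avoids changing the sample rate halfway through a stream.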

Architecture

Connection Management

The TTS implementation uses a persistent WebSocket connection:

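import asyncio

from livekit.agents import tts


class TTS(tts.TTS):
    def __init__(self, **kwargs):
        # Excerpt: option handling and the base-class setup are omitted here
        self._ws_connection = None
        self._ws_lock = asyncio.Lock()

    async def _get_ws_connection(self, timeout: float):
        # Get or create persistent connection; the lock keeps concurrent
        # callers from opening duplicate sockets.
        # KNOWLITHIC_WEBSOCKET_URL is a module-level constant, presumably
        # read from the KNOWLITHIC_TTS_SERVER environment variable.
        async with self._ws_lock:
            if self._ws_connection is None or self._ws_connection.closed:
                self._ws_connection = await self._ensure_session().ws_connect(
                    KNOWLITHIC_WEBSOCKET_URL
                )
            return self._ws_connection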
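
To see why the lock matters, the following self-contained sketch reproduces the same get-or-create pattern with a fake connection instead of a real WebSocket. FakeConnection and ConnectionHolder are hypothetical names used only for illustration; five concurrent callers should end up sharing one connection.

```python
import asyncio


class FakeConnection:
    """Stand-in for a WebSocket; only tracks whether it is closed."""

    def __init__(self) -> None:
        self.closed = False


class ConnectionHolder:
    def __init__(self) -> None:
        self._conn = None
        self._lock = asyncio.Lock()
        self.connect_calls = 0

    async def _connect(self) -> FakeConnection:
        self.connect_calls += 1
        await asyncio.sleep(0.01)  # simulate handshake latency
        return FakeConnection()

    async def get(self) -> FakeConnection:
        # Only one caller at a time may inspect or replace the connection.
        async with self._lock:
            if self._conn is None or self._conn.closed:
                self._conn = await self._connect()
            return self._conn


async def demo() -> int:
    holder = ConnectionHolder()
    conns = await asyncio.gather(*(holder.get() for _ in range(5)))
    # Every caller received the same object.
    assert all(conn is conns[0] for conn in conns)
    return holder.connect_calls
```

Without the lock, all five callers could observe `None` before the first handshake finishes and each open its own socket.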

Streaming Synthesis

Text is processed in real-time through the WebSocket connection:

import base64
import json


async def _run_ws(self, input_stream, output_emitter):
    # Excerpt: `ws` is the persistent WebSocket, `data` is the current item
    # read from input_stream, and `last_index` is its request counter.

    # Send text chunks to TTS service
    payload = {
        "type": "text_to_speech",
        "text": data.token,
        "voice": self._opts.voice,
        "request_id": last_index,
        "sample_rate": self._opts.sample_rate,
        "output_format": "wav",
    }
    await ws.send_str(json.dumps(payload))

    # Receive and emit audio data (`msg` is a JSON message decoded from ws)
    if msg.get("type") == "audio":
        audio_bytes = base64.b64decode(msg["audio_data"])
        output_emitter.push(audio_bytes)
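
The message shapes used above can be exercised without a server. The sketch below builds a request with the same fields and decodes an audio reply; the helper names build_tts_request and extract_audio are hypothetical, and the reply format (a JSON object with "type" and base64 "audio_data") is inferred from the excerpt rather than from a protocol specification.

```python
import base64
import json
from typing import Optional


def build_tts_request(text: str, voice: str, request_id: int, sample_rate: int) -> str:
    """Serialize one text chunk using the fields shown in the excerpt."""
    return json.dumps({
        "type": "text_to_speech",
        "text": text,
        "voice": voice,
        "request_id": request_id,
        "sample_rate": sample_rate,
        "output_format": "wav",
    })


def extract_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded audio bytes, or None for non-audio messages."""
    msg = json.loads(raw_message)
    if msg.get("type") != "audio":
        return None
    return base64.b64decode(msg["audio_data"])


request = build_tts_request("Hello", voice="elise", request_id=1, sample_rate=16000)

# Simulated server reply carrying four bytes of audio.
reply = json.dumps({
    "type": "audio",
    "audio_data": base64.b64encode(b"\x00\x01\x02\x03").decode("ascii"),
})
audio = extract_audio(reply)
```

Keeping serialization separate from the socket loop makes both directions easy to unit test.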

Performance Optimization

Connection Prewarming

# Prewarm the TTS connection for faster first response
tts.prewarm()
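
One plausible way prewarming can work is to start the handshake as a background task so it overlaps with other startup work. The sketch below illustrates that idea only; PrewarmableClient is a hypothetical class, and the plugin's own prewarm() may be implemented differently.

```python
import asyncio
from typing import Optional


class PrewarmableClient:
    """Hypothetical client; `_connect` stands in for the WebSocket handshake."""

    def __init__(self) -> None:
        self._connect_task: Optional[asyncio.Task] = None
        self.connect_calls = 0

    async def _connect(self) -> str:
        self.connect_calls += 1
        await asyncio.sleep(0.01)  # simulate handshake latency
        return "connected"

    def prewarm(self) -> None:
        # Start the handshake in the background; needs a running event loop.
        if self._connect_task is None:
            loop = asyncio.get_running_loop()
            self._connect_task = loop.create_task(self._connect())

    async def connection(self) -> str:
        self.prewarm()  # no-op if the handshake was already started
        return await self._connect_task


async def demo():
    client = PrewarmableClient()
    client.prewarm()
    await asyncio.sleep(0.05)  # other startup work happens here
    ready_before_first_use = client._connect_task.done()
    result = await client.connection()
    return ready_before_first_use, result, client.connect_calls
```

By the time the first synthesis request arrives, the handshake has already finished, so the request does not pay the connection latency.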

Error Handling

The implementation includes automatic connection recovery:

async def _ensure_ws_connection(self, timeout: float):
    try:
        ws = await self._get_ws_connection(timeout)
        if ws.closed:
            async with self._ws_lock:
                self._ws_connection = None
            ws = await self._get_ws_connection(timeout)
        return ws
    except Exception:
        # Reset connection on error
        async with self._ws_lock:
            self._ws_connection = None
        raise
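
Because the method above resets the connection and re-raises, the caller decides whether to try again. A caller-side retry with exponential backoff might look like the sketch below. FlakyConnector and connect_with_retry are hypothetical names, and this README does not state whether the plugin itself retries.

```python
import asyncio


class FlakyConnector:
    """Test double that fails a fixed number of times before succeeding."""

    def __init__(self, failures: int) -> None:
        self._remaining_failures = failures
        self.attempts = 0

    async def connect(self) -> str:
        self.attempts += 1
        if self._remaining_failures > 0:
            self._remaining_failures -= 1
            raise ConnectionError("simulated handshake failure")
        return "ws-connection"


async def connect_with_retry(connector, retries: int = 3, base_delay: float = 0.01) -> str:
    last_error = None
    for attempt in range(retries):
        try:
            return await connector.connect()
        except ConnectionError as exc:
            last_error = exc
            # Back off: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ConnectionError(f"gave up after {retries} attempts") from last_error
```

Catching only ConnectionError keeps programming errors visible instead of silently retrying them.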

Requirements

  • Python 3.9+
  • LiveKit Agents 1.2.2+
  • aiohttp 3.8.0+
  • python-dotenv 1.0.0+
  • Knowlithic TTS WebSocket service

Development

Running Locally

  1. Start your Knowlithic TTS WebSocket service
  2. Set environment variables
  3. Run the agent: python agent.py

Testing TTS

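import asyncio
from tts_custom.knowlithictts import TTS

async def test_tts():
    tts = TTS()
    stream = tts.stream()

    # push_text(), flush() and end_input() are synchronous in LiveKit Agents
    stream.push_text("Hello, this is a test.")
    stream.flush()
    stream.end_input()

    # Drain the synthesized audio so the request actually completes
    async for event in stream:
        print(f"received {event.frame.samples_per_channel} samples")

    await stream.aclose()
    await tts.aclose()

asyncio.run(test_tts())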

Troubleshooting

Common Issues

  1. Connection Timeout: Ensure KNOWLITHIC_TTS_SERVER is accessible
  2. Audio Quality: Verify sample rate matches TTS service (16000 Hz)
  3. Latency: Use connection prewarming for faster first response
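
For the audio quality check in item 2, the sample rate can be read from the WAV header using only the standard library. This is a minimal sketch that assumes the service returns a complete WAV file including its header; if the stream delivers raw or header-less chunks, this check does not apply. Both helper names are hypothetical.

```python
import io
import wave


def wav_sample_rate(wav_bytes: bytes) -> int:
    """Read the sample rate from the header of an in-memory WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getframerate()


def make_silence_wav(sample_rate: int, seconds: float = 0.1) -> bytes:
    """Build a short mono 16-bit WAV of silence, used here as test input."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(sample_rate * seconds))
    return buffer.getvalue()


expected_rate = 16000
actual_rate = wav_sample_rate(make_silence_wav(expected_rate))
if actual_rate != expected_rate:
    print(f"sample rate mismatch: expected {expected_rate}, got {actual_rate}")
```

A mismatch between the configured sample_rate and the header value usually shows up as audio that plays too fast or too slow.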

Logging

Enable debug logging to troubleshoot TTS issues:

import logging
logging.basicConfig(level=logging.DEBUG)
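
Global DEBUG logging can be noisy, so it may help to raise verbosity for a single logger only. The sketch below does that with the standard logging module; the default logger name "knowlithic_tts" is a placeholder, since this README does not say which logger name the plugin uses.

```python
import logging


def enable_tts_debug(logger_name: str = "knowlithic_tts") -> logging.Logger:
    """Keep the root logger at INFO but emit DEBUG records for one logger."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    return logger


tts_logger = enable_tts_debug()
tts_logger.debug("TTS debug logging enabled")
```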

License

MIT License - see LICENSE file for details.
