LiveKit plugin for deepslate.eu

deepslate-livekit

LiveKit Agents plugin for Deepslate's realtime voice AI API.

deepslate-livekit provides a RealtimeModel implementation for the LiveKit Agents framework. It connects an agent to Deepslate's unified voice AI infrastructure (speech-to-speech streaming, server-side VAD, LLM inference, and optional server-side TTS via Deepslate-hosted voices or ElevenLabs) over a single WebSocket connection.


Features

  • Realtime Voice AI Streaming — Low-latency bidirectional audio streaming over WebSockets
  • Server-side VAD — Voice Activity Detection handled by Deepslate with configurable sensitivity
  • Function Tools — Define and invoke tools using LiveKit's @function_tool() decorator
  • Flexible TTS — Server-side TTS via Deepslate-hosted (cloned) voices or ElevenLabs, with automatic context truncation on interruption
  • Automatic Interruption Handling — Truncates the in-flight response when users interrupt

Installation

pip install deepslate-livekit

Requirements

  • Python 3.11 or higher

Dependencies (installed automatically)

  • deepslate-core — Shared Deepslate models and base client
  • livekit-agents>=1.3.8 — LiveKit Agents framework

Prerequisites

Deepslate Account

Sign up at deepslate.eu and set the following environment variables:

DEEPSLATE_VENDOR_ID=your_vendor_id
DEEPSLATE_ORGANIZATION_ID=your_organization_id
DEEPSLATE_API_KEY=your_api_key
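
Before starting an agent, it can help to fail fast when one of these variables is missing. The following is a minimal sketch using only the standard library; the helper name `missing_deepslate_vars` is hypothetical and not part of the plugin.

```python
import os

# Variable names are taken from the list above.
REQUIRED_VARS = (
    "DEEPSLATE_VENDOR_ID",
    "DEEPSLATE_ORGANIZATION_ID",
    "DEEPSLATE_API_KEY",
)


def missing_deepslate_vars(env=None):
    """Return the required variable names that are unset or empty."""
    # Accept a mapping for testing; fall back to the process environment.
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_deepslate_vars()` at startup and aborting when the returned list is non-empty gives a clearer error than a failed connection later on.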

ElevenLabs TTS (Optional)

For server-side text-to-speech with automatic interruption handling:

ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_VOICE_ID=your_voice_id
ELEVENLABS_MODEL_ID=eleven_turbo_v2  # optional

Note: You can alternatively use LiveKit's built-in client-side TTS. However, context truncation on interruption only works with server-side TTS configured through tts_config (for example, ElevenLabsTtsConfig).


Quick Start

from livekit import agents
from livekit.agents import AgentServer, AgentSession, Agent, room_io

from deepslate.livekit import RealtimeModel, ElevenLabsTtsConfig


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")


server = AgentServer()


@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    session = AgentSession(
        llm=RealtimeModel(
            tts_config=ElevenLabsTtsConfig.from_env()
        ),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)

Configuration

RealtimeModel

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| vendor_id | str | env: DEEPSLATE_VENDOR_ID | Deepslate vendor ID |
| organization_id | str | env: DEEPSLATE_ORGANIZATION_ID | Deepslate organization ID |
| api_key | str | env: DEEPSLATE_API_KEY | Deepslate API key |
| base_url | str | "https://app.deepslate.eu" | Base URL for Deepslate API |
| system_prompt | str | "You are a helpful assistant." | System prompt for the model |
| generate_reply_timeout | float | 30.0 | Timeout in seconds for generate_reply (0 = no limit) |
| tts_config | ElevenLabsTtsConfig \| HostedTtsConfig | None | TTS configuration (enables server-side audio output) |

You can also pass a VadConfig instance through the vad_config parameter to tune voice activity detection; see VAD Configuration below.

VAD Configuration

from deepslate.livekit import RealtimeModel, VadConfig

llm = RealtimeModel(
    vad_config=VadConfig(
        confidence_threshold=0.5,   # 0.0–1.0: minimum confidence to classify as speech
        min_volume=0.01,            # 0.0–1.0: minimum volume to classify as speech
        start_duration_ms=200,      # ms of speech required to trigger start
        stop_duration_ms=500,       # ms of silence required to trigger stop
        backbuffer_duration_ms=1000 # ms of audio buffered before detection triggers
    )
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| confidence_threshold | float | 0.5 | Minimum confidence to consider audio as speech (0.0–1.0) |
| min_volume | float | 0.01 | Minimum volume threshold (0.0–1.0) |
| start_duration_ms | int | 200 | Duration of speech required to detect start (ms) |
| stop_duration_ms | int | 500 | Duration of silence required to detect end (ms) |
| backbuffer_duration_ms | int | 1000 | Audio buffer captured before speech detection triggers (ms) |

Tuning tips:

  • Noisy environments: Increase confidence_threshold (0.6–0.8) and min_volume (0.02–0.05)
  • Lower latency: Decrease start_duration_ms (100–150) and stop_duration_ms (200–300)
  • Natural pacing: Slightly increase stop_duration_ms (600–800)
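
The tuning tips above can be captured as named presets. The sketch below builds keyword arguments for VadConfig from the documented defaults; the profile names and the specific values (picked from inside the suggested ranges) are assumptions for illustration, not part of the plugin.

```python
# Defaults from the parameter table above.
VAD_DEFAULTS = {
    "confidence_threshold": 0.5,
    "min_volume": 0.01,
    "start_duration_ms": 200,
    "stop_duration_ms": 500,
    "backbuffer_duration_ms": 1000,
}

# Overrides picked from inside the ranges suggested in the tuning tips.
# The profile names are hypothetical; they are not defined by the plugin.
VAD_PROFILES = {
    "noisy": {"confidence_threshold": 0.7, "min_volume": 0.03},
    "low_latency": {"start_duration_ms": 120, "stop_duration_ms": 250},
    "natural": {"stop_duration_ms": 700},
}


def vad_kwargs(profile=None):
    """Return keyword arguments for VadConfig, optionally adjusted by a profile."""
    params = dict(VAD_DEFAULTS)  # copy so the defaults are never mutated
    if profile is not None:
        params.update(VAD_PROFILES[profile])
    # Both thresholds are documented as values between 0.0 and 1.0.
    for name in ("confidence_threshold", "min_volume"):
        if not 0.0 <= params[name] <= 1.0:
            raise ValueError(f"{name} must be within 0.0-1.0")
    return params
```

The result can then be passed along as `VadConfig(**vad_kwargs("noisy"))`.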

HostedTtsConfig

Use a voice cloned and hosted within Deepslate. No external TTS credentials required.

from deepslate.livekit import RealtimeModel, HostedTtsConfig, HostedTtsMode

llm = RealtimeModel(
    tts_config=HostedTtsConfig(
        voice_id="c3dfa73f-a1ab-4aad-b48a-0e9b9fe4a69f",
        mode=HostedTtsMode.HIGH_QUALITY,  # or LOW_LATENCY
    )
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| voice_id | str | required | ID of the hosted (cloned) voice |
| mode | HostedTtsMode | HostedTtsMode.HIGH_QUALITY | Quality/latency tradeoff; use LOW_LATENCY for the fastest responses |

HostedTtsMode values:

| Value | Description |
| --- | --- |
| HIGH_QUALITY | Best output quality with still relatively low latency. Recommended for most use cases (default). |
| LOW_LATENCY | Low latency generation mode that takes next to no time to complete. Output quality may be significantly reduced. |

ElevenLabsTtsConfig

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | env: ELEVENLABS_API_KEY | ElevenLabs API key |
| voice_id | str | env: ELEVENLABS_VOICE_ID | Voice ID (e.g., '21m00Tcm4TlvDq8ikWAM' for Rachel) |
| model_id | str \| None | env: ELEVENLABS_MODEL_ID | Model ID, e.g., 'eleven_turbo_v2'; uses ElevenLabs default if unset |
| location | ElevenLabsLocation | ElevenLabsLocation.US | Regional API endpoint (US works with all accounts; EU/INDIA require enterprise) |

Use ElevenLabsTtsConfig.from_env() to load from environment variables.
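
If a deployment should support both server-side TTS options described above, the choice can be made from the environment. The following is a rough sketch of one way to do that; `pick_tts_backend` and the `DEEPSLATE_HOSTED_VOICE_ID` variable are hypothetical and are not read by the plugin itself.

```python
import os


def pick_tts_backend(env=None):
    """Return "elevenlabs", "hosted", or None (no server-side TTS)."""
    env = os.environ if env is None else env
    # ElevenLabs needs both an API key and a voice ID (see the table above).
    if env.get("ELEVENLABS_API_KEY") and env.get("ELEVENLABS_VOICE_ID"):
        return "elevenlabs"
    # DEEPSLATE_HOSTED_VOICE_ID is a hypothetical variable name used only
    # in this sketch to carry the ID of a hosted (cloned) voice.
    if env.get("DEEPSLATE_HOSTED_VOICE_ID"):
        return "hosted"
    return None
```

A caller could map "elevenlabs" to `ElevenLabsTtsConfig.from_env()`, "hosted" to `HostedTtsConfig(voice_id=...)`, and None to leaving tts_config unset.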


Function Tools

Use LiveKit's @function_tool() decorator to expose tools to the model:

from livekit.agents import Agent, function_tool, RunContext
from deepslate.livekit import RealtimeModel


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful assistant.")

    @function_tool()
    async def get_weather(self, context: RunContext, location: str) -> str:
        """Get the current weather for a given city."""
        # Your implementation here
        return f"It's sunny and 22°C in {location}."
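
One way to keep tools easy to test is to put the logic in a plain coroutine and have the decorated method delegate to it. The sketch below shows such a coroutine on its own, without LiveKit; `describe_weather` and the sample data are hypothetical stand-ins for a real weather lookup.

```python
import asyncio

# Static sample data standing in for a real weather service (hypothetical).
_SAMPLE_WEATHER = {
    "berlin": ("cloudy", 14),
    "lisbon": ("sunny", 22),
}


async def describe_weather(location: str) -> str:
    """Return a short, speakable weather summary for `location`."""
    name = location.strip()
    key = name.lower()
    if key not in _SAMPLE_WEATHER:
        # Returning a sentence lets the model relay the failure to the user.
        return f"I don't have weather data for {name}."
    condition, temp_c = _SAMPLE_WEATHER[key]
    return f"It's {condition} and {temp_c}°C in {name}."


if __name__ == "__main__":
    print(asyncio.run(describe_weather("Lisbon")))
```

Inside the agent, get_weather could then simply `return await describe_weather(location)`.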

Sending a Welcome Message

DeepslateRealtimeSession emits a "session_initialized" event once the WebSocket session is fully initialized and ready to accept messages. Listen for this event to send a welcome message instead of relying on a fixed delay:

import asyncio

# Other imports and the `server` object are the same as in Quick Start.
@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    model = RealtimeModel(tts_config=ElevenLabsTtsConfig.from_env())
    session = AgentSession(llm=model)

    deepslate_session = model.session()
    deepslate_session.on("session_initialized", lambda _: asyncio.create_task(
        deepslate_session.speak_direct("Hello! How can I help you today?")
    ))

    await session.start(room=ctx.room, agent=Assistant())
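
The pattern above schedules a coroutine from a synchronous event callback. The asyncio documentation recommends keeping a reference to tasks created this way so they are not garbage-collected before they finish. The following self-contained sketch illustrates that with a stand-in session class; `StubSession` only mimics the `on` and `speak_direct` calls used above and is not the plugin's class.

```python
import asyncio


class StubSession:
    """Stand-in for a realtime session; not the plugin's class."""

    def __init__(self):
        self._handlers = {}
        self.spoken = []

    def on(self, event, handler):
        # Register a synchronous callback for the named event.
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event, payload=None):
        for handler in self._handlers.get(event, []):
            handler(payload)

    async def speak_direct(self, text):
        await asyncio.sleep(0)  # pretend to send over the network
        self.spoken.append(text)


async def demo():
    session = StubSession()
    pending = set()  # strong references so tasks are not garbage-collected

    def on_ready(_payload):
        task = asyncio.create_task(session.speak_direct("Hello!"))
        pending.add(task)
        task.add_done_callback(pending.discard)  # drop the reference when done

    session.on("session_initialized", on_ready)
    session.emit("session_initialized")  # simulate the ready event
    await asyncio.gather(*pending)
    return session.spoken
```

Running `asyncio.run(demo())` exercises the full flow: the callback fires, the task is tracked, and the greeting is recorded once the task completes.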

Examples

The examples/ directory contains a ready-to-run agent you can use as a starting point.

chat_agent.py — Voice assistant with function tools

A fully working LiveKit agent that demonstrates:

  • Connecting to a LiveKit room
  • Server-side ElevenLabs TTS with interruption handling
  • Two example function tools: lookup_weather and get_current_location

packages/livekit/examples/
├── chat_agent.py      # The agent
└── .env.example       # Required environment variables

Setup:

# 1. Install dependencies
pip install deepslate-livekit python-dotenv

# 2. Configure credentials
cd packages/livekit/examples
cp .env.example .env
# Edit .env and fill in your credentials

# 3. Run
python chat_agent.py dev

Documentation


License

Apache License 2.0 — see LICENSE for details.
