
LiveKit plugin for deepslate.eu


deepslate-livekit


LiveKit Agents plugin for Deepslate's realtime voice AI API.

deepslate-livekit provides a RealtimeModel implementation for the LiveKit Agents framework. It connects an agent to Deepslate's unified voice AI infrastructure (speech-to-speech streaming, server-side VAD, LLM inference, and optional ElevenLabs TTS) over a single WebSocket connection.


Features

  • Realtime Voice AI Streaming — Low-latency bidirectional audio streaming over WebSockets
  • Server-side VAD — Voice Activity Detection handled by Deepslate with configurable sensitivity
  • Function Tools — Define and invoke tools using LiveKit's @function_tool() decorator
  • Flexible TTS — Server-side TTS via Deepslate-hosted (cloned) voices or ElevenLabs, with automatic context truncation on interruption
  • Automatic Interruption Handling — Truncates the in-flight response when users interrupt

Installation

pip install deepslate-livekit

Requirements

  • Python 3.11 or higher

Dependencies (installed automatically)

  • deepslate-core — Shared Deepslate models and base client
  • livekit-agents>=1.3.8 — LiveKit Agents framework

Prerequisites

Deepslate Account

Sign up at deepslate.eu and set the following environment variables:

DEEPSLATE_VENDOR_ID=your_vendor_id
DEEPSLATE_ORGANIZATION_ID=your_organization_id
DEEPSLATE_API_KEY=your_api_key
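
A startup check can catch a missing credential before the agent tries to connect. The sketch below uses only the standard library and the three variable names listed above; the helper name missing_deepslate_env is hypothetical and not part of deepslate-livekit.

```python
import os
from collections.abc import Mapping

# Variable names taken from the list above.
REQUIRED_VARS = (
    "DEEPSLATE_VENDOR_ID",
    "DEEPSLATE_ORGANIZATION_ID",
    "DEEPSLATE_API_KEY",
)


def missing_deepslate_env(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty (hypothetical helper)."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Calling missing_deepslate_env() at the top of an agent script and exiting when the returned list is non-empty gives a clearer error than a failed connection later on.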

ElevenLabs TTS (Optional)

For server-side text-to-speech with automatic interruption handling:

ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_VOICE_ID=your_voice_id
ELEVENLABS_MODEL_ID=eleven_turbo_v2  # optional

Note: You can alternatively use LiveKit's built-in client-side TTS. However, context truncation on interruption only works with server-side TTS, which is configured through tts_config (for example with ElevenLabsTtsConfig).


Quick Start

from livekit import agents
from livekit.agents import AgentServer, AgentSession, Agent, room_io

from deepslate.livekit import RealtimeModel, ElevenLabsTtsConfig


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")


server = AgentServer()


@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    session = AgentSession(
        llm=RealtimeModel(
            tts_config=ElevenLabsTtsConfig.from_env()
        ),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)

Configuration

RealtimeModel

| Parameter | Type | Default | Description |
|---|---|---|---|
| vendor_id | str | env: DEEPSLATE_VENDOR_ID | Deepslate vendor ID |
| organization_id | str | env: DEEPSLATE_ORGANIZATION_ID | Deepslate organization ID |
| api_key | str | env: DEEPSLATE_API_KEY | Deepslate API key |
| base_url | str | "https://app.deepslate.eu" | Base URL for Deepslate API |
| system_prompt | str | "You are a helpful assistant." | System prompt for the model |
| generate_reply_timeout | float | 30.0 | Timeout in seconds for generate_reply (0 = no limit) |
| tts_config | ElevenLabsTtsConfig \| HostedTtsConfig | None | TTS configuration (enables server-side audio output) |

You can also pass a VadConfig instance to tune voice activity detection — see VAD Configuration below.
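
The generate_reply_timeout row in the table above describes a limit in seconds where 0 means no limit. That convention can be pictured with plain asyncio. This is a minimal sketch of the convention only, not the plugin's implementation, and with_reply_timeout is a hypothetical helper.

```python
import asyncio
from collections.abc import Awaitable
from typing import TypeVar

T = TypeVar("T")


async def with_reply_timeout(awaitable: Awaitable[T], timeout: float = 30.0) -> T:
    """Await `awaitable`, giving up after `timeout` seconds; 0 disables the limit."""
    # asyncio.wait_for treats a timeout of None as "wait indefinitely".
    return await asyncio.wait_for(awaitable, timeout=timeout if timeout > 0 else None)
```

When the limit is exceeded, asyncio.wait_for cancels the awaited work and raises a timeout error, so callers that want a fallback reply can catch it.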

VAD Configuration

from deepslate.livekit import RealtimeModel, VadConfig

llm = RealtimeModel(
    vad_config=VadConfig(
        confidence_threshold=0.5,   # 0.0–1.0: minimum confidence to classify as speech
        min_volume=0.01,            # 0.0–1.0: minimum volume to classify as speech
        start_duration_ms=200,      # ms of speech required to trigger start
        stop_duration_ms=500,       # ms of silence required to trigger stop
        backbuffer_duration_ms=1000 # ms of audio buffered before detection triggers
    )
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| confidence_threshold | float | 0.5 | Minimum confidence to consider audio as speech (0.0–1.0) |
| min_volume | float | 0.01 | Minimum volume threshold (0.0–1.0) |
| start_duration_ms | int | 200 | Duration of speech required to detect start (ms) |
| stop_duration_ms | int | 500 | Duration of silence required to detect end (ms) |
| backbuffer_duration_ms | int | 1000 | Audio buffer captured before speech detection triggers (ms) |
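
To make the interplay of these parameters concrete, here is a toy frame-based detector. It is an illustration only and not Deepslate's server-side VAD: the class ToyVad, the 20 ms frame size, and the per-frame confidence and volume inputs are all assumptions, and backbuffer_duration_ms is not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyVad:
    """Illustrative detector; NOT Deepslate's implementation."""

    confidence_threshold: float = 0.5
    min_volume: float = 0.01
    start_duration_ms: int = 200
    stop_duration_ms: int = 500
    frame_ms: int = 20  # assumed frame size
    speaking: bool = False
    # Length of the current run of frames that contradict `speaking`.
    _run_ms: int = field(default=0, init=False, repr=False)

    def push(self, confidence: float, volume: float) -> Optional[str]:
        """Feed one frame; return "start" or "stop" on a state change, else None."""
        is_speech = (
            confidence >= self.confidence_threshold and volume >= self.min_volume
        )
        if is_speech == self.speaking:
            self._run_ms = 0  # frame agrees with the current state
            return None
        self._run_ms += self.frame_ms
        needed = self.stop_duration_ms if self.speaking else self.start_duration_ms
        if self._run_ms < needed:
            return None
        self.speaking = not self.speaking
        self._run_ms = 0
        return "start" if self.speaking else "stop"
```

With the defaults, ten consecutive 20 ms speech frames (200 ms) produce "start", and 25 consecutive silent frames (500 ms) produce "stop". Shortening either duration makes the detector react faster but also more easily fooled by brief noise or pauses.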

Tuning tips:

  • Noisy environments: Increase confidence_threshold (0.6–0.8) and min_volume (0.02–0.05)
  • Lower latency: Decrease start_duration_ms (100–150) and stop_duration_ms (200–300)
  • Natural pacing: Slightly increase stop_duration_ms (600–800)
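
The tips above can be captured as named profiles layered over the documented defaults. The profile names and the specific values (picked from inside the suggested ranges) are assumptions for illustration; vad_kwargs is a hypothetical helper, not part of the plugin.

```python
# Defaults as documented in the VadConfig table.
VAD_DEFAULTS = {
    "confidence_threshold": 0.5,
    "min_volume": 0.01,
    "start_duration_ms": 200,
    "stop_duration_ms": 500,
    "backbuffer_duration_ms": 1000,
}

# Overrides chosen from within the ranges suggested in the tuning tips.
VAD_PROFILES = {
    "noisy": {"confidence_threshold": 0.7, "min_volume": 0.03},
    "low_latency": {"start_duration_ms": 120, "stop_duration_ms": 250},
    "natural_pacing": {"stop_duration_ms": 700},
}


def vad_kwargs(profile: str) -> dict:
    """Return keyword arguments for the named profile (hypothetical helper)."""
    return {**VAD_DEFAULTS, **VAD_PROFILES[profile]}
```

Since VadConfig accepts these names as keyword arguments in the example above, the result can be passed along as VadConfig(**vad_kwargs("noisy")).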

HostedTtsConfig

Use a voice cloned and hosted within Deepslate. No external TTS credentials required.

from deepslate.livekit import RealtimeModel, HostedTtsConfig, HostedTtsMode

llm = RealtimeModel(
    tts_config=HostedTtsConfig(
        voice_id="c3dfa73f-a1ab-4aad-b48a-0e9b9fe4a69f",
        mode=HostedTtsMode.HIGH_QUALITY,  # or LOW_LATENCY
    )
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| voice_id | str | required | ID of the hosted (cloned) voice |
| mode | HostedTtsMode | HostedTtsMode.HIGH_QUALITY | Quality/latency tradeoff for speech generation |

HostedTtsMode values:

| Value | Description |
|---|---|
| HIGH_QUALITY | Best output quality at still relatively low latency. Recommended for most use cases (default). |
| LOW_LATENCY | Low-latency generation mode that completes almost immediately. Output quality may be significantly reduced. |

ElevenLabsTtsConfig

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | env: ELEVENLABS_API_KEY | ElevenLabs API key |
| voice_id | str | env: ELEVENLABS_VOICE_ID | Voice ID (e.g., '21m00Tcm4TlvDq8ikWAM' for Rachel) |
| model_id | str \| None | env: ELEVENLABS_MODEL_ID | Model ID, e.g., 'eleven_turbo_v2'; uses ElevenLabs default if unset |
| location | ElevenLabsLocation | ElevenLabsLocation.US | Regional API endpoint (US works with all accounts; EU/INDIA require enterprise) |

Use ElevenLabsTtsConfig.from_env() to load from environment variables.
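
For readers unfamiliar with the from_env() pattern, the sketch below shows how such a loader might work. TtsEnvSettings is a hypothetical stand-in, not the plugin's ElevenLabsTtsConfig, and how the real class reacts to a missing variable is not documented here; raising ValueError is an assumption.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TtsEnvSettings:
    """Hypothetical stand-in for illustration; not the plugin's class."""

    api_key: str
    voice_id: str
    model_id: Optional[str] = None

    @classmethod
    def from_env(cls, environ: Mapping[str, str] = os.environ) -> "TtsEnvSettings":
        try:
            api_key = environ["ELEVENLABS_API_KEY"]
            voice_id = environ["ELEVENLABS_VOICE_ID"]
        except KeyError as exc:
            # Assumed behavior: fail loudly when a required variable is absent.
            raise ValueError(f"Missing environment variable: {exc.args[0]}") from None
        # The model ID is optional; an empty string is treated as unset.
        return cls(api_key, voice_id, environ.get("ELEVENLABS_MODEL_ID") or None)
```

Accepting the environment as a parameter keeps the loader easy to test with a plain dictionary.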


Function Tools

Use LiveKit's @function_tool() decorator to expose tools to the model:

from livekit.agents import Agent, function_tool, RunContext
from deepslate.livekit import RealtimeModel


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful assistant.")

    @function_tool()
    async def get_weather(self, context: RunContext, location: str) -> str:
        """Get the current weather for a given city."""
        # Your implementation here
        return f"It's sunny and 22°C in {location}."
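
One way to keep a tool like get_weather easy to test is to put its logic in a plain coroutine and have the decorated method delegate to it. The sketch below shows only that plain coroutine; describe_weather and the canned data are hypothetical, and a real tool would call a weather service instead.

```python
import asyncio

# Hypothetical canned data standing in for a real weather API.
_FAKE_WEATHER = {"berlin": ("sunny", 22), "oslo": ("snowy", -3)}


async def describe_weather(location: str) -> str:
    """Return a short spoken-style weather summary for `location`."""
    conditions = _FAKE_WEATHER.get(location.strip().lower())
    if conditions is None:
        # Return a sentence the model can relay instead of raising.
        return f"I don't have weather data for {location}."
    sky, temp_c = conditions
    return f"It's {sky} and {temp_c}°C in {location}."
```

Inside the agent, the tool body would then reduce to return await describe_weather(location), and the coroutine can be exercised directly with asyncio.run in a unit test.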

Sending a Welcome Message

DeepslateRealtimeSession emits a "session_initialized" event once the WebSocket session is fully initialized and ready to accept messages. Listen for this event to send a welcome message instead of relying on a fixed delay:

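import asyncio

from livekit import agents
from livekit.agents import AgentSession

from deepslate.livekit import RealtimeModel, ElevenLabsTtsConfig

# `server` and `Assistant` are defined as in Quick Start.

@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    model = RealtimeModel(tts_config=ElevenLabsTtsConfig.from_env())
    session = AgentSession(llm=model)

    # Speak the greeting as soon as the Deepslate session reports it is ready.
    deepslate_session = model.session()
    deepslate_session.on("session_initialized", lambda _: asyncio.create_task(
        deepslate_session.speak_direct("Hello! How can I help you today?")
    ))

    await session.start(room=ctx.room, agent=Assistant())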
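
The event-driven pattern can be tried without any network connection. The sketch below uses a toy session class with the same method names (on, speak_direct) purely for illustration; ToySession, its emit method, and demo are hypothetical and say nothing about how the plugin is implemented. It also keeps a reference to the created task, because the asyncio event loop holds only weak references to tasks.

```python
import asyncio
from typing import Callable


class ToySession:
    """Hypothetical stand-in for a realtime session; not the plugin's class."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[object], None]]] = {}
        self.spoken: list[str] = []

    def on(self, event: str, handler: Callable[[object], None]) -> None:
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event: str, payload: object = None) -> None:
        for handler in self._handlers.get(event, []):
            handler(payload)

    async def speak_direct(self, text: str) -> None:
        self.spoken.append(text)


async def demo() -> list[str]:
    session = ToySession()
    pending: set[asyncio.Task] = set()  # strong references to in-flight tasks

    def on_ready(_: object) -> None:
        task = asyncio.create_task(session.speak_direct("Hello!"))
        pending.add(task)
        task.add_done_callback(pending.discard)

    session.on("session_initialized", on_ready)
    session.emit("session_initialized")  # in the plugin, the session emits this
    await asyncio.gather(*pending)
    return session.spoken
```

Running asyncio.run(demo()) returns the list of greetings the toy session "spoke". The handler runs only once the event fires, so no fixed delay is needed.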

Examples

The examples/ directory contains a ready-to-run agent you can use as a starting point.

chat_agent.py — Voice assistant with function tools

A fully working LiveKit agent that demonstrates:

  • Connecting to a LiveKit room
  • Server-side ElevenLabs TTS with interruption handling
  • Two example function tools: lookup_weather and get_current_location

packages/livekit/examples/
├── chat_agent.py      # The agent
└── .env.example       # Required environment variables

Setup:

# 1. Install dependencies
pip install deepslate-livekit python-dotenv

# 2. Configure credentials
cd packages/livekit/examples
cp .env.example .env
# Edit .env and fill in your credentials

# 3. Run
python chat_agent.py dev


License

Apache License 2.0 — see LICENSE for details.
