Build real-time AI voice agents in Python. Zero Runtime runs the speech-to-speech pipeline (STT, LLM, TTS) for you.

ZRT — Zero Runtime Python SDK

Build real-time AI voice agents in Python without running the infrastructure. You write the agent (instructions, tools, logic); Zero Runtime runs the live speech-to-speech pipeline in the cloud at low latency: speech-to-text → LLM → text-to-speech, with turn detection, denoising, and interruption handling.

Write the agent. We run the runtime.

A different kind of voice SDK

Most voice frameworks make you run the hard part: media servers, GPUs, turn-taking, autoscaling. No-code platforms hide all that but lock you into a dashboard. Zero Runtime sits in between: real Python and your own providers, with none of the real-time infrastructure to operate.

Capability	Self-hosted frameworks	No-code platforms	Zero Runtime
Write real Python + custom tools	✅	❌ (dashboard)	✅
Managed media servers / GPUs / scaling	❌ you run it	✅ managed	✅ managed
Swap any STT / LLM / TTS provider	✅	limited	✅
Low-latency speech-to-speech	you tune it	managed	managed

Requirements

Python 3.11+
A ZRT runtime endpoint + auth token (from your Zero Runtime account)
API key(s) for the providers you use (e.g. Deepgram, Google, Cartesia)

Install

pip install --pre zrt

Public beta: the --pre flag is required until a stable release is published.

Quickstart

1. Set your environment

export ZRT_RUNTIME_ADDRESS=us1.rt.zeroruntime.ai:443   # your ZRT runtime (host:port)
export ZRT_AUTH_TOKEN="<your-token>"                   # quote it: a bare <...> is shell redirection syntax

export DEEPGRAM_API_KEY="<key>"    # speech-to-text
export GOOGLE_API_KEY="<key>"      # the LLM (Gemini)
export CARTESIA_API_KEY="<key>"    # text-to-speech
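
Because a missing key usually only surfaces once a call is in progress, it can help to check the environment before starting the worker. The following is a minimal sketch, not part of the SDK: the variable names come from the exports above, while `missing_vars`, `split_address`, and `preflight` are hypothetical helpers written for this example.

```python
import os
import sys

# Variable names are taken from the quickstart; the helpers are hypothetical.
REQUIRED_VARS = [
    "ZRT_RUNTIME_ADDRESS",
    "ZRT_AUTH_TOKEN",
    "DEEPGRAM_API_KEY",
    "GOOGLE_API_KEY",
    "CARTESIA_API_KEY",
]


def missing_vars(environ, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not environ.get(name, "").strip()]


def split_address(address):
    """Split 'host:port' into (host, port), rejecting malformed values."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


def preflight(environ, version=sys.version_info):
    """Collect human-readable problems instead of failing on the first one."""
    problems = []
    if version < (3, 11):
        problems.append("Python 3.11+ is required")
    missing = missing_vars(environ)
    if missing:
        problems.append("missing environment variables: " + ", ".join(missing))
    address = environ.get("ZRT_RUNTIME_ADDRESS", "").strip()
    if address:
        try:
            split_address(address)
        except ValueError as exc:
            problems.append(str(exc))
    return problems


if __name__ == "__main__":
    for problem in preflight(os.environ):
        print("preflight:", problem)
```

An empty list from `preflight(os.environ)` means the Python version and all five variables look usable; it does not confirm that the token or keys are valid.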

2. Write your agent — agent.py

from zrt.agents import (
    Agent, AgentSession, Pipeline, WorkerJob, JobContext, RoomOptions,
    EOUConfig, InterruptConfig,
)
from zrt.plugins.deepgram import DeepgramSTT
from zrt.plugins.google import GoogleLLM
from zrt.plugins.cartesia import CartesiaTTS
from zrt.plugins.silero import SileroVAD
from zrt.plugins.turn_detector import NamoTurnDetectorV1
from zrt.plugins.rnnoise import RNNoise

IGNORE_PATTERNS = [r"\b(uh+|um+)\b"]   # filler words to drop from transcripts


class Assistant(Agent):
    def __init__(self):
        super().__init__(instructions="You are a friendly voice assistant. Keep replies short.")

    async def on_enter(self):
        await self.session.say("Hi! How can I help?")

    async def on_exit(self):
        pass


async def entrypoint(ctx: JobContext):
    session = AgentSession(
        agent=Assistant(),
        pipeline=Pipeline(
            stt=DeepgramSTT(),                      # speech-to-text
            llm=GoogleLLM(                          # the LLM (Gemini)
                model="gemini-2.5-flash",
                thinking_budget=0,
                include_thoughts=False,
                max_output_tokens=8192,
            ),
            tts=CartesiaTTS(),                      # text-to-speech
            vad=SileroVAD(threshold=0.4),           # voice activity detection
            turn_detector=NamoTurnDetectorV1(language="en", threshold=0.8),  # turn detection
            denoise=RNNoise(),                      # denoising
            eou_config=EOUConfig(mode="ADAPTIVE", min_max_speech_wait_timeout=[0.1, 0.3]),
            interrupt_config=InterruptConfig(
                interrupt_min_duration=0.5,
                interrupt_min_words=2,
                resume_on_false_interrupt=True,
            ),
            stt_filter_patterns=IGNORE_PATTERNS,
            stt_word_substitutions={"recording": "", "recorded": ""},
        ),
    )
    await session.start(wait_for_participant=True, run_until_shutdown=True)


if __name__ == "__main__":
    WorkerJob(
        entrypoint=entrypoint,
        jobctx=lambda: JobContext(room_options=RoomOptions(name="Assistant")),
    ).start()

3. Run it

python agent.py

That's it — speech in → your agent → speech out, in real time.
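
The quickstart passes `stt_filter_patterns` and `stt_word_substitutions` without showing their effect on a transcript. The sketch below is one plausible reading, written with the standard `re` module only: it assumes the patterns are applied as case-insensitive regex removals and the substitutions as whole-word replacements. The runtime's actual matching rules and ordering are not documented here, and `clean_transcript` is a hypothetical helper, not an SDK function.

```python
import re

IGNORE_PATTERNS = [r"\b(uh+|um+)\b"]                     # same pattern as agent.py
WORD_SUBSTITUTIONS = {"recording": "", "recorded": ""}   # same mapping as agent.py


def clean_transcript(text, patterns=IGNORE_PATTERNS, substitutions=WORD_SUBSTITUTIONS):
    """Drop filler matches, apply whole-word substitutions, then tidy spacing.

    Assumption: this only approximates what the two pipeline options do.
    """
    for pattern in patterns:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    for word, replacement in substitutions.items():
        text = re.sub(rf"\b{re.escape(word)}\b", replacement, text, flags=re.IGNORECASE)
    # Removing words leaves doubled spaces behind; collapse them.
    return " ".join(text.split())


print(clean_transcript("um hello uhh can you hear me"))   # hello can you hear me
```

The `\b` word boundaries matter: they keep the filler pattern from eating the start of ordinary words such as "umbrella".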

How it works

Piece	What it is
`Agent`	Your behavior — instructions, tools, what it says on enter/exit.
`Pipeline`	The voice stack: STT (hear) → LLM (think) → TTS (speak), plus VAD, turn detection, and denoising.
`WorkerJob`	Runs your agent and connects it to Zero Runtime.
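
To make the stage order in the table concrete, here is a toy, in-process stand-in for a single conversational turn. It is not how Zero Runtime is implemented and uses none of the SDK's classes: `ToyPipeline` and its fake stages are hypothetical, and VAD, turn detection, denoising, and streaming are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyPipeline:
    """Stand-in for the STT -> LLM -> TTS chain; every stage is a plain function."""
    stt: Callable[[bytes], str]      # audio in -> transcript
    llm: Callable[[str], str]        # transcript -> reply text
    tts: Callable[[str], bytes]      # reply text -> audio out

    def handle_turn(self, audio: bytes) -> bytes:
        transcript = self.stt(audio)   # hear
        reply = self.llm(transcript)   # think
        return self.tts(reply)         # speak


# Fake stages so the sketch runs without any provider or network access.
pipeline = ToyPipeline(
    stt=lambda audio: audio.decode("utf-8"),
    llm=lambda text: f"You said: {text}",
    tts=lambda text: text.encode("utf-8"),
)

print(pipeline.handle_turn(b"hello"))   # b'You said: hello'
```

Each stage only depends on the type it receives and returns, which is why one stage can be replaced without touching the other two.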

Give your agent tools

Let the LLM call your Python functions — just decorate them:

from zrt.agents import function_tool

@function_tool
async def get_weather(city: str) -> dict:
    """Get the weather for a city.

    Args:
        city: City name
    """
    return {"city": city, "temp_c": 22}

# then pass them to your agent:
#   super().__init__(instructions="...", tools=[get_weather])

Your tool runs in your worker; the runtime calls it when the LLM decides to.
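
The type hints and docstring on a tool are what make it callable by an LLM. The sketch below shows, using only the standard library, how a decorator of this kind could derive a description from the function and how a call could be dispatched by name with JSON arguments. It is an illustration under assumptions: `describe_tool`, `call_tool`, and the dictionary layout are hypothetical and say nothing about what `function_tool` actually produces or how the runtime invokes tools.

```python
import asyncio
import inspect
import json


def describe_tool(func):
    """Build a JSON-serialisable description from a signature and docstring."""
    doc = inspect.getdoc(func) or ""
    params = {
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in inspect.signature(func).parameters.items()
    }
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": params,
    }


async def call_tool(registry, name, arguments_json):
    """Look up a tool by name and await it with JSON-decoded keyword arguments."""
    return await registry[name](**json.loads(arguments_json))


async def get_weather(city: str) -> dict:
    """Get the weather for a city.

    Args:
        city: City name
    """
    return {"city": city, "temp_c": 22}


registry = {get_weather.__name__: get_weather}
description = describe_tool(get_weather)
result = asyncio.run(call_tool(registry, "get_weather", '{"city": "Pune"}'))
print(description)
print(result)
```

Whatever the SDK does internally, the practical takeaway is the same: keep parameter annotations and the first docstring line accurate, since they are the only description of the tool the model is likely to see.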

Providers

Mix and match: pick the best model for each stage, and swap any one of them with a one-line change:

Speech-to-text (STT): Deepgram, AssemblyAI, Google, Azure, Gladia, NVIDIA, Sarvam
LLM: OpenAI, Google Gemini, Anthropic Claude, Groq, Cerebras, xAI Grok, Sarvam
Text-to-speech (TTS): Cartesia, ElevenLabs, Google, AWS Polly, Azure, Deepgram, Rime, LMNT, Neuphonic, Hume AI, Inworld, Murf, Resemble, Smallest, Speechify, CambAI, NVIDIA
Realtime speech-to-speech: OpenAI Realtime, Gemini Live, Ultravox, Azure Voice Live
Turn detection: Namo · VAD: Silero · Denoise: RNNoise

from zrt.plugins.elevenlabs import ElevenLabsTTS   # different TTS
from zrt.plugins.anthropic import AnthropicLLM      # different LLM
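
If the provider should be chosen at deploy time rather than in code, a small registry can map a name to the plugin class. This is a sketch under assumptions: the module paths and class names are the ones shown in this README, but the registry, `resolve`, and `load` are hypothetical helpers, and `load` only works where the zrt package and the relevant plugin are installed.

```python
import importlib

# Module paths and class names as shown in this README;
# the registry and helpers around them are hypothetical.
TTS_PROVIDERS = {
    "cartesia": ("zrt.plugins.cartesia", "CartesiaTTS"),
    "elevenlabs": ("zrt.plugins.elevenlabs", "ElevenLabsTTS"),
}


def resolve(name, registry=TTS_PROVIDERS):
    """Return (module_path, class_name) for a provider name, case-insensitively."""
    key = name.strip().lower()
    if key not in registry:
        raise KeyError(f"unknown provider {name!r}; choose from {sorted(registry)}")
    return registry[key]


def load(name, registry=TTS_PROVIDERS):
    """Import the provider class lazily; requires the zrt package to be installed."""
    module_path, class_name = resolve(name, registry)
    return getattr(importlib.import_module(module_path), class_name)


print(resolve("ElevenLabs"))
```

In an agent this might be used as `tts=load(os.environ.get("TTS_PROVIDER", "cartesia"))()`, where `TTS_PROVIDER` is a made-up variable name and the no-argument constructor is assumed from the quickstart's `CartesiaTTS()`; other plugins may need explicit arguments.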

Use cases

Phone & telephony agents, IVR replacement, customer-support voice bots, voice assistants, outbound/inbound call automation, and any real-time conversational AI.

FAQ

How is this different from a voice-agent framework?

Frameworks make you host and scale the real-time runtime (media, GPUs, turn-taking). ZRT runs that for you — you only write and deploy the agent.

How is it different from a no-code voice platform?

You write real Python with your own tools, logic, and providers — not a dashboard configuration. Full code control, zero infrastructure.

Can I use my own STT / LLM / TTS providers?

Yes — mix any supported providers, and bring your own API keys.

What do I need to run it?

A ZRT runtime endpoint + token and the provider keys for the stages you use.

Examples

More complete examples: https://github.com/ZeroRuntimeAI/zrt-python-sdk-examples

Contact

support@videosdk.live

Project details

Release history

This version: 0.0.1b1 (pre-release), Jun 11, 2026

Download files

Source Distribution

zrt-0.0.1b1.tar.gz (127.8 kB), uploaded Jun 11, 2026, Source

Built Distribution

zrt-0.0.1b1-py3-none-any.whl (161.1 kB), uploaded Jun 11, 2026, Python 3

File details

Details for the file zrt-0.0.1b1.tar.gz.

File metadata

Download URL: zrt-0.0.1b1.tar.gz
Upload date: Jun 11, 2026
Size: 127.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for zrt-0.0.1b1.tar.gz
Algorithm	Hash digest
SHA256	`af47adf8c0b4082d3d23a17c55ce3c2b3ff02cd0e811632f8553056c7f226552`
MD5	`7ac6a1d7f02299117b4c9e1c9f20d754`
BLAKE2b-256	`637cbd8fe754107ebc9ccfb79a90654e182bcc432195ac504acb5bcf40f0c558`
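
A downloaded archive can be checked against the SHA256 digest in the table with the standard `hashlib` module. This is a minimal sketch: `sha256_of` and `matches` are hypothetical helper names, and the file is assumed to sit in the current directory under its published name.

```python
import hashlib
import hmac

# SHA256 digest for zrt-0.0.1b1.tar.gz, copied from the table above.
EXPECTED_SDIST_SHA256 = "af47adf8c0b4082d3d23a17c55ce3c2b3ff02cd0e811632f8553056c7f226552"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case and spacing."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Calling `matches("zrt-0.0.1b1.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact download. pip can enforce the same check at install time through its hash-checking mode (`--require-hashes` with a requirements file).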

File details

Details for the file zrt-0.0.1b1-py3-none-any.whl.

File metadata

Download URL: zrt-0.0.1b1-py3-none-any.whl
Upload date: Jun 11, 2026
Size: 161.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for zrt-0.0.1b1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0afee24b2fede879589e0d61c750ff51f0f11492e36f2097175adffcfb5127e0`
MD5	`efce4fab0fd77951e2999c018053943b`
BLAKE2b-256	`bf106a73607a7381b2b0fa80b6c9eaea1354209587a4c2091e859c0b8107fa91`
