Enterprise-grade Observability and Evaluation SDK for Voice Agents
VoiceEval SDK (Python)
VoiceEval is an enterprise-grade observability and evaluation SDK for Voice Agents and LLM-powered applications. Built on OpenTelemetry, it provides zero-config auto-instrumentation with detailed tracing, latency breakdown, and cost analysis.
Key Features
- Zero-Config Auto-Instrumentation: Automatically traces calls from major LLM providers (OpenAI, Anthropic, Google Gemini) and LiveKit Agents — no code changes needed.
- LiveKit Native: Automatically integrates with LiveKit's tracing infrastructure. Just initialize the Client and all agent spans are captured.
- Selective Monitoring: Control which calls are traced with `auto_monitor`, `sample_rate`, `monitor_call()`, and `skip_call()`.
- High Performance: Built on OpenTelemetry with async batch exports (OTLP/HTTP), ensuring negligible runtime overhead.
Installation
```shell
pip install voiceeval-sdk
# or
uv add voiceeval-sdk
```
Quickstart
1. Initialize the Client
Add a single Client(...) call at the top of your agent file. This sets up OTel tracing and auto-instruments all installed LLM libraries and LiveKit.
```python
from voiceeval import Client

client = Client(
    api_key="your_voiceeval_api_key",  # or set VOICE_EVAL_API_KEY env var
    agent_name="my-booking-agent",     # identifies this agent in the dashboard
)
```
2. LiveKit Agent Example
```python
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from voiceeval import Client

# Initialize VoiceEval: auto-instruments all LLM calls and LiveKit spans
client = Client(
    api_key="your_voiceeval_api_key",
    agent_name="my-booking-agent",
)

class MyAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are a helpful voice assistant.")

async def entrypoint(ctx: JobContext):
    session = AgentSession(
        stt=...,
        llm=...,
        tts=...,
    )
    await session.start(agent=MyAgent(), room=ctx.room)
    await ctx.connect()

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```
3. Standalone LLM Example
Works without LiveKit too — any OpenAI/Anthropic/Gemini calls are automatically traced:
```python
from voiceeval import Client
from openai import OpenAI

client = Client(api_key="your_voiceeval_api_key")

openai_client = OpenAI()
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello world"}],
)
# Trace is automatically captured and exported
```
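The cost analysis mentioned above is derived from the token usage recorded on each traced call. The sketch below is illustrative only: the attribute names follow OpenTelemetry's GenAI semantic conventions rather than anything VoiceEval-specific, and the per-token rates and usage numbers are placeholders, not real pricing.

```python
# Illustrative span attributes (OTel GenAI semantic convention names);
# the usage numbers and prices below are placeholders, not real data.
span_attributes = {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.usage.input_tokens": 9,
    "gen_ai.usage.output_tokens": 12,
}

PRICE_PER_1M_TOKENS = {"input": 2.50, "output": 10.00}  # placeholder USD rates

def span_cost(attrs: dict) -> float:
    """Compute a per-call cost estimate from token counts on a span."""
    return (
        attrs["gen_ai.usage.input_tokens"] / 1e6 * PRICE_PER_1M_TOKENS["input"]
        + attrs["gen_ai.usage.output_tokens"] / 1e6 * PRICE_PER_1M_TOKENS["output"]
    )

print(f"estimated cost: ${span_cost(span_attributes):.6f}")
```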
Client Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` | `VOICE_EVAL_API_KEY` env var | Your VoiceEval API key |
| `base_url` | `str` | `https://api.voiceeval.com/v1/traces` | VoiceEval ingestion endpoint |
| `agent_name` | `str` | `None` | Agent identifier shown in the dashboard |
| `auto_monitor` | `bool` | `True` | Monitor all calls automatically |
| `sample_rate` | `float` | `1.0` | Fraction of calls to monitor (0.0 to 1.0) |
| `span_post_processors` | `list` | `None` | Custom span post-processing functions |
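`span_post_processors` is not shown in action in this README, and the exact callback signature is not documented here. Conceptually, though, a post-processor is a function that receives a span and returns a (possibly modified) span before export. A stdlib-only sketch of the idea, using a plain dict as a stand-in span and a hypothetical redaction rule:

```python
def redact_prompts(span: dict) -> dict:
    """Illustrative post-processor: drop raw prompt text before export.
    The real callback signature may differ; this shows only the concept."""
    attrs = dict(span.get("attributes", {}))
    for key in list(attrs):
        if key.endswith(".prompt") or key.endswith(".content"):
            attrs[key] = "[REDACTED]"
    return {**span, "attributes": attrs}

span = {
    "name": "llm.call",
    "attributes": {"gen_ai.prompt": "caller shared an account number", "gen_ai.request.model": "gpt-4o"},
}
print(redact_prompts(span)["attributes"])
```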
Selective Monitoring
By default, every call is monitored (auto_monitor=True). You can control this at the client level or per-call.
Sample a fraction of calls
```python
client = Client(
    api_key="your_voiceeval_api_key",
    agent_name="my-booking-agent",
    sample_rate=0.1,  # Randomly monitor 10% of calls
)
```
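Under the hood, per-call sampling like this amounts to a head-sampling decision: each call is kept with probability `sample_rate`. A minimal stand-in for that decision logic (not the SDK's actual implementation):

```python
import random

def should_monitor(sample_rate: float, auto_monitor: bool = True) -> bool:
    """Head-sampling decision mirroring the documented defaults."""
    if not auto_monitor:
        return False  # opt-in only, e.g. via monitor_call()
    return random.random() < sample_rate

random.seed(0)  # seeded only to make the demo repeatable
kept = sum(should_monitor(0.1) for _ in range(10_000))
print(kept)  # close to 1,000 of 10,000 calls at sample_rate=0.1
```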
Skip specific calls
With the default auto_monitor=True, all calls are monitored. Use skip_call() inside your session handler to opt out a specific call:
```python
from livekit.agents import AgentSession, JobContext
from voiceeval import Client, skip_call

client = Client(
    api_key="your_voiceeval_api_key",
    agent_name="my-booking-agent",
)

async def entrypoint(ctx: JobContext):
    # Decide based on room metadata, participant info, etc.
    if ctx.room.name.startswith("internal-"):
        skip_call()  # This call won't be monitored or evaluated
    session = AgentSession(stt=..., llm=..., tts=...)
    await session.start(agent=MyAgent(), room=ctx.room)
    await ctx.connect()
```
Monitor only specific calls
Set auto_monitor=False so no calls are monitored by default, then use monitor_call() to opt in:
```python
from livekit.agents import AgentSession, JobContext
from voiceeval import Client, monitor_call

client = Client(
    api_key="your_voiceeval_api_key",
    agent_name="my-booking-agent",
    auto_monitor=False,
)

async def entrypoint(ctx: JobContext):
    # Only monitor production calls, not test rooms
    if not ctx.room.name.startswith("test-"):
        monitor_call()  # This call will be traced and evaluated
    session = AgentSession(stt=..., llm=..., tts=...)
    await session.start(agent=MyAgent(), room=ctx.room)
    await ctx.connect()
```
When a call is skipped (or not opted in), spans still flow to Langfuse for the dashboard but won't create backend records or trigger evaluations.
Manual Tracing (Optional)
For non-LLM functions like business logic or RAG pipelines, use the @observe decorator:
```python
from voiceeval import observe

@observe(name_override="rag_retrieval")
def retrieve_documents(query: str):
    # Your logic here
    return docs
```
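For intuition about what a tracing decorator like this does, here is a simplified stdlib-only stand-in (named `observe_sketch` to make clear it is not the real decorator): it wraps the function, applies the name override, and records wall-clock duration. The real decorator exports an OTel span instead of printing.

```python
import functools
import time

def observe_sketch(name_override=None):
    """Simplified stand-in for a tracing decorator (illustrative only)."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                duration_ms = (time.perf_counter() - start) * 1000
                print(f"span={name_override or fn.__name__} duration_ms={duration_ms:.2f}")
        return inner
    return wrap

@observe_sketch(name_override="rag_retrieval")
def retrieve_documents(query: str):
    return [f"doc matching {query!r}"]

docs = retrieve_documents("refund policy")
```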
License
MIT
File details
Details for the file voiceeval_sdk-0.1.9.tar.gz.
File metadata
- Download URL: voiceeval_sdk-0.1.9.tar.gz
- Upload date:
- Size: 67.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `4f5ada785fbc87cfa117cbfaf5896dffd0be38b49287bbc72304418db56a3736` |
| MD5 | `0266a7739cca6a5d55548967c00eff36` |
| BLAKE2b-256 | `fae431cb93b77eba29ea93042c11deeae0816c2cb7fb7944d1a833911302bea9` |
File details
Details for the file voiceeval_sdk-0.1.9-py3-none-any.whl.
File metadata
- Download URL: voiceeval_sdk-0.1.9-py3-none-any.whl
- Upload date:
- Size: 17.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `4ebd146169c2b1e46fc13879502aeba271d1f8b046575962f67568621e606e0d` |
| MD5 | `d0a4899e62bf0833e13fc9a128cf15a2` |
| BLAKE2b-256 | `c911951e9351250d9798125890de125aedb2b2061640be0a88ca5a03da890a3f` |