
voiceobs

Open, vendor-neutral observability for voice AI conversations.


The Problem

Voice AI applications are hard to debug. When a conversation goes wrong, you're left asking:

  • Which turn caused the issue?
  • How long did each turn take?
  • Was it the user's input or the agent's response?
  • How do I correlate this with my existing traces?

Traditional logging doesn't capture the temporal, turn-based nature of voice conversations. You need observability that understands voice interactions.

The Solution

voiceobs instruments voice AI conversations as OpenTelemetry spans, giving you:

  • Turn-level visibility: See every user and agent turn as a span
  • Stage-level latency: Track ASR, LLM, and TTS processing separately
  • Failure detection: Automatically identify high latency, interruptions, and errors
  • Conversation correlation: All turns in a conversation share a conversation ID
  • CLI analysis tools: Analyze, compare, and report on trace data
  • Zero config: Works out of the box with console output

30-Second Quickstart

pip install voiceobs

from voiceobs import ensure_tracing_initialized, voice_conversation, voice_turn

# Initialize tracing (uses ConsoleSpanExporter by default)
ensure_tracing_initialized()

# Instrument your conversation
with voice_conversation() as conv:
    print(f"Conversation: {conv.conversation_id}")

    with voice_turn("user"):
        # Process user speech/transcription
        pass

    with voice_turn("agent"):
        # Generate and speak agent response
        pass

Output:

{
    "name": "voice.turn",
    "attributes": {
        "voice.schema.version": "0.0.2",
        "voice.conversation.id": "550e8400-e29b-41d4-a716-446655440000",
        "voice.turn.id": "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
        "voice.turn.index": 0,
        "voice.actor": "user"
    }
}

What's New in v0.0.2

🎯 Decorator API

Reduce boilerplate with decorators:

from voiceobs import voice_conversation_decorator, voice_turn_decorator

@voice_conversation_decorator()
async def handle_call():
    user_input = await get_user_input()
    response = await generate_response(user_input)
    return response

@voice_turn_decorator(actor="agent")
async def generate_response(text):
    # This function is automatically wrapped in a voice turn span
    return await llm.generate(text)

⚙️ Configuration System

Configure voiceobs with a YAML file:

voiceobs init  # Generate voiceobs.yaml

# voiceobs.yaml
exporters:
  jsonl:
    enabled: true
    path: ./traces.jsonl
  console:
    enabled: true

failures:
  thresholds:
    high_latency_ms: 3000
    interruption_rate: 0.1

📊 CLI Analysis Tools

# Analyze latency and failures
voiceobs analyze --input traces.jsonl

# Compare runs and detect regressions
voiceobs compare --baseline baseline.jsonl --current current.jsonl

# Generate shareable reports
voiceobs report --input traces.jsonl --format html --output report.html

🔌 Framework Integrations

Out-of-the-box support for:

  • LiveKit Agents - Auto-instrument LiveKit voice pipelines
  • Vocode - Auto-instrument Vocode conversations

See examples/ for integration demos.

CLI Demo

See voiceobs in action without writing any code:

voiceobs demo

This simulates a 4-turn conversation and prints the spans to the console.

Check your OpenTelemetry configuration:

voiceobs doctor

Analyze trace files to see latency metrics:

voiceobs analyze --input traces.jsonl

Installation

pip install voiceobs

Requirements: Python 3.9+

Usage

Basic Conversation Tracking

from voiceobs import voice_conversation, voice_turn

with voice_conversation() as conv:
    # User says something
    with voice_turn("user"):
        transcript = transcribe_audio(audio)
        process_user_input(transcript)

    # Agent responds
    with voice_turn("agent"):
        response = generate_response(transcript)
        synthesize_speech(response)

Custom Conversation IDs

# Use your own conversation ID for correlation with other systems
with voice_conversation(conversation_id="call-12345") as conv:
    with voice_turn("user"):
        pass

Nested Turns (System Processing)

with voice_conversation():
    with voice_turn("user"):
        # User turn spans the entire user processing

        with voice_turn("system"):
            # Nested system turn for internal processing
            run_safety_check()

With Existing OpenTelemetry Setup

voiceobs detects and respects your existing OpenTelemetry configuration:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Your existing setup
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

# voiceobs will use your provider, not override it
from voiceobs import ensure_tracing_initialized
ensure_tracing_initialized()  # Returns False, keeps your config

Accessing Current Context

from voiceobs import get_current_conversation, get_current_turn

with voice_conversation():
    with voice_turn("user"):
        conv = get_current_conversation()
        turn = get_current_turn()

        print(f"Conversation: {conv.conversation_id}")
        print(f"Turn {turn.turn_index} by {turn.actor}")

Span Attributes

Each voice.turn span includes:

Attribute                     Type    Description
voice.schema.version          string  Schema version (currently "0.0.2")
voice.conversation.id         string  UUID identifying the conversation
voice.turn.id                 string  UUID identifying this specific turn
voice.turn.index              int     Sequential turn number (0, 1, 2...)
voice.actor                   string  Who is speaking: "user", "agent", or "system"
voice.silence.after_user_ms   float   Response latency from end of user speech to start of agent speech
voice.turn.overlap_ms         float   Overlap duration in ms (positive = interruption)
voice.interruption.detected   bool    True if the agent started speaking before the user finished
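Because these attributes are plain key/value pairs, downstream tooling can consume them directly. As an illustrative sketch, a turn could be flagged from its attributes like this (the span dict mirrors the console output shown earlier; the `is_interrupted` helper is hypothetical, not part of the voiceobs API):

```python
# Illustrative helper: classify a turn from its exported span attributes.
# The helper itself is hypothetical, not part of the voiceobs API.

def is_interrupted(attributes: dict) -> bool:
    """A positive overlap means the agent spoke over the user."""
    return (
        attributes.get("voice.interruption.detected", False)
        or attributes.get("voice.turn.overlap_ms", 0.0) > 0
    )

span = {
    "name": "voice.turn",
    "attributes": {
        "voice.actor": "agent",
        "voice.turn.overlap_ms": 120.0,
        "voice.interruption.detected": True,
    },
}
print(is_interrupted(span["attributes"]))  # True
```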

JSONL Export and Analysis

Export traces to a JSONL file for offline analysis:

# Enable JSONL export
VOICEOBS_JSONL_OUT=./traces.jsonl python your_app.py

# Analyze the traces
voiceobs analyze --input traces.jsonl

Output shows stage latencies (ASR/LLM/TTS), response latency, and interruption rate:

voiceobs Analysis Report
==================================================

Stage Latencies (ms)
------------------------------
  ASR (n=2):
    mean: 165.2
    p50:  165.2
    p95:  180.0
  LLM (n=2):
    mean: 785.0
    p50:  785.0
    p95:  850.0
  TTS (n=2):
    mean: 300.0
    p50:  300.0
    p95:  320.0

Response Latency (silence after user)
------------------------------
  Samples: 2
  mean: 1115.0ms
  p95:  1250.0ms

Interruptions
------------------------------
  Agent turns: 2
  Interruptions: 0
  Rate: 0.0%
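The same numbers can also be derived by hand from the raw JSONL, which is useful for custom dashboards. A minimal sketch, assuming each line of the file is one exported span as a JSON object with an attributes map (as in the console output above):

```python
import json
from statistics import mean

def response_latencies(path: str) -> list:
    """Collect voice.silence.after_user_ms from every span in a JSONL trace file."""
    latencies = []
    with open(path) as f:
        for line in f:
            span = json.loads(line)
            value = span.get("attributes", {}).get("voice.silence.after_user_ms")
            if value is not None:
                latencies.append(float(value))
    return latencies

# Example: write two fake spans, then aggregate them
with open("traces.jsonl", "w") as f:
    for ms in (980.0, 1250.0):
        f.write(json.dumps({"name": "voice.turn",
                            "attributes": {"voice.silence.after_user_ms": ms}}) + "\n")

print(f"mean: {mean(response_latencies('traces.jsonl')):.1f}ms")  # mean: 1115.0ms
```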

Examples

See the examples/ directory for complete, runnable examples:

Example             Description
voice_pipeline      Complete voice chat with ASR (Deepgram), LLM (Gemini), TTS (Cartesia)
overlap_detection   Demonstrates barge-in and overlap/interruption detection

Each example includes setup instructions, API key configuration, and demonstrates different voiceobs features.

Integrations

Pipecat

See integrations/pipecat-examples/ for a complete example of instrumenting a Pipecat voice bot.

from voiceobs import voice_turn
from pipecat.frames.frames import TranscriptionFrame
from pipecat.processors.frame_processor import FrameProcessor

class VoiceObsUserTurnTracker(FrameProcessor):
    def __init__(self):
        super().__init__()
        self._turn = None

    async def process_frame(self, frame, direction):
        if isinstance(frame, TranscriptionFrame):
            # Close the previous user turn (if any) before starting a new one
            if self._turn is not None:
                self._turn.__exit__(None, None, None)
            self._turn = voice_turn("user")
            self._turn.__enter__()
        await self.push_frame(frame, direction)

API Reference

voice_conversation(conversation_id: Optional[str] = None)

Context manager for a voice conversation. Auto-generates UUID if not provided.

voice_turn(actor: Literal["user", "agent", "system"])

Context manager for a voice turn. Creates an OpenTelemetry span with voice attributes.

ensure_tracing_initialized() -> bool

Safely initializes tracing with ConsoleSpanExporter if no provider exists. Returns True if initialized, False if existing config was kept.

get_tracer_provider_info() -> dict

Returns diagnostic info about the current tracer provider.

get_current_conversation() -> Optional[ConversationContext]

Returns the current conversation context, or None if outside a conversation.

get_current_turn() -> Optional[TurnContext]

Returns the current turn context, or None if outside a turn.

Development

# Clone the repository
git clone https://github.com/voiceobs/voiceobs.git
cd voiceobs

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check src tests

License

Apache-2.0 - see LICENSE for details.
