
Voiceground

Observability framework for Pipecat voice and multimodal conversational AI.

Installation

pip install voiceground

Or with UV:

uv add voiceground

Call Simulation

Voiceground includes a call simulation feature for testing your bots with dynamic, LLM-powered simulated users. Instead of manual testing, you can define user personas and goals, and let the simulator have realistic conversations with your bot.

Voiceground Simulation

Quick Start

from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.openai.stt import OpenAISTTService
from pipecat.services.openai.tts import OpenAITTSService
from voiceground.simulation import VoicegroundSimulation, VoicegroundSimulatorConfig

# Configure the simulated user
config = VoicegroundSimulatorConfig(
    llm=OpenAILLMService(api_key=openai_key, model="gpt-4o-mini"),
    tts=OpenAITTSService(api_key=openai_key, voice="echo"),
    stt=OpenAISTTService(api_key=openai_key),
    system_prompt="""
        You are a customer calling to book a restaurant table.
        Keep your answers short and let the other side lead the conversation.
        Your goal: Book a table for 2 people tomorrow at 7pm.
        Be natural and conversational.
    """,
    initiate_conversation=True,  # Simulator speaks first
    max_turns=3,
    timeout_seconds=45,
)

# Run simulation
async with VoicegroundSimulation(config) as simulation:
    await run_bot(transport=simulation.transport)

# Results available after context exits
print(simulation.results.transcript)
print(f"Turns: {simulation.results.turn_count}")

Your run_bot function only needs to accept a transport parameter; the simulation transport is a drop-in replacement:

async def run_bot(transport):
    # Use transport.input() and transport.output() - same as LocalAudioTransport!
    pipeline = Pipeline([
        transport.input(),
        stt, llm, tts,
        transport.output(),
    ])
    runner = PipelineRunner()
    await runner.run(PipelineTask(pipeline))

The simulation automatically handles turn limiting and timeouts; no extra configuration is needed on the bot side.

Note: Simulations run faster than real-time because audio input/output is not buffered. This allows for rapid testing and iteration, but timing metrics may not reflect real-world performance characteristics.

Architecture

┌───────────────────────────┐          ┌───────────────────────────┐
│   Simulator Pipeline      │          │     Bot Pipeline          │
│   (The "Fake User")       │          │   (Your actual bot)       │
│                           │          │                           │
│   STT ◄───────────────────┼── audio ─┼─── TTS                    │
│    ↓                      │          │     ↑                     │
│   LLM (user persona)      │          │    LLM                    │
│    ↓                      │          │     ↑                     │
│   TTS ────────────────────┼── audio ─┼──► STT                    │
│                           │          │                           │
└───────────────────────────┘          └───────────────────────────┘
                  VoicegroundBridgeTransport

Both pipelines are standard Pipecat pipelines connected via VoicegroundBridgeTransport. The simulator's LLM has a system prompt that tells it to act as a user with specific goals.
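VoicegroundBridgeTransport's internals aren't shown here, but conceptually it can be pictured as a pair of in-memory audio queues, one per direction, that connect the two pipelines. The sketch below is purely illustrative (the ToyBridge class and its queue names are invented for this example, not the real API):

```python
import asyncio

class ToyBridge:
    """Illustrative stand-in for VoicegroundBridgeTransport:
    two in-memory queues carry audio chunks in opposite directions."""
    def __init__(self):
        self.sim_to_bot = asyncio.Queue()  # simulator TTS -> bot STT
        self.bot_to_sim = asyncio.Queue()  # bot TTS -> simulator STT

async def main():
    bridge = ToyBridge()

    async def simulator():
        # The "fake user" speaks, then waits for the bot's reply audio.
        await bridge.sim_to_bot.put(b"user-audio-chunk")
        return await bridge.bot_to_sim.get()

    async def bot():
        # The bot hears the user; STT -> LLM -> TTS would run here.
        heard = await bridge.sim_to_bot.get()
        await bridge.bot_to_sim.put(b"bot-audio-chunk")
        return heard

    bot_heard, sim_heard = await asyncio.gather(bot(), simulator())
    return bot_heard, sim_heard

print(asyncio.run(main()))  # (b'user-audio-chunk', b'bot-audio-chunk')
```

Because the queues are in-memory rather than real audio devices, nothing is rate-limited to wall-clock time, which is why simulations run faster than real time.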

VoicegroundSimulatorConfig Options

Option                 Type        Description
llm                    LLMService  LLM for generating user responses
tts                    TTSService  TTS for generating the user's voice
stt                    STTService  STT for transcribing bot speech
system_prompt          str         Instructions for the simulated user persona
initiate_conversation  bool        If True, the simulator speaks first (default: False)
max_turns              int         Maximum conversation turns (default: 10)
timeout_seconds        float       Maximum simulation duration in seconds (default: 120)

VoicegroundSimulationResults

After the simulation completes, simulation.results contains:

  • transcript: List of VoicegroundTranscriptEntry objects with role, text, and timestamp
  • events: All VoicegroundEvent objects captured during simulation
  • turn_count: Number of completed conversation turns
  • duration_seconds: Total simulation duration
  • termination_reason: Why the simulation ended (max_turns, timeout, or unknown)
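As a quick illustration of working with the transcript, the snippet below models entries with the three fields listed above (role, text, timestamp) and formats them for printing. The TranscriptEntry dataclass and format_transcript helper are stand-ins invented for this sketch; the real VoicegroundTranscriptEntry may differ in shape:

```python
from dataclasses import dataclass

@dataclass
class TranscriptEntry:
    """Illustrative stand-in for VoicegroundTranscriptEntry (role, text, timestamp)."""
    role: str
    text: str
    timestamp: float  # seconds since simulation start

def format_transcript(entries):
    """Render a transcript list like simulation.results.transcript, one line per entry."""
    return "\n".join(f"[{e.timestamp:6.2f}s] {e.role}: {e.text}" for e in entries)

transcript = [
    TranscriptEntry("user", "Hi, I'd like to book a table.", 0.00),
    TranscriptEntry("bot", "Sure! For how many people?", 1.85),
    TranscriptEntry("user", "Two people, tomorrow at 7pm.", 4.10),
]
print(format_transcript(transcript))
```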

VoicegroundObserver

Track conversation events following Pipecat's Observer pattern for observability and debugging.

Voiceground Observer

Quick Start

import uuid
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from voiceground import VoicegroundObserver, HTMLReporter

# Create observer with HTML reporter
conversation_id = str(uuid.uuid4())
reporter = HTMLReporter(output_dir="reports")
observer = VoicegroundObserver(
    reporters=[reporter],
    conversation_id=conversation_id
)

# Create pipeline task with observer
task = PipelineTask(
    pipeline=Pipeline([...]),
    observers=[observer]
)

# Run your pipeline

Tested With

Voiceground has been tested with the following Pipecat providers:

LLM Providers

  • OpenAI

STT Providers

  • ElevenLabs
  • OpenAI

TTS Providers

  • ElevenLabs
  • OpenAI

Event Categories

Voiceground tracks the following event categories:

Category    Types                   Description
user_speak  start, end              User speech events
bot_speak   start, end              Bot speech events
stt         start, end              Speech-to-text processing (includes transcription text)
llm         start, first_byte, end  LLM response generation (includes generated text)
tts         start, first_byte, end  Text-to-speech synthesis
tool_call   start, end              LLM function/tool calling
system      start, end              System events (e.g., context aggregation)
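Since each category emits paired start/end events, per-category durations fall out of matching each end with its preceding start. The sketch below shows that idea on simplified (category, type, timestamp) tuples; real VoicegroundEvent objects carry more data, and the pair_durations helper here is invented for illustration:

```python
def pair_durations(events):
    """Match each category's 'end' with its preceding 'start' and return
    per-category durations in milliseconds. Intermediate types such as
    'first_byte' are ignored. Assumes starts/ends alternate per category."""
    open_at = {}
    durations = {}
    for category, kind, t in events:
        if kind == "start":
            open_at[category] = t
        elif kind == "end" and category in open_at:
            durations.setdefault(category, []).append(
                round((t - open_at.pop(category)) * 1000, 1)
            )
    return durations

events = [
    ("user_speak", "start", 0.00), ("user_speak", "end", 1.20),
    ("stt", "start", 0.10), ("stt", "end", 1.35),
    ("llm", "start", 1.40), ("llm", "first_byte", 1.95), ("llm", "end", 2.60),
    ("tts", "start", 2.65), ("tts", "end", 3.10),
]
print(pair_durations(events))
```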

Opinionated Metrics

Voiceground tracks seven opinionated metrics per conversation turn, giving a detailed picture of where latency comes from in a voice conversation:

  1. Turn Duration: Total time from the first event to the last event in the turn (milliseconds). Measures the complete duration of a conversation turn.

  2. Response Time: Time from user_speak:end to bot_speak:start (or from the first event to bot_speak:start if the conversation started with bot speech). This is the end-to-end time the user experiences waiting for a response.

  3. Transcription Overhead: Time from user_speak:end to stt:end (milliseconds). Measures the latency of speech-to-text processing.

  4. Voice Synthesis Overhead: Time from tts:start to bot_speak:start (milliseconds). Measures the latency of text-to-speech synthesis.

  5. LLM Response Time: Time from llm:start to llm:first_byte (milliseconds). Measures the time-to-first-byte for the LLM response, indicating how quickly the model starts generating content.

  6. System Overhead: Time from stt:end to llm:start (milliseconds). Measures context aggregation and other system processing that occurs between transcription and LLM invocation. Includes labels/metadata about the system operations.

  7. Tools Overhead: Sum of all individual tool_call durations (each tool_call:end - tool_call:start) that occur between llm:start and llm:end (milliseconds). Measures the total time spent executing function/tool calls during LLM processing.

Metric Relationships

The metrics are related as follows:

  • Response Time ≈ Transcription Overhead + System Overhead + LLM Response Time + Tools Overhead + Voice Synthesis Overhead
  • Turn Duration includes all events in the turn and may be longer than Response Time if there are additional events before or after the main response flow
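The relationship above can be checked numerically from a single turn's event timestamps. The sketch below recomputes Response Time and its components from the metric definitions given earlier (the turn_metrics helper and the sample timestamps are invented for this example, and there are no tool calls in this turn, so Tools Overhead is zero):

```python
def turn_metrics(ts):
    """Compute a subset of the per-turn metrics from event timestamps
    (seconds). `ts` maps 'category:type' -> time, a simplified
    single-turn model with no tool calls."""
    ms = lambda a, b: round((ts[b] - ts[a]) * 1000, 1)
    return {
        "response_time":          ms("user_speak:end", "bot_speak:start"),
        "transcription_overhead": ms("user_speak:end", "stt:end"),
        "system_overhead":        ms("stt:end", "llm:start"),
        "llm_response_time":      ms("llm:start", "llm:first_byte"),
        "voice_synthesis":        ms("tts:start", "bot_speak:start"),
    }

ts = {
    "user_speak:end": 1.20, "stt:end": 1.35, "llm:start": 1.40,
    "llm:first_byte": 1.95, "tts:start": 1.95, "bot_speak:start": 2.40,
}
m = turn_metrics(ts)
print(m)
# Response Time equals the sum of the components for this turn:
# 1200 = 150 + 50 + 550 + 450
```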

Report Features

The generated HTML reports include:

  • Timeline Visualization: Interactive timeline showing all events and their relationships
  • Events Table: Detailed view of all tracked events with timestamps, sources, and data
  • Turns Table: Conversation turns with all 7 opinionated performance metrics
  • Metrics Summary: Average metrics across the conversation
  • Event Highlighting: Hover over events or turns to see related events highlighted

Examples

See the examples/ directory for complete working examples:

Bot Implementations

  • bots/restaurant_bot.py: Restaurant booking assistant bot
  • bots/friendly_assistant_bot.py: General-purpose friendly assistant bot

Both bots accept STT, LLM, and TTS services as parameters for flexibility.

Runner Scripts

  • simulations/run_openai_simulation.py: Call simulation with a restaurant booking scenario using OpenAI
  • observer/run_openai_restaurant_bot.py: Restaurant bot with OpenAI services (STT, LLM, TTS)
  • observer/run_elevenlabs_restaurant_bot.py: Restaurant bot with ElevenLabs (STT, TTS) and OpenAI (LLM)

To run an example:

# Install example dependencies
uv sync --all-extras

# Set required environment variables
export OPENAI_API_KEY=your_key
export ELEVENLABS_API_KEY=your_key  # For ElevenLabs examples

# Run a simulation (recommended first step)
python examples/simulations/run_openai_simulation.py

# Run a restaurant bot example
python examples/observer/run_openai_restaurant_bot.py
# or
python examples/observer/run_elevenlabs_restaurant_bot.py

Note: On macOS, you'll need to install portaudio for audio support:

brew install portaudio

Development

# Clone the repository
git clone https://github.com/poseneror/voiceground.git
cd voiceground

# Install all dependencies (including dev and examples)
uv sync --all-extras

# Run tests
uv run pytest

# Run linting
uv run ruff check .

# Run type checking
uv run mypy src

# Build the client
python scripts/develop.py build

# Run example (requires portaudio on macOS: brew install portaudio)
python scripts/develop.py example

License

BSD-2-Clause License - see LICENSE for details.
