Official Python SDK for TruGen AI - Real-time AI avatar streaming
TruGen AI Python SDK
All WebRTC and audio/video processing complexity (LiveKit, Acoustic Echo Cancellation, decoding, threading) is handled under the hood. You only ever need to import from trugen.
Installation
# Using uv (recommended)
uv add trugen-sdk
# With optional display utilities (for OpenCV and audio playback testing)
uv add trugen-sdk --extra display
# Using pip
pip install trugen-sdk
# With optional display utilities (for OpenCV and audio playback testing)
pip install "trugen-sdk[display]"
Quick Start
Simple OpenCV Video Display (Using TruGenRunner)
For most UI/desktop applications, TruGenRunner handles spawning a background event loop thread for the session, while serving BGR video frames and state to the main thread safely.
import cv2
import os
from trugen import TruGenClient, TruGenRunner

# 1. Define how to connect to the session
async def create_session():
    client = TruGenClient(api_key=os.getenv("TRUGEN_API_KEY", ""))
    session = await client.create_session(agent_id=os.getenv("TRUGEN_AGENT_ID", ""))
    await session.connect()
    await session.enable_audio_output()  # Speaker + AEC in one call
    return session

# 2. Initialize the runner
runner = TruGenRunner(session_factory=create_session)

# 3. Handle incoming frames
@runner.on_frame
def show_frame(frame):
    if frame is not None:
        cv2.imshow("TruGen Avatar", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        runner.stop()

# 4. Start rendering loop (blocks main thread)
if __name__ == "__main__":
    runner.run()
    cv2.destroyAllWindows()
Features
- 🎥 Real-time Audio/Video Streaming - Receive synchronized high-quality audio and video frames directly from the avatar.
- 🔇 Built-in Acoustic Echo Cancellation (AEC) - Automatic APM synchronization via enable_audio_output() prevents the avatar from hearing and responding to its own voice.
- 🟩 BGR OpenCV Frames - Zero-boilerplate async iterator video_frames_bgr() yielding pre-converted NumPy arrays ready for OpenCV.
- ⚙️ GUI Runner Support - Thread-safe TruGenRunner wrapper solves blocking rendering loops in OpenCV, Pygame, PyQt/PySide, or custom game engines.
- 🎵 Custom Audio Injection - Programmatically inject WAV files (upload_audio()) or raw 16-bit PCM bytes (send_audio()) directly into the room.
- 🎙️ Microphone Lifecycle Management - Built-in utilities for muting/unmuting the mic and monitoring mic permissions (pending/granted/denied).
- 💬 Real-time Captions - Event hooks to handle caption and transcript updates as they arrive.
- 📝 Clean Transcripts - Distinguish between user transcripts (user.transcription_received) and agent utterances (agent.transcription_final) for logging.
- 📡 Async Iterator API - Stream raw audio (AudioFrame) and video (VideoFrame) natively via Python async generators.
- 🎯 Event-Driven Architecture - Decorator-based event handlers for connection, tracks, speaking states, and transcriptions.
- 📝 Fully Typed - Complete type hints for IDE autocompletion and safety.
API Reference
TruGenClient
The entry point for starting TruGen AI sessions.
from trugen import TruGenClient
# Initialize with your API key
client = TruGenClient(api_key="your-api-key")
# Create a session with your Agent ID
session = await client.create_session(agent_id="your-agent-id")
TruGenSession
Represents an active connection to a streaming room.
Connection
- await session.connect(): Connect to the streaming room and publish the local microphone.
- await session.disconnect(): Disconnect and cleanly release all hardware/stream resources.
Audio Output
- await session.enable_audio_output(): Activates speaker playback with built-in echo cancellation. Call this once after connect().
Video & Audio Generators
- session.video_frames_bgr(): Async generator yielding NumPy arrays (NDArray) in BGR format, ready for OpenCV.
- session.video_frames(): Async generator yielding raw LiveKit VideoFrame objects.
- session.audio_frames(): Async generator yielding raw LiveKit AudioFrame objects.
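As a rough sketch of how these generators might be consumed, the helper below pulls a bounded number of items from any async iterator and then stops. The name take is hypothetical and not part of the SDK, and fake_frames is a stand-in for session.video_frames_bgr() so the snippet runs without a live session.

```python
import asyncio
from typing import AsyncIterator, List, TypeVar

T = TypeVar("T")


async def take(frames: AsyncIterator[T], limit: int) -> List[T]:
    """Collect at most `limit` items from an async iterator, then stop."""
    collected: List[T] = []
    if limit <= 0:
        return collected
    async for frame in frames:
        collected.append(frame)
        if len(collected) >= limit:
            break
    return collected


async def fake_frames() -> AsyncIterator[int]:
    """Stand-in for session.video_frames_bgr(); yields integers, not images."""
    n = 0
    while True:
        yield n
        n += 1
        await asyncio.sleep(0)  # give other tasks a chance to run


# With a connected session this would presumably look like (assumption):
#     frames = await take(session.video_frames_bgr(), 30)
sample = asyncio.run(take(fake_frames(), 3))
```

Bounding the loop this way can be handy for smoke tests or for grabbing a single thumbnail frame without keeping the stream open.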
Microphone Control
- session.mute_input_audio(): Mutes the local microphone.
- session.unmute_input_audio(): Unmutes the local microphone.
- session.is_input_muted(): Returns True if the microphone is muted.
- session.get_input_audio_state(): Returns an InputAudioState object containing mute status and mic permission status (pending, granted, denied).
- await session.start_mic(): Connects and publishes the microphone track.
- await session.stop_mic(): Stops capturing and unpublishes the microphone track.
Custom Audio Injection
- await session.upload_audio(file_path): Streams a PCM WAV file into the room.
- await session.send_audio(data, sample_rate=48000, num_channels=1): Injects raw 16-bit PCM bytes into the audio stream.
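To illustrate what "raw 16-bit PCM bytes" can look like, here is a minimal sketch that synthesizes a mono sine tone with the standard library only. The helper name make_tone_pcm16 is hypothetical, and little-endian signed samples are an assumption; the SDK documentation above only states that the data is 16-bit PCM.

```python
import math
import struct


def make_tone_pcm16(freq_hz: float, duration_s: float,
                    sample_rate: int = 48000, amplitude: float = 0.3) -> bytes:
    """Return mono, signed 16-bit little-endian PCM bytes for a sine tone."""
    num_samples = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)  # scale to the int16 range
    samples = (
        int(peak * math.sin(2.0 * math.pi * freq_hz * i / sample_rate))
        for i in range(num_samples)
    )
    return struct.pack(f"<{num_samples}h", *samples)


tone = make_tone_pcm16(440.0, 0.5)  # half a second of A4
# Assumed usage with a connected session (byte order is an assumption):
#     await session.send_audio(tone, sample_rate=48000, num_channels=1)
```

Each sample occupies two bytes, so half a second at 48 kHz mono comes to 48,000 bytes.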
Low-Level Accessors
- session.get_video_track(): Returns the remote RemoteVideoTrack object (or None).
- session.get_audio_track(): Returns the remote RemoteAudioTrack object (or None).
- session.room: Returns the underlying livekit.rtc.Room instance for advanced operations.
TruGenRunner
Handles multi-threading to run the async session event loop on a background thread while feeding events safely to the main rendering thread.
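The general pattern described here can be sketched with the standard library alone. This is not TruGenRunner's actual implementation, only an illustration of the idea: an asyncio event loop runs on a background thread and hands items to the main thread through a thread-safe queue. The names produce and run_in_background are hypothetical.

```python
import asyncio
import queue
import threading


async def produce(out: "queue.Queue[object]", count: int) -> None:
    """Stand-in for an async session that emits frames."""
    for i in range(count):
        await asyncio.sleep(0.001)
        out.put(i)
    out.put(None)  # sentinel: no more frames


def run_in_background(out: "queue.Queue[object]", count: int) -> threading.Thread:
    """Run the async producer on its own event loop in a daemon thread."""
    thread = threading.Thread(
        target=lambda: asyncio.run(produce(out, count)), daemon=True
    )
    thread.start()
    return thread


frames: "queue.Queue[object]" = queue.Queue()
worker = run_in_background(frames, 5)

received = []
while True:  # the main-thread loop, where rendering would normally happen
    item = frames.get(timeout=5)
    if item is None:
        break
    received.append(item)
worker.join(timeout=5)
```

Keeping the blocking loop on the main thread matters because GUI toolkits such as OpenCV's HighGUI generally expect window calls to come from that thread.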
Controls
- runner.run(): Starts the runner and blocks the main thread to run the rendering loop.
- runner.stop(): Safely stops the background loop and disconnects the session (thread-safe).
- runner.toggle_mute(): Toggles the microphone mute state (thread-safe).
Properties & Accessors
- runner.mic_muted: Returns True if the microphone is currently muted.
- runner.session_state: Returns the current session state enum (TruGenState).
- runner.session: Access the active TruGenSession instance (returns None until connected).
- runner.get_caption(): Returns a tuple (text, timestamp) containing the last received caption chunk and the monotonic timestamp it arrived.
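Because get_caption() pairs the text with a monotonic timestamp, an overlay can hide captions once they get old. The sketch below shows one way to do that; visible_caption and the 3-second default are hypothetical choices, and it assumes the timestamp comes from the same clock as time.monotonic().

```python
import time
from typing import Optional, Tuple


def visible_caption(caption: Tuple[str, float], now: Optional[float] = None,
                    ttl_s: float = 3.0) -> str:
    """Return the caption text while it is younger than ttl_s, else ''."""
    text, arrived_at = caption
    if now is None:
        now = time.monotonic()
    return text if text and (now - arrived_at) <= ttl_s else ""


# Assumed usage inside a frame callback:
#     overlay = visible_caption(runner.get_caption())
fresh = visible_caption(("Hello there", 100.0), now=101.5)
stale = visible_caption(("Hello there", 100.0), now=110.0)
```

Passing now explicitly keeps the helper easy to test; in a render loop it can simply be omitted.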
Event Decorators
You can register event handlers using decorators on the TruGenRunner (for UI/main-thread callbacks) or directly on the TruGenSession (for low-level async callbacks).
1. Runner Decorators (TruGenRunner)
- @runner.on_frame: Receives BGR video frames (NumPy arrays) or None on the main thread.

    @runner.on_frame
    def on_frame(frame):
        if frame is not None:
            cv2.imshow("Avatar", frame)

- @runner.on_caption: Receives real-time streaming caption chunks (ideal for UI overlays).

    @runner.on_caption
    def on_caption(text: str):
        # Fired for each caption chunk as it arrives
        pass

- @runner.on_state: Called when the session's connection state transitions.

    @runner.on_state
    def on_state(state: TruGenState):
        print(f"Status: {state.value}")

- @runner.on_event: Handles any standard TruGenEvent enum or custom string event.

    # Log final complete transcripts
    @runner.on_event("user.transcription_received")
    def on_user_transcript(text: str):
        print(f"[User] {text}")

    @runner.on_event("agent.transcription_final")
    def on_agent_transcript(text: str):
        print(f"[Agent] {text}")
2. Session Decorators (TruGenSession)
If you are not using TruGenRunner, you can listen to events directly on the TruGenSession using the @session.on() decorator:
# Log final complete transcripts directly from the session
@session.on("user.transcription_received")
def on_user_speech(text: str):
    print(f"[User] {text}")

@session.on("agent.transcription_final")
def on_agent_speech(text: str):
    print(f"[Agent] {text}")

# Handle speaking state changes
@session.on(TruGenEvent.AGENT_SPEAKING_STARTED)
def agent_speech_start():
    print("Agent started speaking...")
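If the transcripts should be kept rather than printed, the two handlers can feed a small in-memory log. The following is a minimal sketch; TranscriptLog is a hypothetical helper, not part of the SDK, and the registration lines are left as comments because they assume that calling the result of session.on(...) on a bound method registers it the same way the decorator form does.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TranscriptLog:
    """Accumulates (speaker, text) pairs in arrival order."""
    entries: List[Tuple[str, str]] = field(default_factory=list)

    def on_user(self, text: str) -> None:
        self.entries.append(("user", text))

    def on_agent(self, text: str) -> None:
        self.entries.append(("agent", text))

    def render(self) -> str:
        """Format the conversation one utterance per line."""
        return "\n".join(f"[{who.capitalize()}] {text}" for who, text in self.entries)


log = TranscriptLog()
# Assumed registration with a live session:
#     session.on("user.transcription_received")(log.on_user)
#     session.on("agent.transcription_final")(log.on_agent)
log.on_user("What's the weather?")
log.on_agent("Sunny, 22 degrees.")
```

Calling log.render() afterwards returns the conversation in the same [User] / [Agent] format used by the handlers above.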
Events (TruGenEvent)
Register listener callbacks directly on a TruGenSession or a TruGenRunner using @session.on() or @runner.on_event().
| Event Enum / String | Fired When | Callback Arguments |
|---|---|---|
| TruGenEvent.STATE_CHANGED | The session state changes | state: TruGenState |
| TruGenEvent.CONNECTION_ESTABLISHED | Successfully connected to room | None |
| TruGenEvent.CONNECTION_CLOSED | Session room disconnected | reason: DisconnectReason |
| TruGenEvent.VIDEO_STREAM_STARTED | Remote video track subscribed | track: RemoteVideoTrack |
| TruGenEvent.AUDIO_STREAM_STARTED | Remote audio track subscribed | track: RemoteAudioTrack |
| TruGenEvent.INPUT_AUDIO_STREAM_STARTED | Local mic audio stream begins publishing | None |
| TruGenEvent.AGENT_SPEAKING_STARTED | Agent starts speaking | None |
| TruGenEvent.AGENT_SPEAKING_ENDED | Agent stops speaking | None |
| TruGenEvent.USER_SPEECH_STARTED | User starts speaking | None |
| TruGenEvent.USER_SPEECH_ENDED | User stops speaking | None |
| TruGenEvent.TEXT_CHUNK_RECEIVED | Caption/Text chunk received | text: str |
| TruGenEvent.MIC_PERMISSION_PENDING | Mic permission request is pending | None |
| TruGenEvent.MIC_PERMISSION_GRANTED | Mic permission has been granted | None |
| TruGenEvent.MIC_PERMISSION_DENIED | Mic permission has been denied | None |
| "user.transcription_received" | User completes a final utterance | text: str |
| "agent.transcription_final" | Agent completes a final utterance | text: str |
| "connection.reconnecting" | Transient network reconnection starts | None |
| "connection.reconnected" | Network reconnection completes | None |
| "connection.quality_changed" | Participant connection quality changes | participant, quality |
| TruGenEvent.ERROR | A session or connection error occurs | error: Exception |
Session States (TruGenState)
| Enum Value | Description |
|---|---|
| TruGenState.INITIALIZING | Session created but not yet connected |
| TruGenState.CONNECTING | WebRTC handshake and connection in progress |
| TruGenState.CONNECTED | Connection established; actively streaming media |
| TruGenState.DISCONNECTED | Session ended and connection closed |
| TruGenState.ERROR | Unrecoverable error occurred |
Detailed Examples
Interactive Session with GUI & WAV Injection
For a fully-featured interactive application showing:
- Real-time video window and connection status.
- Mic mute controls and speaking indicator.
- Floating caption overlay.
- WAV audio injection support (press A to send a local WAV file to the agent).
See the built-in examples in the directory:
- Basic GUI Viewer - Simple viewer containing status bar, mic indicators, and floating captions.
- Advanced GUI Viewer - Full feature demonstration including WAV audio injection, reconnect handling, and complete transcripts.
Configuration
Set the API authentication credentials in a .env file or export them directly in your environment:
export TRUGEN_API_KEY="your-api-key"
export TRUGEN_AGENT_ID="your-agent-id"
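A missing variable is easier to diagnose if it is caught before the client is created. The sketch below reads both values and fails early with a clear message; load_trugen_config is a hypothetical helper, and it does not parse a .env file itself, so it assumes the variables have already been loaded into the environment.

```python
import os
from typing import Mapping, Tuple


def load_trugen_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (api_key, agent_id), raising if either variable is unset or blank."""
    missing = [
        name for name in ("TRUGEN_API_KEY", "TRUGEN_AGENT_ID")
        if not env.get(name, "").strip()
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["TRUGEN_API_KEY"].strip(), env["TRUGEN_AGENT_ID"].strip()


# A plain dict is passed here so the example runs without real credentials.
api_key, agent_id = load_trugen_config(
    {"TRUGEN_API_KEY": "your-api-key", "TRUGEN_AGENT_ID": "your-agent-id"}
)
```

In an application, calling load_trugen_config() with no arguments reads from the process environment.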
Error Handling
Handle exceptions using standard try/except blocks around create_session and connection logic:
import asyncio
from trugen import TruGenClient

client = TruGenClient(api_key="invalid-key")

async def main():
    try:
        session = await client.create_session(agent_id="my-agent")
        await session.connect()
    except RuntimeError as e:
        print(f"Connection failed: {e}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

asyncio.run(main())
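For transient failures, the same try/except can be wrapped in a retry loop with exponential backoff. This is a sketch under assumptions: with_retries is a hypothetical helper, it treats RuntimeError as retryable because the example above catches it for connection failures, and flaky_connect is a stand-in so the snippet runs without the SDK.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(factory: Callable[[], Awaitable[T]], attempts: int = 3,
                       base_delay_s: float = 0.5) -> T:
    """Call factory() until it succeeds, doubling the delay after each RuntimeError."""
    for attempt in range(1, attempts + 1):
        try:
            return await factory()
        except RuntimeError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


async def flaky_connect() -> str:
    """Stand-in for create_session + connect; fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated connection failure")
    return "connected"


result = asyncio.run(with_retries(flaky_connect, attempts=3, base_delay_s=0.01))
```

An async factory such as the create_session function from the Quick Start could be passed in place of flaky_connect.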
Requirements
- Python 3.10+
- Core Dependencies:
  - livekit (>=0.11.0)
  - aiohttp (>=3.8.0)
- Optional Dependencies ([display]):
  - opencv-python (>=4.8.0)
  - sounddevice (>=0.4.6)
  - numpy (>=1.24.0)
License
MIT License - see LICENSE for details.