TruGen AI Python SDK

Official Python SDK for TruGen AI - Real-time AI avatar streaming.

All WebRTC and audio/video processing complexity (LiveKit, Acoustic Echo Cancellation, decoding, threading) is handled under the hood. You only ever need to import from trugen.

Installation

# Using uv (recommended)
uv add trugen-sdk

# With optional display utilities (for OpenCV and audio playback testing)
uv add trugen-sdk --extra display

# Using pip
pip install trugen-sdk

# With optional display utilities (for OpenCV and audio playback testing)
# (quoted so that shells such as zsh do not expand the brackets)
pip install "trugen-sdk[display]"
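To confirm that the optional display extras are importable before opening a window, a small standard-library check along these lines may help. It is a sketch and not part of the SDK: `missing_display_modules` and `DISPLAY_MODULES` are hypothetical names, and the list assumes the import names `cv2`, `sounddevice`, and `numpy` for the packages listed under Requirements.

```python
from importlib.util import find_spec

# Import names of the optional [display] dependencies (assumed).
# Note that the opencv-python package is imported as "cv2".
DISPLAY_MODULES = ("cv2", "sounddevice", "numpy")


def missing_display_modules(modules=DISPLAY_MODULES):
    """Return the import names from `modules` that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_display_modules()
    if missing:
        print("Missing optional modules:", ", ".join(missing))
    else:
        print("All display extras are importable.")
```

`find_spec` only locates a top-level module without importing it, so the check is cheap and has no side effects on audio or video devices.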

Quick Start

Simple OpenCV Video Display (Using TruGenRunner)

For most UI/desktop applications, TruGenRunner spawns a background event-loop thread for the session and safely hands BGR video frames and state updates to the main thread.

import cv2
import os
from trugen import TruGenClient, TruGenRunner

# 1. Define how to connect to the session
async def create_session():
    client = TruGenClient(api_key=os.getenv("TRUGEN_API_KEY", ""))
    session = await client.create_session(agent_id=os.getenv("TRUGEN_AGENT_ID", ""))
    await session.connect()
    await session.enable_audio_output()  # Speaker + AEC in one call
    return session

# 2. Initialize the runner
runner = TruGenRunner(session_factory=create_session)

# 3. Handle incoming frames
@runner.on_frame
def show_frame(frame):
    if frame is not None:
        cv2.imshow("TruGen Avatar", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        runner.stop()

# 4. Start rendering loop (blocks main thread)
if __name__ == "__main__":
    runner.run()
    cv2.destroyAllWindows()

Features

  • 🎥 Real-time Audio/Video Streaming - Receive synchronized high-quality audio and video frames directly from the avatar.
  • 🔇 Built-in Acoustic Echo Cancellation (AEC) - Automatic APM synchronization via enable_audio_output() prevents the avatar from hearing and responding to its own voice.
  • 🟩 BGR OpenCV Frames - Zero-boilerplate async iterator video_frames_bgr() yielding pre-converted NumPy arrays ready for OpenCV.
  • ⚙️ GUI Runner Support - Thread-safe TruGenRunner wrapper solves blocking rendering loops in OpenCV, Pygame, PyQt/PySide, or custom game engines.
  • 🎵 Custom Audio Injection - Programmatically inject WAV files (upload_audio()) or raw 16-bit PCM bytes (send_audio()) directly into the room.
  • 🎙️ Microphone Lifecycle Management - Built-in utilities for muting/unmuting the mic and monitoring mic permissions (pending/granted/denied).
  • 💬 Real-time Captions - Event hooks to handle caption and transcript updates as they arrive.
  • 📝 Clean Transcripts - Distinguish between user transcripts (user.transcription_received) and agent utterances (agent.transcription_final) for logging.
  • 📡 Async Iterator API - Stream raw audio (AudioFrame) and video (VideoFrame) natively via Python async generators.
  • 🎯 Event-Driven Architecture - Decorator-based event handlers for connection, tracks, speaking states, and transcriptions.
  • 📝 Fully Typed - Complete type hints for IDE autocompletion and safety.

API Reference

TruGenClient

The entry point for starting TruGen AI sessions.

import asyncio
from trugen import TruGenClient

async def main():
    # Initialize with your API key
    client = TruGenClient(api_key="your-api-key")

    # Create a session with your Agent ID (must be awaited inside a coroutine)
    session = await client.create_session(agent_id="your-agent-id")

asyncio.run(main())

TruGenSession

Represents an active connection to a streaming room.

Connection

  • await session.connect(): Connect to the streaming room and publish the local microphone.
  • await session.disconnect(): Disconnect and cleanly release all hardware/stream resources.

Audio Output

  • await session.enable_audio_output(): Activates speaker playback with built-in echo cancellation. Call this once after connect().

Video & Audio Generators

  • session.video_frames_bgr(): Async generator yielding NumPy arrays (NDArray) in BGR format, ready for OpenCV.
  • session.video_frames(): Async generator yielding raw LiveKit VideoFrame objects.
  • session.audio_frames(): Async generator yielding raw LiveKit AudioFrame objects.
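Without TruGenRunner, the generators above are consumed with `async for`. The following is a minimal sketch, assuming `session` is an already connected TruGenSession; `take_frames` is a hypothetical helper, not an SDK function.

```python
async def take_frames(session, limit):
    """Collect up to `limit` BGR frames from session.video_frames_bgr()."""
    frames = []
    if limit <= 0:
        return frames
    async for frame in session.video_frames_bgr():
        frames.append(frame)
        if len(frames) >= limit:
            break  # leaving the loop stops consuming the generator
    return frames

# Usage inside a coroutine, after `await session.connect()`:
#     first_frames = await take_frames(session, 30)
```

The same loop shape should apply to `session.video_frames()` and `session.audio_frames()`, which yield raw LiveKit frame objects instead of NumPy arrays.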

Microphone Control

  • session.mute_input_audio(): Mutes the local microphone.
  • session.unmute_input_audio(): Unmutes the local microphone.
  • session.is_input_muted(): Returns True if the microphone is muted.
  • session.get_input_audio_state(): Returns an InputAudioState object containing mute status and mic permission status (pending, granted, denied).
  • await session.start_mic(): Starts capturing and publishes the microphone track.
  • await session.stop_mic(): Stops capturing and unpublishes the microphone track.
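When working with a TruGenSession directly, the mute calls above can be combined into a toggle. This is a sketch: `toggle_input_mute` is a hypothetical helper, and it assumes the three mute methods are plain synchronous calls, as listed above.

```python
def toggle_input_mute(session):
    """Flip the microphone mute state and return True if it is now muted."""
    if session.is_input_muted():
        session.unmute_input_audio()
    else:
        session.mute_input_audio()
    return session.is_input_muted()
```

With TruGenRunner, `runner.toggle_mute()` is documented as thread-safe and is likely the better choice when calling from a UI thread.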

Custom Audio Injection

  • await session.upload_audio(file_path): Streams a PCM WAV file into the room.
  • await session.send_audio(data, sample_rate=48000, num_channels=1): Injects raw 16-bit PCM bytes into the audio stream.
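To try audio injection without a recording at hand, test audio can be generated with the standard library. The sketch below builds mono 16-bit PCM for a sine tone and optionally writes it to a PCM WAV file. `make_tone_pcm` and `write_wav` are hypothetical helpers, and little-endian sample order is an assumption about what `send_audio()` expects.

```python
import math
import struct
import wave


def make_tone_pcm(freq_hz=440.0, duration_s=0.5, sample_rate=48000, amplitude=0.3):
    """Return mono 16-bit little-endian PCM bytes for a sine tone."""
    n_samples = int(duration_s * sample_rate)
    samples = (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_samples)
    )
    return struct.pack(f"<{n_samples}h", *samples)


def write_wav(path, pcm, sample_rate=48000, num_channels=1):
    """Write raw 16-bit PCM bytes to a PCM WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(num_channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```

Inside a coroutine with a connected session, the bytes could then be passed as `await session.send_audio(pcm, sample_rate=48000, num_channels=1)`, or the file as `await session.upload_audio("tone.wav")`.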

Low-Level Accessors

  • session.get_video_track(): Returns the remote RemoteVideoTrack object (or None).
  • session.get_audio_track(): Returns the remote RemoteAudioTrack object (or None).
  • session.room: Returns the underlying livekit.rtc.Room instance for advanced operations.

TruGenRunner

Handles multi-threading to run the async session event loop on a background thread while feeding events safely to the main rendering thread.
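The general pattern described here can be illustrated with the standard library alone. The sketch below is not TruGenRunner's implementation, which is not shown in this document; it only demonstrates the idea of running an asyncio producer on a worker thread and handling its output on the calling thread through a thread-safe queue. All names are hypothetical.

```python
import asyncio
import queue
import threading


def run_with_background_loop(produce, handle_item):
    """Run async `produce(put)` on a worker thread; call `handle_item` on this thread."""
    items = queue.Queue()
    done = object()  # sentinel marking the end of the stream

    def worker():
        try:
            asyncio.run(produce(items.put))  # event loop lives on the worker thread
        finally:
            items.put(done)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while True:
        item = items.get()  # blocks the calling thread until an item arrives
        if item is done:
            break
        handle_item(item)
    thread.join()


async def fake_frames(put):
    """Stand-in producer that emits three integers instead of video frames."""
    for i in range(3):
        await asyncio.sleep(0.01)
        put(i)


if __name__ == "__main__":
    run_with_background_loop(fake_frames, print)
```

Keeping rendering on the calling thread matters because GUI toolkits such as OpenCV windows generally expect to be driven from the main thread.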

Controls

  • runner.run(): Starts the runner and blocks the main thread to run the rendering loop.
  • runner.stop(): Safely stops the background loop and disconnects the session (thread-safe).
  • runner.toggle_mute(): Toggles the microphone mute state (thread-safe).

Properties & Accessors

  • runner.mic_muted: Returns True if the microphone is currently muted.
  • runner.session_state: Returns the current session state enum (TruGenState).
  • runner.session: Access the active TruGenSession instance (returns None until connected).
  • runner.get_caption(): Returns a tuple (text, timestamp) containing the last received caption chunk and the monotonic timestamp at which it arrived.
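Because the caption timestamp is monotonic, it can be compared against `time.monotonic()` to hide stale captions in an overlay. The helper below is a sketch: `visible_caption` and its `ttl_s` parameter are hypothetical, and it assumes `runner.get_caption()` always returns a two-element tuple.

```python
import time


def visible_caption(caption, ttl_s=3.0, now=None):
    """Return the caption text if it arrived within `ttl_s` seconds, else ""."""
    text, arrived_at = caption
    if now is None:
        now = time.monotonic()
    return text if text and (now - arrived_at) <= ttl_s else ""

# Usage in a frame handler (runner is a TruGenRunner):
#     overlay_text = visible_caption(runner.get_caption())
```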

Event Decorators

You can register event handlers using decorators on the TruGenRunner (for UI/main-thread callbacks) or directly on the TruGenSession (for low-level async callbacks).

1. Runner Decorators (TruGenRunner)
  • @runner.on_frame: Receives BGR video frames (NumPy arrays) or None on the main thread.

    @runner.on_frame
    def on_frame(frame):
        if frame is not None:
            cv2.imshow("Avatar", frame)
    
  • @runner.on_caption: Receives real-time streaming caption chunks (ideal for UI overlays).

    @runner.on_caption
    def on_caption(text: str):
        # Fired for each caption chunk as it arrives
        pass
    
  • @runner.on_state: Called when the session's connection state transitions.

    @runner.on_state
    def on_state(state: TruGenState):
        print(f"Status: {state.value}")
    
  • @runner.on_event: Handles any standard TruGenEvent enum or custom string event.

    # Log final complete transcripts
    @runner.on_event("user.transcription_received")
    def on_user_transcript(text: str):
        print(f"[User]  {text}")
    
    @runner.on_event("agent.transcription_final")
    def on_agent_transcript(text: str):
        print(f"[Agent] {text}")
    
2. Session Decorators (TruGenSession)

If you are not using TruGenRunner, you can listen to events directly on the TruGenSession using the @session.on() decorator:

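# `session` is a TruGenSession; TruGenEvent is assumed to be importable from trugen
from trugen import TruGenEvent

# Log final complete transcripts directly from the session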
@session.on("user.transcription_received")
def on_user_speech(text: str):
    print(f"[User]  {text}")

@session.on("agent.transcription_final")
def on_agent_speech(text: str):
    print(f"[Agent] {text}")

# Handle speaking state changes
@session.on(TruGenEvent.AGENT_SPEAKING_STARTED)
def agent_speech_start():
    print("Agent started speaking...")

Events (TruGenEvent)

Register listener callbacks directly on a TruGenSession or a TruGenRunner using @session.on() or @runner.on_event().

| Event Enum / String | Fired When | Callback Arguments |
| --- | --- | --- |
| TruGenEvent.STATE_CHANGED | The session state changes | state: TruGenState |
| TruGenEvent.CONNECTION_ESTABLISHED | Successfully connected to room | None |
| TruGenEvent.CONNECTION_CLOSED | Session room disconnected | reason: DisconnectReason |
| TruGenEvent.VIDEO_STREAM_STARTED | Remote video track subscribed | track: RemoteVideoTrack |
| TruGenEvent.AUDIO_STREAM_STARTED | Remote audio track subscribed | track: RemoteAudioTrack |
| TruGenEvent.INPUT_AUDIO_STREAM_STARTED | Local mic audio stream begins publishing | None |
| TruGenEvent.AGENT_SPEAKING_STARTED | Agent starts speaking | None |
| TruGenEvent.AGENT_SPEAKING_ENDED | Agent stops speaking | None |
| TruGenEvent.USER_SPEECH_STARTED | User starts speaking | None |
| TruGenEvent.USER_SPEECH_ENDED | User stops speaking | None |
| TruGenEvent.TEXT_CHUNK_RECEIVED | Caption/text chunk received | text: str |
| TruGenEvent.MIC_PERMISSION_PENDING | Mic permission request is pending | None |
| TruGenEvent.MIC_PERMISSION_GRANTED | Mic permission has been granted | None |
| TruGenEvent.MIC_PERMISSION_DENIED | Mic permission has been denied | None |
| "user.transcription_received" | User completes a final utterance | text: str |
| "agent.transcription_final" | Agent completes a final utterance | text: str |
| "connection.reconnecting" | Transient network reconnection starts | None |
| "connection.reconnected" | Network reconnection completes | None |
| "connection.quality_changed" | Participant connection quality changes | participant, quality |
| TruGenEvent.ERROR | A session or connection error occurs | error: Exception |
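The two transcript string events can be combined into a single conversation log. The sketch below registers both handlers through the documented `@session.on()` decorator; `attach_transcript_log` is a hypothetical helper, and it assumes each handler receives a single `text` argument as listed for those events.

```python
def attach_transcript_log(session):
    """Register transcript handlers on `session` and return the list they append to."""
    log = []

    @session.on("user.transcription_received")
    def _on_user(text):
        log.append(("user", text))

    @session.on("agent.transcription_final")
    def _on_agent(text):
        log.append(("agent", text))

    return log

# Usage, after creating a session:
#     transcript = attach_transcript_log(session)
#     ... later: for speaker, text in transcript: print(speaker, text)
```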

Session States (TruGenState)

| Enum Value | Description |
| --- | --- |
| TruGenState.INITIALIZING | Session created but not yet connected |
| TruGenState.CONNECTING | WebRTC handshake and connection in progress |
| TruGenState.CONNECTED | Connection established; actively streaming media |
| TruGenState.DISCONNECTED | Session ended and connection closed |
| TruGenState.ERROR | Unrecoverable error occurred |
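A state handler often only needs to know whether the session is over. The sketch below checks for the two end states by member name, so it does not depend on the enum's underlying values, which this document does not list. `is_terminal` and `TERMINAL_STATE_NAMES` are hypothetical names, and the check assumes TruGenState is a standard Python Enum.

```python
# Names of the TruGenState members that end a session (DISCONNECTED, ERROR).
TERMINAL_STATE_NAMES = frozenset({"DISCONNECTED", "ERROR"})


def is_terminal(state):
    """Return True if `state` (an Enum member) is DISCONNECTED or ERROR."""
    return state.name in TERMINAL_STATE_NAMES

# Usage inside an @runner.on_state handler:
#     if is_terminal(state):
#         runner.stop()
```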

Detailed Examples

Interactive Session with GUI & WAV Injection

The built-in examples demonstrate a fully featured interactive application, including:

  • Real-time video window and connection status.
  • Mic mute controls and speaking indicator.
  • Floating caption overlay.
  • WAV audio injection support (press A to inject a local WAV file for the agent).

See the built-in examples in the directory:

  • Basic GUI Viewer - Simple viewer with a status bar, mic indicators, and floating captions.
  • Advanced GUI Viewer - Full feature demonstration including WAV audio injection, reconnect handling, and complete transcripts.

Configuration

Set the API authentication credentials in a .env file or export them directly in your environment:

export TRUGEN_API_KEY="your-api-key"
export TRUGEN_AGENT_ID="your-agent-id"
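The Quick Start falls back to empty strings when these variables are unset, which postpones the failure until the API call. A fail-fast check such as the following sketch may give a clearer message; `load_trugen_config` is a hypothetical helper and not part of the SDK.

```python
import os


def load_trugen_config(env=None):
    """Return (api_key, agent_id) from the environment, or raise if either is missing."""
    env = os.environ if env is None else env
    names = ("TRUGEN_API_KEY", "TRUGEN_AGENT_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["TRUGEN_API_KEY"], env["TRUGEN_AGENT_ID"]

# Usage:
#     api_key, agent_id = load_trugen_config()
```

The optional `env` argument accepts any mapping, which makes the function easy to exercise in tests without touching the real environment.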

Error Handling

Handle exceptions using standard try/except blocks around create_session and connection logic:

import asyncio
from trugen import TruGenClient

client = TruGenClient(api_key="invalid-key")

async def main():
    try:
        session = await client.create_session(agent_id="my-agent")
        await session.connect()
    except RuntimeError as e:
        print(f"Connection failed: {e}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

asyncio.run(main())
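For transient failures, the same try/except can be wrapped in a retry loop. The sketch below retries an async session factory with exponential backoff. `connect_with_retry` is a hypothetical helper, not an SDK feature, and retrying on `RuntimeError` simply mirrors the exception type caught in the example above.

```python
import asyncio


async def connect_with_retry(create_session, attempts=3, base_delay_s=1.0):
    """Await `create_session()` up to `attempts` times with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await create_session()
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay_s, then 2x, 4x, ... before the next attempt.
            await asyncio.sleep(base_delay_s * 2 ** attempt)

# Usage with the create_session() coroutine function from the Quick Start:
#     session = await connect_with_retry(create_session)
```

An invalid API key will not fix itself, so in practice it may be worth retrying only on errors that are known to be temporary.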

Requirements

  • Python 3.10+
  • Core Dependencies:
    • livekit (>=0.11.0)
    • aiohttp (>=3.8.0)
  • Optional Dependencies ([display]):
    • opencv-python (>=4.8.0)
    • sounddevice (>=0.4.6)
    • numpy (>=1.24.0)

License

MIT License - see LICENSE for details.
