Anam AI Python SDK

Official Python SDK for Anam AI - Real-time AI avatar streaming.

Installation

# Using uv (recommended)
uv add anam-ai

# With optional display utilities (for testing)
uv add anam-ai --extra display

# Using pip
pip install anam-ai

# With optional display utilities (for testing)
pip install anam-ai[display]

Quick Start

import asyncio
from anam import AnamClient
from av.video.frame import VideoFrame
from av.audio.frame import AudioFrame

async def main():
    # Create client with your API key and persona
    client = AnamClient(
        api_key="your-api-key",
        persona_id="your-persona-id",
    )

    # Connect and stream
    async with client.connect() as session:
        print(f"Connected! Session: {session.session_id}")
        
        # Consume video and audio frames concurrently
        async def consume_video():
            async for frame in session.video_frames():
                img = frame.to_ndarray(format="rgb24")  # numpy array (H, W, 3) in RGB format - use "bgr24" for OpenCV
                print(f"Video: {frame.width}x{frame.height}")
        
        async def consume_audio():
            async for frame in session.audio_frames():
                samples = frame.to_ndarray()  # int16 samples (1D array, interleaved for stereo)
                # Determine mono/stereo from frame layout
                channel_type = "mono" if frame.layout.nb_channels == 1 else "stereo"
                print(f"Audio: {samples.size} samples ({channel_type}) @ {frame.sample_rate}Hz")
        
        # Run both streams concurrently until session closes
        await asyncio.gather(
            consume_video(),
            consume_audio(),
        )

asyncio.run(main())
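
The audio comment above notes that stereo samples arrive interleaved. As a minimal sketch of what that layout means, the snippet below splits an interleaved int16 buffer into left and right channels using only the standard library. The helper name split_stereo and the sample values are illustrative and not part of the SDK.

```python
import array

def split_stereo(interleaved: array.array) -> tuple[array.array, array.array]:
    """Split interleaved [L0, R0, L1, R1, ...] samples into left and right channels."""
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data must have an even number of samples")
    return interleaved[0::2], interleaved[1::2]

# Three stereo sample pairs of int16 data ("h" is a signed 16-bit type code)
samples = array.array("h", [10, -10, 20, -20, 30, -30])
left, right = split_stereo(samples)
print(left.tolist(), right.tolist())  # [10, 20, 30] [-10, -20, -30]
```

With NumPy, samples.reshape(-1, 2) gives the same grouping, with one row per left/right sample pair.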

Features

  • 🎥 Real-time Audio/Video streaming - Receive synchronized audio/video frames from the avatar (as PyAV AudioFrame/VideoFrame objects)
  • 💬 Two-way communication - Send text messages (like transcribed user speech) and receive generated responses
  • 🎤 Audio passthrough - Send TTS-generated audio as input and receive the rendered avatar with synchronized audio and video
  • 🗣️ Direct text-to-speech - Send text directly to TTS for immediate speech output (bypasses LLM processing)
  • 🎯 Async iterator API - Clean, Pythonic async/await patterns for continuous streams of audio/video frames
  • 🎯 Event-driven API - Simple decorator-based event handlers for discrete events
  • 📝 Fully typed - Complete type hints for IDE support
  • 🔒 Server-side ready - Designed for server-side Python applications (e.g. for use in a web application)

API Reference

AnamClient

The main client class for connecting to Anam AI.

from anam import AnamClient, PersonaConfig, ClientOptions

# Simple initialization
client = AnamClient(
    api_key="your-api-key",
    persona_id="your-persona-id",
)

# Advanced initialization with full persona config
client = AnamClient(
    api_key="your-api-key",
    persona_config=PersonaConfig(
        persona_id="your-persona-id",
        name="My Assistant",
        system_prompt="You are a helpful assistant...",
        voice_id="emma",
        language_code="en",
    ),
    options=ClientOptions(
        disable_input_audio=True,  # Don't capture microphone
    ),
)

Video and Audio Frames

Frames are PyAV objects (VideoFrame/AudioFrame) containing synchronized decoded audio (PCM) and video (RGB) samples from the avatar, delivered over WebRTC and extracted by aiortc. All PyAV frame attributes are accessible (samples, format, layout, etc.). Access the frames via async iterators and run both iterators concurrently, e.g. using asyncio.gather():

async with client.connect() as session:
    async def process_video():
        async for frame in session.video_frames():
            img = frame.to_ndarray(format="rgb24")  # RGB numpy array
            # Process frame...
    
    async def process_audio():
        async for frame in session.audio_frames():
            samples = frame.to_ndarray()  # int16 samples
            # Process frame...
    
    # Both streams run concurrently
    await asyncio.gather(process_video(), process_audio())
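
To illustrate how two async iterators are drained concurrently, the sketch below replaces the session with two stand-in async generators. fake_video_frames, fake_audio_frames and drain are hypothetical names, not SDK functions, and the intervals are arbitrary.

```python
import asyncio

async def fake_video_frames(count: int, interval: float):
    """Hypothetical stand-in for session.video_frames()."""
    for i in range(count):
        await asyncio.sleep(interval)
        yield f"video-{i}"

async def fake_audio_frames(count: int, interval: float):
    """Hypothetical stand-in for session.audio_frames()."""
    for i in range(count):
        await asyncio.sleep(interval)
        yield f"audio-{i}"

async def drain(stream, log: list[str]) -> int:
    """Consume one async iterator to exhaustion, recording each item."""
    n = 0
    async for item in stream:
        log.append(item)
        n += 1
    return n

async def run_both() -> tuple[list[str], list[int]]:
    log: list[str] = []
    # gather() schedules both consumers; each runs while the other is waiting
    counts = await asyncio.gather(
        drain(fake_video_frames(3, 0.02), log),
        drain(fake_audio_frames(6, 0.01), log),
    )
    return log, list(counts)

log, counts = asyncio.run(run_both())
print(counts)  # [3, 6]
```

The shared log ends up with audio and video items mixed together, which is the behaviour to expect when both streams are consumed inside one asyncio.gather() call. Awaiting one consumer to completion before starting the other would leave the second stream unread in the meantime.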

Events

Register callbacks for connection and message events using the @client.on() decorator:

from anam import AnamEvent, Message  # assumes Message is exported from the top-level package

@client.on(AnamEvent.MESSAGE_RECEIVED)
async def on_message(message: Message):
    """Called when a chat message is received."""
    print(f"{message.role}: {message.content}")

@client.on(AnamEvent.CONNECTION_ESTABLISHED)
async def on_connected():
    """Called when the connection is established."""
    pass

@client.on(AnamEvent.CONNECTION_CLOSED)
async def on_closed(code: str, reason: str | None):
    """Called when the connection is closed."""
    pass
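
For readers unfamiliar with decorator-based registration, here is a minimal sketch of how such a pattern can work in general. MiniEmitter and DemoEvent are hypothetical stand-ins written for this illustration; they do not show how the SDK implements @client.on().

```python
import asyncio
from collections import defaultdict
from enum import Enum
from typing import Any, Awaitable, Callable

class DemoEvent(Enum):
    """Hypothetical event names, standing in for AnamEvent."""
    MESSAGE_RECEIVED = "message_received"
    CONNECTION_CLOSED = "connection_closed"

Handler = Callable[..., Awaitable[None]]

class MiniEmitter:
    """Illustrative registry; not the SDK's actual implementation."""

    def __init__(self) -> None:
        self._handlers: dict[DemoEvent, list[Handler]] = defaultdict(list)

    def on(self, event: DemoEvent) -> Callable[[Handler], Handler]:
        def register(handler: Handler) -> Handler:
            self._handlers[event].append(handler)
            return handler  # return the function unchanged so it stays callable
        return register

    async def emit(self, event: DemoEvent, *args: Any) -> None:
        # Call every handler registered for this event, in registration order
        for handler in self._handlers[event]:
            await handler(*args)

emitter = MiniEmitter()
received: list[str] = []

@emitter.on(DemoEvent.MESSAGE_RECEIVED)
async def on_message(text: str) -> None:
    received.append(text)

asyncio.run(emitter.emit(DemoEvent.MESSAGE_RECEIVED, "hello"))
print(received)  # ['hello']
```

The decorator only stores the coroutine function. Nothing runs until the matching event is emitted.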

Session

The Session object is returned by client.connect() and provides methods for interacting with the avatar:

async with client.connect() as session:
    # Send a text message (simulates user speech)
    await session.send_message("Hello, how are you?")
    
    # Interrupt the avatar if speaking
    await session.interrupt()
    
    # Wait until the session ends
    await session.wait_until_closed()

Examples

Save Video and Audio

import cv2
import wave
import asyncio
from anam import AnamClient

client = AnamClient(api_key="...", persona_id="...")

video_writer = cv2.VideoWriter("output.mp4", ...)
audio_writer = wave.open("output.wav", "wb")

async def save_video(session):
    async for frame in session.video_frames():
        # Read frame as BGR for OpenCV VideoWriter
        bgr_frame = frame.to_ndarray(format="bgr24")
        video_writer.write(bgr_frame)

async def save_audio(session):
    async for frame in session.audio_frames():
        # Initialize writer on first frame
        if audio_writer.getnframes() == 0:
            audio_writer.setnchannels(frame.layout.nb_channels)
            audio_writer.setsampwidth(2)  # 16-bit
            audio_writer.setframerate(frame.sample_rate)
        # Write audio data (frame samples are already int16; take the raw bytes)
        audio_writer.writeframes(frame.to_ndarray().tobytes())

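async def main():
    async with client.connect() as session:
        try:
            # Record for 30 seconds; wait_for raises TimeoutError when time is up
            await asyncio.wait_for(
                asyncio.gather(save_video(session), save_audio(session)),
                timeout=30.0,
            )
        except asyncio.TimeoutError:
            pass  # Expected: the recording window has elapsed
        finally:
            # Finalize both output files
            video_writer.release()
            audio_writer.close()

asyncio.run(main())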
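
The audio half of this example can be tried without a live session. The sketch below writes int16 PCM chunks to an in-memory WAV file with the standard wave module and reads the header back. The helper write_pcm16_wav, the 48000 Hz rate and the silent stereo chunks are stand-in assumptions, not values taken from the SDK.

```python
import array
import io
import wave

def write_pcm16_wav(chunks, channels: int, sample_rate: int) -> bytes:
    """Write int16 PCM chunks (interleaved if stereo) to an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        for chunk in chunks:
            writer.writeframes(chunk)
    return buffer.getvalue()

# Two stand-in chunks of 480 stereo sample pairs each (silence), as raw bytes
chunk = array.array("h", [0] * (480 * 2)).tobytes()
wav_bytes = write_pcm16_wav([chunk, chunk], channels=2, sample_rate=48000)

with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
    print(reader.getnchannels(), reader.getframerate(), reader.getnframes())
    # 2 48000 960
```

A WAV frame here means one sample per channel, so 960 stereo frames correspond to 1920 int16 values.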

Display Video with OpenCV

import cv2
import asyncio
from anam import AnamClient

client = AnamClient(api_key="...", persona_id="...")
latest_frame = None

async def update_frame(session):
    global latest_frame
    async for frame in session.video_frames():
        # Read frame as BGR for OpenCV display
        latest_frame = frame.to_ndarray(format="bgr24")

async def main():
    async with client.connect() as session:
        # Start frame consumer
        frame_task = asyncio.create_task(update_frame(session))
        
        # Display loop
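        while True:
            if latest_frame is not None:
                cv2.imshow("Avatar", latest_frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
            # Yield to the event loop so update_frame() can receive new frames
            await asyncio.sleep(0.01)
        frame_task.cancel()
        cv2.destroyAllWindows()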

asyncio.run(main())
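
A polling loop like the display loop shares the event loop with the frame consumer task, so it only sees new frames if it awaits something on each pass. The sketch below shows this with a stand-in producer. fake_frames, update_latest and poll are hypothetical names and no SDK or OpenCV calls are involved.

```python
import asyncio

latest = None  # most recent item stored by the consumer task

async def fake_frames(count: int):
    """Hypothetical stand-in for session.video_frames()."""
    for i in range(count):
        await asyncio.sleep(0.005)
        yield i

async def update_latest(count: int) -> None:
    global latest
    async for frame in fake_frames(count):
        latest = frame

async def poll(count: int) -> list[int]:
    seen: list[int] = []
    task = asyncio.create_task(update_latest(count))
    while True:
        # Record the newest item if it changed since the last pass
        if latest is not None and (not seen or seen[-1] != latest):
            seen.append(latest)
        if task.done():
            break
        # Without this await the loop never gives the consumer task a turn
        await asyncio.sleep(0.001)
    return seen

seen = asyncio.run(poll(5))
print(seen[-1])  # 4
```

Removing the asyncio.sleep() call would make the while loop spin forever, because the consumer task would never be scheduled and task.done() would stay False.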

Configuration

Environment Variables

export ANAM_API_KEY="your-api-key"
export ANAM_PERSONA_ID="your-persona-id"
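
The source does not state whether the SDK reads these variables on its own. Assuming they need to be passed explicitly, a small helper such as the hypothetical load_anam_settings below can read them and fail early when one is missing.

```python
import os

def load_anam_settings(env=os.environ) -> dict[str, str]:
    """Read the two variables above and fail early if either is missing."""
    required = ("ANAM_API_KEY", "ANAM_PERSONA_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"api_key": env["ANAM_API_KEY"], "persona_id": env["ANAM_PERSONA_ID"]}

# Example with an explicit mapping instead of the real environment
settings = load_anam_settings({"ANAM_API_KEY": "key-123", "ANAM_PERSONA_ID": "persona-456"})
print(settings)  # {'api_key': 'key-123', 'persona_id': 'persona-456'}
```

The returned keys match the keyword arguments used in the Quick Start, so the result could be passed as AnamClient(**load_anam_settings()).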

Client Options

from anam import ClientOptions

options = ClientOptions(
    api_base_url="https://api.anam.ai",  # API base URL
    api_version="v1",                     # API version
    disable_input_audio=False,            # Disable microphone input
    ice_servers=None,                     # Custom ICE servers
)

Persona Configuration

from anam import PersonaConfig

persona = PersonaConfig(
    persona_id="your-persona-id",    # Required
    name="Assistant",                 # Display name
    avatar_id="anna_v2",             # Avatar to use
    voice_id="emma",                 # Voice to use
    system_prompt="You are...",      # Custom system prompt
    language_code="en",              # Language code
    llm_id="gpt-4",                  # LLM model
    max_session_length_seconds=300,  # Max session duration
)

Error Handling

from anam import AnamError, AuthenticationError, SessionError

try:
    async with client.connect() as session:
        await session.wait_until_closed()
except AuthenticationError as e:
    print(f"Invalid API key: {e}")
except SessionError as e:
    print(f"Session error: {e}")
except AnamError as e:
    print(f"Anam error [{e.code}]: {e.message}")
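
The order of the except clauses matters if AuthenticationError and SessionError are subclasses of AnamError. The source does not state this, although the ordering above suggests it. The sketch below uses hypothetical Demo classes to show why the specific handlers must come before the base class.

```python
class DemoAnamError(Exception):
    """Hypothetical base error carrying a code and a message."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code
        self.message = message

class DemoAuthenticationError(DemoAnamError):
    pass

class DemoSessionError(DemoAnamError):
    pass

def classify(error: Exception) -> str:
    """Mirror the except ordering: specific subclasses first, base class last."""
    try:
        raise error
    except DemoAuthenticationError:
        return "auth"
    except DemoSessionError:
        return "session"
    except DemoAnamError as e:
        return f"other[{e.code}]"

print(classify(DemoAuthenticationError("401", "bad key")))  # auth
print(classify(DemoAnamError("500", "server error")))       # other[500]
```

If the base class were listed first, it would catch every subclass and the more specific handlers would never run.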

Requirements

  • Python 3.10+
  • Dependencies are installed automatically:
    • aiortc - WebRTC implementation
    • aiohttp - HTTP client
    • websockets - WebSocket client
    • numpy - Array handling

Optional for display utilities:

  • opencv-python - Video display
  • sounddevice - Audio playback

License

MIT License - see LICENSE for details.
