Python SDK for avatar WebSocket services with audio streaming and animation frame reception
Avatar SDK Python
A Python SDK for connecting to avatar services via WebSocket, supporting audio streaming and receiving animation frames.
Quick Start
import asyncio
from datetime import datetime, timedelta, timezone
from avatarkit import new_avatar_session
async def main():
    # Create session
    session = new_avatar_session(
        api_key="your-api-key",
        app_id="your-app-id",
        console_endpoint_url="https://console.us-west.spatialwalk.cloud/v1/console",
        ingress_endpoint_url="wss://api.us-west.spatialwalk.cloud/v2/driveningress",
        avatar_id="your-avatar-id",
        expire_at=datetime.now(timezone.utc) + timedelta(minutes=5),
        transport_frames=lambda frame, last: print(f"Received frame: {len(frame)} bytes"),
        on_error=lambda err: print(f"Error: {err}"),
        on_close=lambda: print("Session closed"),
    )
    # Initialize and connect
    await session.init()
    connection_id = await session.start()
    print(f"Connected: {connection_id}")
    # Send audio
    audio_data = b"..."  # Your PCM audio data
    request_id = await session.send_audio(audio_data, end=True)
    print(f"Sent audio: {request_id}")
    # Wait for frames...
    await asyncio.sleep(10)
    # Close
    await session.close()

if __name__ == "__main__":
    asyncio.run(main())
Detailed Usage
Session Configuration
The SDK provides two ways to configure a session:
Option 1: Using new_avatar_session() (Recommended)
from datetime import datetime, timedelta, timezone
from avatarkit import new_avatar_session
# on_frame_received, on_error, and on_close are the handlers defined under Callbacks.
session = new_avatar_session(
    avatar_id="avatar-123",
    api_key="your-api-key",
    app_id="your-app-id",
    # For web-style auth, set use_query_auth=True to put (appId, sessionKey)
    # in the WebSocket URL query params instead of headers.
    use_query_auth=False,
    expire_at=datetime.now(timezone.utc) + timedelta(minutes=5),
    console_endpoint_url="https://console.us-west.spatialwalk.cloud/v1/console",
    ingress_endpoint_url="wss://api.us-west.spatialwalk.cloud/v2/driveningress",
    sample_rate=16000,  # Default: 16000 Hz
    transport_frames=on_frame_received,
    on_error=on_error,
    on_close=on_close,
)
Option 2: Using Configuration Builder
from datetime import datetime, timedelta, timezone
from avatarkit import SessionConfigBuilder, AvatarSession
config = (
    SessionConfigBuilder()
    .with_avatar_id("avatar-123")
    .with_api_key("your-api-key")
    .with_app_id("your-app-id")
    .with_console_endpoint_url("https://console.us-west.spatialwalk.cloud/v1/console")
    .with_ingress_endpoint_url("wss://api.us-west.spatialwalk.cloud/v2/driveningress")
    .with_expire_at(datetime.now(timezone.utc) + timedelta(minutes=5))
    .with_transport_frames(on_frame_received)
    .build()
)
session = AvatarSession(config)
Session Lifecycle
# 1. Initialize (get session token)
await session.init()
# 2. Start WebSocket connection
connection_id = await session.start()
# 3. Send audio data
request_id = await session.send_audio(audio_bytes, end=True)
# 4. Receive frames via callback
# (automatically handled in background)
# 5. Close session
await session.close()
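The lifecycle above can be wrapped so that close() still runs when a later step fails. The following is a minimal sketch, not part of the SDK: `run_once` and `drain_seconds` are hypothetical names, and it assumes that calling close() after a failed start() or send_audio() is safe.

```python
import asyncio


async def run_once(session, audio: bytes, drain_seconds: float = 0.0) -> str:
    """Drive one init -> start -> send_audio -> close cycle (hypothetical helper).

    `session` is assumed to expose the async methods listed in the lifecycle.
    """
    await session.init()
    try:
        await session.start()
        request_id = await session.send_audio(audio, end=True)
        # Crude wait for frames to arrive, as in the Quick Start example.
        await asyncio.sleep(drain_seconds)
        return request_id
    finally:
        # Runs on success and on failure, so the session is always closed.
        await session.close()
```

Usage would look like `request_id = await run_once(session, audio_bytes, drain_seconds=10)`.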
Audio Format
The SDK currently supports mono 16-bit PCM (s16le) audio:
- Sample Rate: one of [8000, 16000, 22050, 24000, 32000, 44100, 48000]
- Channels: 1 (mono)
- Bit Depth: 16-bit
- Format: Raw PCM bytes
# Example: Load PCM audio file
with open("audio.pcm", "rb") as f:
    audio_data = f.read()
# Send in chunks or all at once
await session.send_audio(audio_data, end=True)
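Sending in chunks could be sketched as follows. `chunk_pcm` and `send_in_chunks` are hypothetical helpers, not SDK functions, and the 100 ms chunk size is an arbitrary choice. The sketch assumes end=True belongs only on the final chunk; whether intermediate calls return the same request ID is not stated here, so it returns the last one.

```python
from typing import Iterator

SUPPORTED_SAMPLE_RATES = (8000, 16000, 22050, 24000, 32000, 44100, 48000)
BYTES_PER_SAMPLE = 2  # mono 16-bit PCM


def chunk_pcm(audio: bytes, sample_rate: int = 16000, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield slices of raw PCM that each hold whole 16-bit samples."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    if len(audio) % BYTES_PER_SAMPLE:
        raise ValueError("PCM payload must contain whole 16-bit samples")
    samples_per_chunk = max(1, sample_rate * chunk_ms // 1000)
    step = samples_per_chunk * BYTES_PER_SAMPLE
    for offset in range(0, len(audio), step):
        yield audio[offset:offset + step]


async def send_in_chunks(session, audio: bytes, sample_rate: int = 16000) -> str:
    """Send every chunk, flagging only the final one with end=True (assumption)."""
    chunks = list(chunk_pcm(audio, sample_rate))
    request_id = ""
    for index, chunk in enumerate(chunks):
        is_final = index == len(chunks) - 1
        request_id = await session.send_audio(chunk, end=is_final)
    return request_id
```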
LiveKit Egress Mode
When configured with livekit_egress, audio and animation data are streamed to a LiveKit room via the egress service instead of being returned through the WebSocket connection.
from datetime import datetime, timedelta, timezone
from avatarkit import new_avatar_session, LiveKitEgressConfig
session = new_avatar_session(
    avatar_id="avatar-123",
    api_key="your-api-key",
    app_id="your-app-id",
    console_endpoint_url="https://console.us-west.spatialwalk.cloud/v1/console",
    ingress_endpoint_url="wss://api.us-west.spatialwalk.cloud/v2/driveningress",
    expire_at=datetime.now(timezone.utc) + timedelta(minutes=5),
    livekit_egress=LiveKitEgressConfig(
        url="wss://livekit.example.com",
        api_key="livekit-api-key",
        api_secret="livekit-api-secret",
        room_name="my-room",
        publisher_id="avatar-publisher",
    ),
)
When LiveKit egress is enabled:
- The server streams output to the specified LiveKit room
- The transport_frames callback will not be invoked
- Audio and animation data are published to the room under the specified publisher ID
Interrupt (LiveKit Egress Only)
The interrupt() method sends an interrupt signal to stop current audio processing. This is only available when using LiveKit egress mode.
# Send audio
request_id = await session.send_audio(audio_data, end=True)
# Later, if you need to interrupt (e.g., user wants to stop playback)
interrupted_id = await session.interrupt()
print(f"Interrupted request: {interrupted_id}")
The interrupt uses the most recent request ID, even after end=True was sent. This allows interrupting requests that have finished sending audio but are still being processed by the server.
Callbacks
Transport Frames Callback
Receives animation frames from the server:
def on_frame_received(frame_data: bytes, is_last: bool):
    print(f"Received frame: {len(frame_data)} bytes")
    if is_last:
        print("This is the last frame")
    # Process frame_data (contains serialized Message protobuf)
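Instead of sleeping for a fixed time, a caller can wait until the callback reports the last frame. The class below is a minimal sketch under assumptions: `FrameCollector` is a hypothetical name, and the code assumes the SDK invokes the callback on the same event loop thread that awaits `wait()`. If the callback runs on another thread, the event would have to be set via `loop.call_soon_threadsafe` instead.

```python
import asyncio


class FrameCollector:
    """Callable usable as transport_frames; buffers frames until the last one."""

    def __init__(self) -> None:
        self.frames = []
        self._done = asyncio.Event()

    def __call__(self, frame_data: bytes, is_last: bool) -> None:
        self.frames.append(frame_data)
        if is_last:
            # Wake up anyone blocked in wait().
            self._done.set()

    async def wait(self, timeout: float) -> list:
        """Return all frames once the last one arrived; raises on timeout."""
        await asyncio.wait_for(self._done.wait(), timeout)
        return self.frames
```

Create the collector inside the running coroutine, pass it as `transport_frames=collector`, and replace the fixed sleep with `frames = await collector.wait(timeout=10)`.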
Error Callback
Handles errors from the session:
def on_error(error: Exception):
    print(f"Session error: {error}")
Close Callback
Called when the session closes:
def on_close():
    print("Session has been closed")
API Reference
AvatarSession
Main class for managing avatar sessions.
Methods
- async init() - Initialize session and obtain token
- async start() -> str - Start WebSocket connection, returns connection ID
- async send_audio(audio: bytes, end: bool = False) -> str - Send audio data, returns request ID
- async interrupt() -> str - Interrupt current audio processing (LiveKit egress mode only), returns interrupted request ID
- async close() - Close the session and clean up resources
- config -> SessionConfig - Get session configuration (property)
SessionConfig
Configuration dataclass for avatar sessions.
Fields
- avatar_id: str - Avatar identifier
- api_key: str - API key for authentication
- app_id: str - Application identifier
- use_query_auth: bool - Send WebSocket auth via query params (web) instead of headers (mobile)
- expire_at: datetime - Session expiration time
- sample_rate: int - Audio sample rate (default: 16000)
- bitrate: int - Audio bitrate (default: 0; PCM typically uses 0)
- transport_frames: Callable[[bytes, bool], None] - Frame callback
- on_error: Callable[[Exception], None] - Error callback
- on_close: Callable[[], None] - Close callback
- console_endpoint_url: str - Console API URL
- ingress_endpoint_url: str - Ingress WebSocket URL
- livekit_egress: Optional[LiveKitEgressConfig] - LiveKit egress configuration
LiveKitEgressConfig
Configuration for streaming to a LiveKit room.
Fields
- url: str - LiveKit server URL (e.g., wss://livekit.example.com)
- api_key: str - LiveKit API key
- api_secret: str - LiveKit API secret
- room_name: str - LiveKit room name to join
- publisher_id: str - Publisher identity in the room
- extra_attributes: dict[str, str] - Extra LiveKit participant attributes
- idle_timeout: int - Idle timeout in seconds (0 uses server defaults)
SessionConfigBuilder
Builder for constructing SessionConfig with fluent interface.
Methods
All methods return self for chaining:
- with_avatar_id(avatar_id: str)
- with_api_key(api_key: str)
- with_app_id(app_id: str)
- with_use_query_auth(use_query_auth: bool)
- with_expire_at(expire_at: datetime)
- with_sample_rate(sample_rate: int)
- with_bitrate(bitrate: int)
- with_transport_frames(handler: Callable)
- with_on_error(handler: Callable)
- with_on_close(handler: Callable)
- with_console_endpoint_url(url: str)
- with_ingress_endpoint_url(url: str)
- with_livekit_egress(config: LiveKitEgressConfig)
- build() -> SessionConfig - Build the configuration
Utility Functions
generate_log_id() -> str- Generate unique log ID in format "YYYYMMDDHHMMSS_<nanoid>"
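To illustrate the documented "YYYYMMDDHHMMSS_<nanoid>" format, the sketch below splits a log ID back into its timestamp and suffix. `split_log_id` is a hypothetical helper, not an SDK function, and the sample ID in the usage note is made up. The time zone of the timestamp is not stated here, so the result is a naive datetime.

```python
from datetime import datetime


def split_log_id(log_id: str):
    """Split 'YYYYMMDDHHMMSS_<nanoid>' into (naive datetime, suffix)."""
    # Split on the first underscore only, since the suffix may contain more.
    stamp, separator, suffix = log_id.partition("_")
    if not separator or not suffix:
        raise ValueError(f"not a log ID: {log_id!r}")
    return datetime.strptime(stamp, "%Y%m%d%H%M%S"), suffix
```

For example, `split_log_id("20240131123456_ab_cd")` yields the timestamp 2024-01-31 12:34:56 and the suffix "ab_cd".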
Exceptions
SessionTokenError- Raised when session token request fails
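A failed token request during init() may be worth retrying. The helper below is a generic sketch, not SDK behavior: `init_with_retry` and its parameters are hypothetical, and the exception type is passed in because the import path of SessionTokenError is not stated here.

```python
import asyncio


async def init_with_retry(session, retry_on=Exception, attempts: int = 3,
                          base_delay: float = 0.5) -> int:
    """Call session.init() with exponential backoff; return the attempt that succeeded."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            await session.init()
            return attempt
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

With the SDK, this would be called as `await init_with_retry(session, retry_on=SessionTokenError)`, assuming that exception has been imported.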
Examples
See the examples directory for complete working examples:
- single_audio_clip - Basic usage with a single audio file
- http_service - Simple HTTP API that returns PCM audio (by sample rate) and generated animation Message binaries
Protocol Buffers
The SDK uses Protocol Buffers for efficient serialization. The proto definitions are in proto/message.proto.
Generating Proto Code
Proto code is generated using buf:
cd proto
buf generate
The generated Python code is placed in src/avatarkit/proto/generated/.
Message Types
- MESSAGE_CLIENT_CONFIGURE_SESSION (1) - Client session negotiation parameters
- MESSAGE_SERVER_CONFIRM_SESSION (2) - Server confirms and returns connection_id
- MESSAGE_CLIENT_AUDIO_INPUT (3) - Client audio input
- MESSAGE_SERVER_ERROR (4) - Server-side error message
- MESSAGE_SERVER_RESPONSE_ANIMATION (5) - Server animation response (end indicates final)
- MESSAGE_CLIENT_INTERRUPT (7) - Client interrupt signal to stop processing
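For logging or debugging, the numeric values listed above can be mirrored in a small enum. This is an illustrative sketch only: `MessageType` and `describe` are hypothetical names, and the generated protobuf code remains the authoritative definition. Value 6 is not listed above, so it is left undefined here.

```python
from enum import IntEnum


class MessageType(IntEnum):
    """Mirror of the message type numbers listed in this section."""
    MESSAGE_CLIENT_CONFIGURE_SESSION = 1
    MESSAGE_SERVER_CONFIRM_SESSION = 2
    MESSAGE_CLIENT_AUDIO_INPUT = 3
    MESSAGE_SERVER_ERROR = 4
    MESSAGE_SERVER_RESPONSE_ANIMATION = 5
    MESSAGE_CLIENT_INTERRUPT = 7


def describe(value: int) -> str:
    """Return the symbolic name for a type number, or a marker for unknown ones."""
    try:
        return MessageType(value).name
    except ValueError:
        return f"UNKNOWN({value})"
```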
Development
Setup
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and setup
git clone <repository-url>
cd avatar-sdk-python
uv sync
License
See LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.