Skip to main content

XAI for stream agents

Project description

XAI Plugin for Stream Agents

This package provides xAI (Grok) integration for the Stream Agents ecosystem, enabling you to use xAI's powerful language models in your conversational AI applications.

Features

  • Native xAI SDK Integration: Full access to xAI's chat completion and streaming APIs
  • Conversation Memory: Automatic conversation history management
  • Streaming Support: Real-time response streaming with standardized events
  • Multimodal Support: Handle text and image inputs
  • Event System: Subscribe to response events for custom handling
  • Easy Integration: Drop-in replacement for other LLM providers

Installation

uv add "vision-agents[xai]"
# or directly
uv add vision-agents-plugins-xai

Quick Start

import asyncio
from vision_agents.plugins import xai

async def main():
    # Initialize with your xAI API key
    llm = xai.LLM(
        model="grok-4",
        api_key="your_xai_api_key"  # or set XAI_API_KEY environment variable
    )

    # Simple response
    response = await llm.simple_response("Explain quantum computing in simple terms")

    print(f"\n\nComplete response: {response.text}")

if __name__ == "__main__":
    asyncio.run(main())

Advanced Usage

Conversation with Memory

from vision_agents.plugins import xai

llm = xai.LLM(model="grok-4", api_key="your_api_key")

# First message
await llm.simple_response("My name is Alice and I have 2 cats")

# Second message - the LLM remembers the context
response = await llm.simple_response("How many pets do I have?")
print(response.text)  # Will mention the 2 cats

Using Instructions

llm = LLM(
    model="grok-4",
    api_key="your_api_key"
)

# Create a response with system instructions
response = await llm.create_response(
    input="Tell me about the weather",
    instructions="You are a helpful weather assistant. Always be cheerful and optimistic.",
    stream=True
)

Multimodal Input

# Handle complex multimodal messages
advanced_message = [
    {
        "role": "user",
        "content": [
            {"type": "input_text", "text": "What do you see in this image?"},
            {"type": "input_image", "image_url": "https://example.com/image.jpg"},
        ],
    }
]

messages = LLM._normalize_message(advanced_message)
# Use with your conversation system

API Reference

XAILLM Class

Constructor

LLM(
    model: str = "grok-4",
    api_key: Optional[str] = None,
    client: Optional[AsyncClient] = None
)

Parameters:

  • model: xAI model to use (default: "grok-4")
  • api_key: Your xAI API key (default: reads from XAI_API_KEY environment variable)
  • client: Optional pre-configured xAI AsyncClient

Methods

async simple_response(text: str, processors=None, participant=None)

Generate a simple response to text input.

Parameters:

  • text: Input text to respond to
  • processors: Optional list of processors for video/voice AI context
  • participant: Optional participant object

Returns: LLMResponseEvent[Response] with the generated text

async create_response(input: str, instructions: str = "", model: str = None, stream: bool = True)

Create a response with full control over parameters.

Parameters:

  • input: Input text
  • instructions: System instructions for the model
  • model: Override the default model
  • stream: Whether to stream the response (default: True)

Returns: LLMResponseEvent[Response] with the generated text

Configuration

Environment Variables

  • XAI_API_KEY: Your xAI API key (required if not provided in constructor)

Text-to-Speech (TTS)

The plugin also ships an xai.TTS class powered by xAI's Grok Voice API. It provides five expressive voices with inline speech tags for fine-grained delivery control.

Usage

from vision_agents.plugins import xai

# Default voice (eve) — energetic, upbeat
tts = xai.TTS()

# Specify a voice
tts = xai.TTS(voice="ara")   # warm, friendly
tts = xai.TTS(voice="leo")   # authoritative, strong
tts = xai.TTS(voice="rex")   # confident, clear
tts = xai.TTS(voice="sal")   # smooth, balanced

# Custom output format
tts = xai.TTS(
    voice="rex",
    codec="mp3",
    sample_rate=44100,
    bit_rate=192000,
)

# Explicit API key (otherwise reads XAI_API_KEY env var)
tts = xai.TTS(api_key="xai-your-key-here")

Configuration

Parameter Type Default Description
api_key str env var xAI API key. Falls back to XAI_API_KEY environment variable.
voice str "eve" Voice ID: "eve", "ara", "leo", "rex", or "sal".
language str "en" BCP-47 language code or "auto" for detection.
codec str "pcm" Output codec: "pcm", "mp3", "wav", "mulaw", "alaw".
sample_rate int 24000 Sample rate: 800048000 Hz.
bit_rate int None MP3 bit rate (only used with codec="mp3").
base_url str None Override the xAI TTS API endpoint.
session object None Optional pre-existing aiohttp.ClientSession.

Voices

Voice Tone Best For
eve Energetic, upbeat Demos, announcements, upbeat content (default)
ara Warm, friendly Conversational interfaces, hospitality
leo Authoritative, strong Instructional, educational, healthcare
rex Confident, clear Business, corporate, customer support
sal Smooth, balanced Versatile — works for any context

Speech tags

Add expressiveness to synthesized speech with inline and wrapping tags:

Inline tags (placed where the expression should occur):

  • Pauses: [pause] [long-pause] [hum-tune]
  • Laughter: [laugh] [chuckle] [giggle] [cry]
  • Mouth sounds: [tsk] [tongue-click] [lip-smack]
  • Breathing: [breath] [inhale] [exhale] [sigh]

Wrapping tags (wrap text to change delivery):

  • Volume: <soft>text</soft> <loud>text</loud> <shout>text</shout>
  • Pitch/speed: <high-pitch>text</high-pitch> <low-pitch>text</low-pitch> <slow>text</slow> <fast>text</fast>
  • Style: <whisper>text</whisper> <sing>text</sing>

MP3 output

MP3 decoding requires pydub. Install it via the mp3 extra:

uv add "vision-agents-plugins-xai[mp3]"

Requirements

  • Python 3.10+
  • xai-sdk
  • vision-agents-core
  • Optional: pydub (for MP3 decoding via the mp3 extra)

License

Apache-2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vision_agents_plugins_xai-0.5.7.tar.gz (18.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vision_agents_plugins_xai-0.5.7-py3-none-any.whl (20.6 kB view details)

Uploaded Python 3

File details

Details for the file vision_agents_plugins_xai-0.5.7.tar.gz.

File metadata

  • Download URL: vision_agents_plugins_xai-0.5.7.tar.gz
  • Upload date:
  • Size: 18.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vision_agents_plugins_xai-0.5.7.tar.gz
Algorithm Hash digest
SHA256 e0c503340906fa2248dbac13b60b488941266c95211f5be8b71b691c35eb732a
MD5 693e6ce63758ffee1be3949429e91e99
BLAKE2b-256 dde1103d8f5fe5a6b5d2c3a1be9f2d4dd3d0231ed474b7a41993126c9f532f9a

See more details on using hashes here.

File details

Details for the file vision_agents_plugins_xai-0.5.7-py3-none-any.whl.

File metadata

  • Download URL: vision_agents_plugins_xai-0.5.7-py3-none-any.whl
  • Upload date:
  • Size: 20.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vision_agents_plugins_xai-0.5.7-py3-none-any.whl
Algorithm Hash digest
SHA256 0ff7b7033528d239a6dada5de739d0fa6f14e6e0c5d6ca600e9046d547c104f0
MD5 2d154472155c0384dbae1461a0a608a3
BLAKE2b-256 12316226d49422e878e83531f0cd676c13a2945d4f6fd2fac864baea2bb7290d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page