xAgent - Multi-Modal AI Agent System
🚀 A powerful multi-modal AI Agent system with real-time streaming responses
xAgent provides a complete AI assistant experience with text and image processing capabilities, intelligent vocabulary management, high-performance concurrent tool execution, and real-time streaming CLI interface. Built on FastAPI, Streamlit, and Redis for production-ready scalability.
📋 Table of Contents
- 🚀 Installation & Setup
- 🌐 Quick Start: HTTP Agent Server
- 🌐 Web Interface
- 💻 Command Line Interface (CLI)
- 🤖 Advanced Usage: Agent Class
- 🏗️ Architecture
- 🔧 Development Guide
- 🤖 API Reference
- 📊 Monitoring & Observability
- 🤝 Contributing
- 📄 License
🚀 Installation & Setup
Prerequisites
| Requirement | Version | Purpose |
|---|---|---|
| Python | 3.12+ | Core runtime |
| OpenAI API Key | - | AI model access |
Install via pip
```bash
pip install myxagent
```
Environment Configuration
Create a .env file in your project directory:
```bash
# Required
OPENAI_API_KEY=your_openai_api_key

# Optional - Redis persistence
REDIS_URL=your_redis_url_with_password

# Optional - Observability
LANGFUSE_SECRET_KEY=your_langfuse_key
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_HOST=https://cloud.langfuse.com

# Optional - Image upload to S3
AWS_ACCESS_KEY_ID=your_aws_access_key_id
AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key
AWS_REGION=us-east-1
BUCKET_NAME=your_bucket_name
```
If needed, you can manually load the .env file into your shell:

```bash
export $(grep -v '^#' .env | xargs)
```
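If you prefer loading the file from Python without extra dependencies, a minimal stdlib-only loader might look like the sketch below. This is an illustration only; the real python-dotenv package is more robust and handles quoting, escaping, and interpolation.

```python
import os

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping comments and blanks,
    and export them without overwriting existing variables."""
    loaded = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Ignore blank lines, comments, and lines without '='
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip()
            os.environ.setdefault(key, value)
            loaded[key] = value
    return loaded
```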
🌐 Quick Start: HTTP Agent Server
The simplest way to use xAgent is through the HTTP server. Just create a config file and start serving!
1. Create Agent Configuration
Create `agent_config.yaml`:

```yaml
agent:
  name: "MyAgent"
  system_prompt: |
    You are a helpful assistant. Your task is to assist users with their queries and tasks.
  model: "gpt-4.1-mini"
  capabilities:
    tools:
      - "web_search"        # Built-in web search
      - "draw_image"        # Built-in image generation (requires AWS credentials in .env)
      - "calculate_square"  # Custom tool from my_toolkit

server:
  host: "0.0.0.0"
  port: 8010
```
You can also add `mcp_servers` if you want to use MCP (Model Context Protocol) for dynamic tool loading:

```yaml
agent:
  ...
  capabilities:
    mcp_servers:
      - "http://localhost:8001/mcp/"  # MCP server URL
  ...
```
You can also set `use_local_session` to `false` if you want to use Redis for session persistence (requires `REDIS_URL` in .env):

```yaml
agent:
  ...
  use_local_session: false
  ...
```
2. Create Custom Tools (Optional)
Create a `my_toolkit/` directory with an `__init__.py` and your tool functions in a script such as `your_tools.py`:

```python
# my_toolkit/__init__.py
from .your_tools import calculate_square, greet_user

# The agent automatically discovers these tools; choose which to load in the agent config
TOOLKIT_REGISTRY = {
    "calculate_square": calculate_square,
    "greet_user": greet_user
}
```
Implement your tools in `your_tools.py`:

```python
# my_toolkit/your_tools.py
from xagent.utils.tool_decorator import function_tool

@function_tool()
def calculate_square(n: int) -> int:
    """Calculate the square of a number."""
    return n * n

@function_tool()
def greet_user(name: str) -> str:
    """Greet a user by name."""
    return f"Hello, {name}! Nice to meet you."
```
3. Start the Server
```bash
# Start the HTTP Agent Server with default configuration
xagent-server

# With custom configuration and toolkit
xagent-server --config agent_config.yaml --toolkit_path my_toolkit

# Server will be available at http://localhost:8010
```
4. Use the API
```bash
# Simple chat request
curl -X POST "http://localhost:8010/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user123",
    "session_id": "session456",
    "user_message": "Calculate the square of 15 and greet me as Alice"
  }'

# Streaming response
curl -X POST "http://localhost:8010/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user123",
    "session_id": "session456",
    "user_message": "Hello, how are you?",
    "stream": true
  }'
```
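The same requests can be made from Python with the standard library. This is a minimal sketch: the payload fields mirror the curl examples above, and the response body is printed raw since its exact schema isn't specified here.

```python
import json
import urllib.request

def build_chat_request(user_id: str, session_id: str, message: str, stream: bool = False) -> dict:
    """Assemble the JSON payload the /chat endpoint expects."""
    return {
        "user_id": user_id,
        "session_id": session_id,
        "user_message": message,
        "stream": stream,
    }

def chat(base_url: str, payload: dict) -> str:
    """POST the payload to /chat and return the raw response body."""
    req = urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

if __name__ == "__main__":
    payload = build_chat_request("user123", "session456", "Hello!")
    print(chat("http://localhost:8010", payload))
```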
5. Advanced Configuration
xAgent supports sophisticated multi-agent architectures and advanced configuration options for complex use cases.
Multi-Agent System with Sub-Agents
Create a hierarchical agent system where a coordinator agent delegates tasks to specialized sub-agents:
Main Agent Configuration (`coordinator_agent.yaml`):

```yaml
agent:
  name: "Agent"
  system_prompt: |
    You are Orion, a helpful, concise, and accurate assistant who coordinates specialized agents.
    - Always answer clearly and directly.
    - When the task requires research, delegate it to the `research_agent`.
    - When the task requires writing, editing, or creative content generation, delegate it to the `write_agent`.
    - Keep responses focused, relevant, and free of unnecessary filler.
    - If more details or clarifications are needed, ask before proceeding.
    - Maintain a friendly and professional tone while ensuring efficiency in task delegation.
    - Your goal is to act as the central hub, ensuring each request is handled by the most capable resource.
  model: "gpt-4.1"
  capabilities:
    tools:
      - "char_count"  # Custom tool example
    mcp_servers:
      - "http://localhost:8001/mcp/"
    sub_agents:
      - name: "research_agent"
        description: "Research-focused agent for information gathering and analysis"
        server_url: "http://localhost:8011"
      - name: "write_agent"
        description: "Expert agent for writing tasks, including content creation and editing"
        server_url: "http://localhost:8012"
  use_local_session: true

server:
  host: "0.0.0.0"
  port: 8010
```
Research Specialist (`research_agent.yaml`):

```yaml
agent:
  name: "Research Agent"
  system_prompt: |
    You are Tom, a research specialist.
    Your role is to gather accurate and up-to-date information using web search, evaluate sources critically, and deliver well-organized, insightful findings.
    - Always verify the credibility of your sources.
    - Present information in a clear, concise, and structured format.
    - Highlight key facts, trends, and supporting evidence.
    - When applicable, compare multiple sources to ensure accuracy.
    - If information is uncertain or unavailable, state this transparently.
  model: "gpt-4.1-mini"
  capabilities:
    tools:
      - "web_search"  # Built-in web search tool
    mcp_servers:
      - "http://localhost:8002/mcp/"
  use_local_session: true

server:
  host: "0.0.0.0"
  port: 8011
```
Writing Specialist (`writing_agent.yaml`):

```yaml
agent:
  name: "Writing Agent"
  system_prompt: |
    You are Alice, a professional writer.
    Your role is to craft clear, engaging, and well-structured content tailored to the intended audience and purpose.
    - Adapt tone, style, and format to match the context.
    - Use vivid language and strong storytelling techniques when appropriate.
    - Ensure clarity, coherence, and grammatical accuracy.
    - Organize ideas logically and maintain a smooth flow.
    - Revise and refine content for maximum impact and readability.
  model: "gpt-4.1-mini"
  capabilities:
    tools: []
    mcp_servers:
      - "http://localhost:8003/mcp/"
  use_local_session: true

server:
  host: "0.0.0.0"
  port: 8012
```
Starting Multi-Agent System
```bash
# Start sub-agents first
xagent-server --config research_agent.yaml > logs/research.log 2>&1 &
xagent-server --config writing_agent.yaml > logs/writing.log 2>&1 &

# Start coordinator agent
xagent-server --config coordinator_agent.yaml --toolkit_path my_toolkit > logs/coordinator.log 2>&1 &

# Verify all agents are running
curl http://localhost:8010/health
curl http://localhost:8011/health
curl http://localhost:8012/health

# Now you can chat with the coordinator agent through its API
curl -X POST "http://localhost:8010/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user123",
    "session_id": "session456",
    "user_message": "Research the latest advancements in AI and write a summary."
  }'
```
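When scripting the startup, it helps to wait until every agent's `/health` endpoint responds before sending traffic. The sketch below is stdlib-only; the port list follows the configs above, and the assumption that `/health` answers with HTTP 200 when ready is inferred from the curl checks, not from xAgent's documented behavior.

```python
import time
import urllib.request

AGENT_PORTS = [8010, 8011, 8012]  # coordinator, research, writing

def health_url(port: int, host: str = "localhost") -> str:
    """Build the /health URL for one agent server."""
    return f"http://{host}:{port}/health"

def wait_until_healthy(ports, timeout: float = 30.0) -> bool:
    """Poll each agent's /health endpoint until all respond or the timeout expires."""
    deadline = time.time() + timeout
    pending = set(ports)
    while pending and time.time() < deadline:
        for port in list(pending):
            try:
                with urllib.request.urlopen(health_url(port), timeout=2) as resp:
                    if resp.status == 200:
                        pending.discard(port)
            except OSError:
                pass  # Not up yet; retry on the next pass
        if pending:
            time.sleep(1)
    return not pending
```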
🌐 Web Interface
xAgent provides a user-friendly Streamlit web interface for interactive conversations with your AI agent.
Launch Web Interface
```bash
# Start the web interface with default settings
xagent-web

# With custom agent server URL
xagent-web --agent-server http://localhost:8010

# With custom host and port
xagent-web --host 0.0.0.0 --port 8501 --agent-server http://localhost:8010
```
Web Interface Options
| Option | Description | Default |
|---|---|---|
| `--agent-server` | URL of the xAgent server | `http://localhost:8010` |
| `--host` | Host address for Streamlit server | `0.0.0.0` |
| `--port` | Port for Streamlit server | `8501` |
Complete Web Setup Example
```bash
# Terminal 1: Start the agent server
xagent-server --config agent_config.yaml --toolkit_path my_toolkit

# Terminal 2: Start the web interface
xagent-web --agent-server http://localhost:8010

# Access the web interface at http://localhost:8501
```
💻 Command Line Interface (CLI)
xAgent provides a powerful command-line interface for quick interactions and testing. The CLI supports both single-question mode and interactive chat sessions, with real-time streaming responses for a smooth conversational experience.
Note: sub-agents are not currently supported in CLI mode.
Quick Start
```bash
# Interactive chat mode with streaming (default)
xagent-cli

# Use custom configuration
xagent-cli chat --config my_config.yaml --toolkit_path my_toolkit --user_id developer --session_id session123 --verbose

# Ask a single question (non-streaming)
xagent-cli ask "What is the capital of France?"
```
Interactive Chat Mode
Start a continuous conversation with the agent; streaming is enabled by default:

```text
$ xagent-cli chat

🤖 Welcome to xAgent CLI!
   Agent: Agent
   Model: gpt-4.1-mini
   Tools: 3 loaded
   Session: cli_session_abc123
   Streaming: Enabled

Type 'exit', 'quit', or 'bye' to end the session.
Type 'clear' to clear the session history.
Type 'stream on/off' to toggle streaming mode.
Type 'help' for available commands.
--------------------------------------------------

👤 You: Hello, how are you?
🤖 Agent: Hello! I'm doing well, thank you for asking...
[Response streams in real-time]

👤 You: help
📋 Available commands:
  exit, quit, bye - Exit the chat session
  clear - Clear session history
  stream on/off - Toggle streaming mode
  help - Show this help message

👤 You: exit
👋 Goodbye!
```
CLI Commands Reference
| Command | Description | Example |
|---|---|---|
| `xagent-cli` | Start interactive chat with streaming (default) | `xagent-cli` |
| `xagent-cli chat` | Start interactive chat explicitly | `xagent-cli chat --config my_config.yaml` |
| `xagent-cli ask <message>` | Ask a single question (non-streaming) | `xagent-cli ask "Hello world"` |
CLI Options
| Option | Description | Default |
|---|---|---|
| `--config` | Configuration file path | `config/agent.yaml` |
| `--toolkit_path` | Custom toolkit directory | `toolkit` |
| `--user_id` | User identifier | Auto-generated |
| `--session_id` | Session identifier | Auto-generated |
| `--verbose`, `-v` | Enable verbose logging | `False` |
🤖 Advanced Usage: Agent Class
For more control and customization, use the Agent class directly in your Python code.
Basic Agent Usage
```python
import asyncio
from xagent.core import Agent, Session

async def main():
    # Create agent
    agent = Agent(
        name="my_assistant",
        system_prompt="You are a helpful AI assistant.",
        model="gpt-4.1-mini"
    )

    # Create session for conversation management
    session = Session(session_id="session456")

    # Chat interaction
    response = await agent.chat("Hello, how are you?", session)
    print(response)

    # Streaming response example
    response = await agent.chat("Tell me a story", session, stream=True)
    async for event in response:
        print(event, end="")

asyncio.run(main())
```
Adding Custom Tools
```python
import asyncio
import time
import httpx
from xagent.utils.tool_decorator import function_tool
from xagent.core import Agent, Session

# Sync tools - automatically converted to async
@function_tool()
def calculate_square(n: int) -> int:
    """Calculate square of a number."""
    time.sleep(0.1)  # Simulate CPU work
    return n * n

# Async tools - used directly for I/O operations
@function_tool()
async def fetch_weather(city: str) -> str:
    """Fetch weather data from API."""
    async with httpx.AsyncClient() as client:
        await asyncio.sleep(0.5)  # Simulate API call
        return f"Weather in {city}: 22°C, Sunny"

async def main():
    # Create agent with custom tools
    agent = Agent(
        tools=[calculate_square, fetch_weather],
        model="gpt-4.1-mini"
    )

    session = Session(user_id="user123")

    # Agent handles all tools automatically
    response = await agent.chat(
        "Calculate the square of 15 and get weather for Tokyo",
        session
    )
    print(response)

asyncio.run(main())
```
Structured Outputs with Pydantic
```python
import asyncio
from pydantic import BaseModel
from xagent.core import Agent, Session
from xagent.tools import web_search

class WeatherReport(BaseModel):
    location: str
    temperature: int
    condition: str
    humidity: int

async def get_structured_response():
    agent = Agent(model="gpt-4.1-mini", tools=[web_search])
    session = Session(user_id="user123")

    # Request structured output
    weather_data = await agent.chat(
        "what's the weather like in Hangzhou?",
        session,
        output_type=WeatherReport
    )

    print(f"Location: {weather_data.location}")
    print(f"Temperature: {weather_data.temperature}°F")
    print(f"Condition: {weather_data.condition}")
    print(f"Humidity: {weather_data.humidity}%")

asyncio.run(get_structured_response())
```
Agent as Tool Pattern
```python
import asyncio
from xagent.core import Agent, Session
from xagent.db import MessageDB
from xagent.tools import web_search

async def agent_as_tool_example():
    # Create specialized agents
    researcher_agent = Agent(
        name="research_specialist",
        system_prompt="Research expert. Gather information and provide insights.",
        model="gpt-4.1-mini",
        tools=[web_search]
    )

    # Convert agent to tool
    message_db = MessageDB()
    research_tool = researcher_agent.as_tool(
        name="researcher",
        description="Research topics and provide detailed analysis",
        message_db=message_db
    )

    # Main coordinator agent with specialist tools
    coordinator = Agent(
        name="coordinator",
        tools=[research_tool],
        system_prompt="Coordination agent that delegates to specialists.",
        model="gpt-4.1"
    )

    session = Session(user_id="user123")

    # Complex multi-step task
    response = await coordinator.chat(
        "Research renewable energy benefits and write a brief summary",
        session
    )
    print(response)

asyncio.run(agent_as_tool_example())
```
Persistent Sessions with Redis
```python
import asyncio
from xagent.core import Agent, Session
from xagent.db import MessageDB

async def chat_with_persistence():
    # Initialize Redis-backed message storage
    message_db = MessageDB()

    # Create agent
    agent = Agent(
        name="persistent_agent",
        model="gpt-4.1-mini"
    )

    # Create session with Redis persistence
    session = Session(
        user_id="user123",
        session_id="persistent_session",
        message_db=message_db
    )

    # Chat with automatic message persistence
    response = await agent.chat("Remember this: my favorite color is blue", session)
    print(response)

    # Later conversation - context is preserved in Redis
    response = await agent.chat("What's my favorite color?", session)
    print(response)

asyncio.run(chat_with_persistence())
```
🏗️ Architecture
Modern Design for High Performance
```text
xAgent/
├── 🤖 xagent/                  # Core async agent framework
│   ├── __init__.py             # Package initialization and exports
│   ├── __version__.py          # Version information
│   ├── core/                   # Agent and session management
│   │   ├── __init__.py         # Core exports (Agent, Session, HTTPAgentServer)
│   │   ├── agent.py            # Main Agent class with chat
│   │   ├── session.py          # Session management with operations
│   │   ├── server.py           # Standalone HTTP Agent Server
│   │   ├── cli.py              # Command line interface
│   │   └── base.py             # Base classes and utilities
│   ├── db/                     # Database layer (Redis)
│   │   ├── __init__.py         # Database exports
│   │   └── message.py          # Message persistence
│   ├── schemas/                # Data models and types (Pydantic)
│   │   ├── __init__.py         # Schema exports
│   │   └── message.py          # Message and ToolCall models
│   ├── tools/                  # Tool ecosystem
│   │   ├── __init__.py         # Tool registry (web_search, draw_image)
│   │   ├── openai_tool.py      # OpenAI tool integrations
│   │   └── mcp_demo/           # MCP demo server and client
│   ├── utils/                  # Utility functions
│   ├── multi/                  # Multi-agent support
│   │   ├── __init__.py         # Multi-agent exports
│   │   ├── swarm.py            # Agent swarm coordination
│   │   └── workflow.py         # Workflow management
│   └── frontend/               # Web interface components
│       ├── app.py              # Streamlit chat application
│       └── launcher.py         # Web interface launcher
├── 🛠️ toolkit/                 # Custom tool ecosystem
│   ├── __init__.py             # Toolkit registry
│   ├── tools.py                # Custom tools (char_count)
│   └── mcp_server.py           # Main MCP server
├── ⚙️ config/                  # Configuration files
│   ├── agent.yaml              # Agent server configuration
│   └── sub_agents_example/     # Sub-agent configuration examples
├── 📝 examples/                # Usage examples and demos
├── 🧪 tests/                   # Comprehensive test suite
└── 📁 logs/                    # Log files
```
🔄 Core Components
| Component | Purpose | Technology |
|---|---|---|
| Agent | Core conversation handler | OpenAI API + AsyncIO |
| Session | Message history management | Redis + Operations |
| MessageDB | Scalable persistence layer | Redis with client |
| Tools | Extensible function ecosystem | Auto sync-to-async conversion |
| MCP | Dynamic tool loading protocol | HTTP client |
🛠️ Creating Tools
Both sync and async functions work seamlessly:
```python
import asyncio
import time
from xagent.utils.tool_decorator import function_tool

# ✅ Sync tool - perfect for CPU-bound operations
@function_tool()
def my_sync_tool(input_text: str) -> str:
    """Process text synchronously (runs in thread pool)."""
    time.sleep(0.1)  # Simulate CPU-intensive work
    return f"Sync processed: {input_text}"

# ✅ Async tool - ideal for I/O-bound operations
@function_tool()
async def my_async_tool(input_text: str) -> str:
    """Process text asynchronously."""
    await asyncio.sleep(0.1)  # Simulate async I/O operation
    return f"Async processed: {input_text}"
```
📋 Tool Development Guidelines
| Use Case | Tool Type | Example |
|---|---|---|
| CPU-bound | Sync functions | Math calculations, data processing |
| I/O-bound | Async functions | API calls, database queries |
| Simple operations | Sync functions | String manipulation, file operations |
| Network requests | Async functions | HTTP requests, WebSocket connections |
⚠️ Note: Recursive functions are not supported as tools due to potential stack overflow issues in async environments.
🔄 Automatic Conversion
xAgent's @function_tool() decorator automatically handles sync-to-async conversion:
- Sync functions → Run in thread pool (non-blocking)
- Async functions → Run directly on event loop
- Concurrent execution → All tools execute in parallel when called
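The sync-to-async conversion described above can be approximated with `asyncio.to_thread`. The sketch below illustrates the general pattern only; it is not xAgent's actual decorator internals.

```python
import asyncio
import inspect
from functools import wraps

def to_async_tool(fn):
    """Approximate sync-to-async conversion: async functions pass through
    unchanged; sync functions are offloaded to the default thread pool so
    they don't block the event loop."""
    if inspect.iscoroutinefunction(fn):
        return fn

    @wraps(fn)
    async def wrapper(*args, **kwargs):
        return await asyncio.to_thread(fn, *args, **kwargs)

    return wrapper
```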
📝 Override Defaults
You can override the default tool name and description using the function_tool decorator:
```python
@function_tool(name="custom_square", description="Calculate the square of a number")
def calculate_square(n: int) -> int:
    return n * n
```
🤖 API Reference
Core Classes
🤖 Agent
Main AI agent class for handling conversations and tool execution.
```python
Agent(
    name: Optional[str] = None,
    system_prompt: Optional[str] = None,
    model: Optional[str] = None,
    client: Optional[AsyncOpenAI] = None,
    tools: Optional[list] = None,
    mcp_servers: Optional[str | list] = None,
    sub_agents: Optional[List[Union[tuple[str, str, str], 'Agent']]] = None
)
```
Key Methods:
- `async chat(user_message, session, **kwargs) -> str | BaseModel` - Main chat interface
- `async __call__(user_message, session, **kwargs) -> str | BaseModel` - Shorthand for `chat`
- `as_tool(name, description, message_db) -> Callable` - Convert agent to tool
Parameters:
- `name`: Agent identifier (default: `"default_agent"`)
- `system_prompt`: Instructions for the agent behavior
- `model`: OpenAI model to use (default: `"gpt-4.1-mini"`)
- `client`: Custom AsyncOpenAI client instance
- `tools`: List of function tools
- `mcp_servers`: MCP server URLs for dynamic tool loading
- `sub_agents`: List of sub-agent configurations (name, description, server URL)
💬 Session
Manages conversation history and persistence with operations.
```python
Session(
    user_id: str,
    session_id: Optional[str] = None,
    message_db: Optional[MessageDB] = None
)
```
Key Methods:
- `async add_messages(messages: Message | List[Message]) -> None` - Store messages
- `async get_messages(count: int = 20) -> List[Message]` - Retrieve message history
- `async clear_session() -> None` - Clear conversation history
- `async pop_message() -> Optional[Message]` - Remove last non-tool message
🗄️ MessageDB
Redis-backed message persistence layer.
```python
# Initialize with environment variables or defaults
message_db = MessageDB()

# Usage with session
session = Session(
    user_id="user123",
    message_db=message_db
)
```
Important Considerations
| Aspect | Details |
|---|---|
| Tool functions | Can be sync or async (automatic conversion) |
| Agent interactions | Always use await |
| Context | Run in context with asyncio.run() |
| Concurrency | All tools execute in parallel automatically |
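The parallel-execution point can be illustrated with `asyncio.gather`, which is the standard way to fan out independent awaitables; this is a sketch of the pattern, not xAgent's internal scheduler.

```python
import asyncio

async def slow_tool(name: str, delay: float) -> str:
    """Stand-in for an I/O-bound tool call."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def run_tools_concurrently():
    # Both tools run at the same time, so total wall time is roughly
    # max(delay), not the sum; gather preserves argument order.
    results = await asyncio.gather(
        slow_tool("search", 0.2),
        slow_tool("draw", 0.2),
    )
    return results
```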
📊 Monitoring & Observability
xAgent includes comprehensive observability features:
- 🔍 Langfuse Integration - Track AI interactions and performance
- 📝 Structured Logging - Throughout the entire system
- ❤️ Health Checks - API monitoring endpoints
- ⚡ Performance Metrics - Tool execution time and success rates
🤝 Contributing
We welcome contributions! Here's how to get started:
Development Workflow
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Commit your changes: `git commit -m 'Add amazing feature'`
4. Push to the branch: `git push origin feature/amazing-feature`
5. Open a Pull Request
Development Guidelines
| Area | Requirements |
|---|---|
| Code Style | Follow PEP 8 standards |
| Testing | Add tests for new features |
| Documentation | Update docs as needed |
| Type Safety | Use type hints throughout |
| Commits | Follow conventional commit messages |
Package Upload
First time upload
```bash
pip install build twine
python -m build
twine upload dist/*
```
Subsequent uploads
```bash
rm -rf dist/ build/ *.egg-info/
python -m build
twine upload dist/*
```
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
Special thanks to the amazing open source projects that make xAgent possible:
- OpenAI - GPT models powering our AI
- FastAPI - Robust async API framework
- Streamlit - Intuitive web interface
- Redis - High-performance data storage
- Langfuse - Observability and monitoring
📞 Support & Community
| Resource | Link | Purpose |
|---|---|---|
| 🐛 Issues | GitHub Issues | Bug reports & feature requests |
| 💬 Discussions | GitHub Discussions | Community chat & Q&A |
| 📧 Email | zhangjun310@live.com | Direct support |