Multi-Modal AI Agent System
Project description
xAgent - Multi-Modal AI Agent System
๐ A powerful and easy-to-use AI Agent system with real-time streaming responses
xAgent provides a complete AI assistant experience with text and image processing, tool execution, HTTP server, web interface, and streaming CLI. Perfect for both quick prototypes and production deployments.
๐ Table of Contents
- ๐ Quick Start
- ๐ง Installation
- ๐ HTTP Server
- ๐ Web Interface
- ๐ป Command Line Interface
- ๐ค Python API
- ๐ Multi-Agent Workflows
- ๐ Documentation
- ๐ค Contributing
- ๐ License
๐ Quick Start
Get up and running with xAgent in just a few commands:
# Install xAgent
pip install myxagent
# Set your OpenAI API key
export OPENAI_API_KEY=your_openai_api_key
# Start interactive CLI chat
xagent-cli
# Or quickly ask a question
xagent-cli --ask "who are you?"
xagent-cli --ask "what's the weather in Hangzhou and Shanghai?" -v
That's it!
HTTP Server
The easiest way to deploy xAgent in production is through the HTTP server. Once you start the server, you can interact with your AI agent via HTTP API.
# Start the HTTP server
xagent-server
# Server runs at http://localhost:8010
Chat via HTTP API
curl -X POST "http://localhost:8010/chat" \
-H "Content-Type: application/json" \
-d '{
"user_id": "user123",
"session_id": "session456",
"user_message": "Hello, how are you?"
}'
Or launch web interface
xagent-web
๐ง Installation
Prerequisites
- Python 3.12+
- OpenAI API Key
Install from PyPI
# Latest version
pip install myxagent
# Upgrade existing installation
pip install --upgrade myxagent
# Using different mirrors
pip install myxagent -i https://pypi.org/simple
pip install myxagent -i https://mirrors.aliyun.com/pypi/simple # China users
Environment Setup
Create a .env file in your project directory:
# Required
OPENAI_API_KEY=your_openai_api_key
# Redis persistence
REDIS_URL=your_redis_url_with_password
# Observability
LANGFUSE_SECRET_KEY=your_langfuse_key
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_HOST=https://cloud.langfuse.com
# Image upload to S3
AWS_ACCESS_KEY_ID=your_aws_access_key_id
AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key
AWS_REGION=us-east-1
BUCKET_NAME=your_bucket_name
๐ HTTP Server
Here's how to run the xAgent HTTP server:
Quick Start
# Start with default settings
xagent-server
# Server runs at http://localhost:8010
Basic Configuration
Create agent_config.yaml:
agent:
name: "MyAgent"
system_prompt: "You are a helpful assistant"
model: "gpt-4.1-mini"
capabilities:
tools:
- "web_search" # Built-in web search
- "draw_image" # Built-in image generation (need set aws credentials for image upload)
- "custom_tool" # Your custom tools
server:
host: "0.0.0.0"
port: 8010
more advanced configurations can be found in Configuration Reference.
Custom Tools
Initialize your project with default toolkit structure:
# Create default config and toolkit structure
xagent-cli --init
This creates:
config/agent.yaml- Configuration filemy_toolkit/- Custom tools directory with examples
Example toolkit structure:
# my_toolkit/__init__.py
from .tools import *
TOOLKIT_REGISTRY = {
"calculate_square": calculate_square,
"fetch_weather": fetch_weather
}
# my_toolkit/tools.py
import asyncio
from xagent.utils.tool_decorator import function_tool
@function_tool()
def calculate_square(n: int) -> int:
"""Calculate the square of a number."""
return n * n
@function_tool()
async def fetch_weather(city: str) -> str:
"""Fetch weather data from an API."""
# Simulate API call
await asyncio.sleep(0.5)
return f"Weather in {city}: 22ยฐC, Sunny"
Start with Custom Config
# Use the generated config and toolkit
xagent-server --config config/agent.yaml --toolkit_path my_toolkit
API Usage
# Simple chat
curl -X POST "http://localhost:8010/chat" \
-H "Content-Type: application/json" \
-d '{
"user_id": "user123",
"session_id": "session456",
"user_message": "Calculate the square of 15"
}'
# Streaming response
curl -X POST "http://localhost:8010/chat" \
-H "Content-Type: application/json" \
-d '{
"user_id": "user123",
"session_id": "session456",
"user_message": "Tell me a story",
"stream": true
}'
For advanced configurations including multi-agent systems and structured outputs, see Configuration Reference.
๐ Web Interface
User-friendly Streamlit chat interface for interactive conversations with your AI agent.
# Start the chat interface with default settings
xagent-web
# With custom agent server URL
xagent-web --agent-server http://localhost:8010
# With custom host and port
xagent-web --host 0.0.0.0 --port 8501 --agent-server http://localhost:8010
๐ป Command Line Interface
Interactive CLI with real-time streaming for quick testing and development.
Basic Usage
# Interactive chat mode (default with streaming)
xagent-cli
# Ask a single question
xagent-cli --ask "What is the capital of France?"
# Use custom configuration
xagent-cli --config my_config.yaml --toolkit_path my_toolkit
Interactive Chat Session
$ xagent-cli
โญโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฎ
โ ๐ค Welcome to xAgent CLI! โ
โฐโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฏ
๐ Config: Default configuration
๐ค Agent: Agent
๐ง Model: gpt-4o-mini
๐ ๏ธ Tools: 1 loaded
๐ Session: cli_session_d02daf21
โ๏ธ Status: ๐ Verbose: Off | ๐ Stream: On
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
๐ Quick Start:
โข Type your message to chat with the agent
โข Use 'help' to see all available commands
โข Use 'exit', 'quit', or 'bye' to end session
โข Use 'clear' to reset conversation history
โข Use 'stream on/off' to toggle response streaming
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
๐ค You: Hello, how are you?
๐ค Agent: Hello! I'm doing well, thank you for asking...
๐ค You: help
โญโ ๐ Commands โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฎ
โ exit, quit, bye Exit the chat session โ
โ clear Clear conversation history โ
โ stream on/off Toggle streaming response mode โ
โ help Show this help message โ
โฐโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฏ
โญโ ๐ง Built-in Tools โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฎ
โ 1. web_search โ
โฐโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฏ
๐ค You: exit
โญโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฎ
โ ๐ Thank you for using xAgent CLI! โ
โ See you next time! ๐ โ
โฐโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฏ
CLI Options
| Option | Description | Example |
|---|---|---|
--config |
Configuration file path | --config my_config.yaml |
--toolkit_path |
Custom toolkit directory | --toolkit_path my_toolkit |
--user_id |
User ID for session | --user_id user123 |
--session_id |
Session ID for chat | --session_id session456 |
--ask |
Ask single question and exit | --ask "Hello world" |
--init |
Create default config and toolkit | --init |
--verbose, -v |
Enable verbose logging | --verbose |
๐ค Python API
Use xAgent directly in your Python applications with full control and customization.
Basic Usage
import asyncio
from xagent.core import Agent
async def main():
# Create agent
agent = Agent(
name="my_assistant",
system_prompt="You are a helpful AI assistant.",
model="gpt-4.1-mini"
)
# Chat interaction
response = await agent.chat(
user_message="Hello, how are you?",
user_id="user123",
session_id="session456"
)
print(response)
asyncio.run(main())
Streaming Responses
async def streaming_example():
agent = Agent()
response = await agent.chat(
user_message="Tell me a story",
user_id="user123",
session_id="session456",
stream=True
)
async for chunk in response:
print(chunk, end="")
asyncio.run(streaming_example())
Adding Custom Tools
import asyncio
import time
import httpx
from xagent.utils.tool_decorator import function_tool
from xagent.core import Agent
# Sync tools - automatically converted to async
@function_tool()
def calculate_square(n: int) -> int:
"""Calculate square of a number."""
time.sleep(0.1) # Simulate CPU work
return n * n
# Async tools - used directly for I/O operations
@function_tool()
async def fetch_weather(city: str) -> str:
"""Fetch weather data from API."""
async with httpx.AsyncClient() as client:
await asyncio.sleep(0.5) # Simulate API call
return f"Weather in {city}: 22ยฐC, Sunny"
async def main():
# Create agent with custom tools
agent = Agent(
tools=[calculate_square, fetch_weather],
model="gpt-4.1-mini"
)
# Agent handles all tools automatically
response = await agent.chat(
user_message="Calculate the square of 15 and get weather for Tokyo",
user_id="user123",
session_id="session456"
)
print(response)
asyncio.run(main())
Structured Outputs
With Pydantic structured outputs, you can:
- Parse and validate an agentโs response into typed data
- Easily extract specific fields
- Ensure the response matches the expected format
- Guarantee type safety in your application
- Reliably chain multi-step tasks using structured data
import asyncio
from pydantic import BaseModel
from xagent.core import Agent
from xagent.tools import web_search
class WeatherReport(BaseModel):
location: str
temperature: int
condition: str
humidity: int
class Step(BaseModel):
explanation: str
output: str
class MathReasoning(BaseModel):
steps: list[Step]
final_answer: str
async def get_structured_response():
agent = Agent(model="gpt-4.1-mini",
tools=[web_search],
output_type=WeatherReport) # You can set a default output type here or leave it None
# Request structured output for weather
weather_data = await agent.chat(
user_message="what's the weather like in Hangzhou?",
user_id="user123",
session_id="session456"
)
print(f"Location: {weather_data.location}")
print(f"Temperature: {weather_data.temperature}ยฐF")
print(f"Condition: {weather_data.condition}")
print(f"Humidity: {weather_data.humidity}%")
# Request structured output for mathematical reasoning (overrides output_type)
reply = await agent.chat(
user_message="how can I solve 8x + 7 = -23",
user_id="user123",
session_id="session456",
output_type=MathReasoning
) # Override output_type for this call
for index, step in enumerate(reply.steps):
print(f"Step {index + 1}: {step.explanation} => Output: {step.output}")
print("Final Answer:", reply.final_answer)
if __name__ == "__main__":
asyncio.run(get_structured_response())
Agent as Tool Pattern
import asyncio
from xagent.core import Agent
from xagent.components import MessageStorageLocal
from xagent.tools import web_search
async def agent_as_tool_example():
# Create specialized agents with message storage
message_storage = MessageStorageLocal()
researcher_agent = Agent(
name="research_specialist",
system_prompt="Research expert. Gather information and provide insights.",
model="gpt-4.1-mini",
tools=[web_search],
message_storage=message_storage
)
# Convert agent to tool
research_tool = researcher_agent.as_tool(
name="researcher",
description="Research topics and provide detailed analysis"
)
# Main coordinator agent with specialist tools
coordinator = Agent(
name="coordinator",
tools=[research_tool],
system_prompt="Coordination agent that delegates to specialists.",
model="gpt-4.1",
message_storage=message_storage
)
# Complex multi-step task
response = await coordinator.chat(
user_message="Research renewable energy benefits and write a brief summary",
user_id="user123",
session_id="session456"
)
print(response)
asyncio.run(agent_as_tool_example())
Persistent Sessions with Redis
import asyncio
from xagent.core import Agent
from xagent.components import MessageStorageRedis
async def chat_with_persistence():
# Initialize Redis-backed message storage
message_storage = MessageStorageRedis()
# Create agent with Redis persistence
agent = Agent(
name="persistent_agent",
model="gpt-4.1-mini",
message_storage=message_storage
)
# Chat with automatic message persistence
response = await agent.chat(
user_message="Remember this: my favorite color is blue",
user_id="user123",
session_id="persistent_session"
)
print(response)
# Later conversation - context is preserved in Redis
response = await agent.chat(
user_message="What's my favorite color?",
user_id="user123",
session_id="persistent_session"
)
print(response)
asyncio.run(chat_with_persistence())
you can implement your own message storage by inheriting from MessageStorageBase and implementing the required methods like add_messages, get_messages, etc.
๐ Multi-Agent Workflows
xAgent enables powerful multi-agent coordination patterns for complex tasks.
Workflow Patterns
| Pattern | Use Case | Example |
|---|---|---|
| Sequential | Pipeline processing | Research -> Analysis -> Summary |
| Parallel | Multiple perspectives | Expert panel reviews |
| Graph | Complex dependencies | A->B, A->C, B&C->D |
| Hybrid | Multi-stage workflows | Sequential + Parallel + Graph stages |
Quick Example
import asyncio
from xagent import Agent
from xagent.multi.workflow import Workflow
async def workflow_example():
# Create specialized agents with detailed expertise
market_researcher = Agent(
name="MarketResearcher",
system_prompt="""You are a senior market research analyst with 10+ years of experience.
Your expertise includes:
- Industry trend analysis and forecasting
- Competitive landscape assessment
- Market size estimation and growth projections
- Consumer behavior analysis
- Technology adoption patterns
Always provide data-driven insights with specific metrics, sources, and actionable recommendations.""",
model="gpt-4o"
)
data_scientist = Agent(
name="DataScientist",
system_prompt="""You are a senior data scientist specializing in business intelligence and predictive analytics.
Your core competencies:
- Statistical analysis and hypothesis testing
- Predictive modeling and machine learning
- Data visualization and storytelling
- Risk assessment and scenario planning
- Performance metrics and KPI development
Transform raw research into quantitative insights, identify patterns, and build predictive models.""",
model="gpt-4o"
)
business_writer = Agent(
name="BusinessWriter",
system_prompt="""You are an executive business writer and strategic communications expert.
Your specializations:
- Executive summary creation
- Strategic recommendation development
- Stakeholder communication
- Risk and opportunity assessment
- Implementation roadmap design
Create compelling, actionable business reports that drive decision-making at the C-level.""",
model="gpt-4o"
)
financial_analyst = Agent(
name="FinancialAnalyst",
system_prompt="""You are a CFA-certified financial analyst with expertise in valuation and investment analysis.
Your focus areas:
- Financial modeling and valuation
- Investment risk assessment
- ROI and NPV calculations
- Capital allocation strategies
- Market opportunity sizing
Provide detailed financial analysis with concrete numbers, projections, and investment recommendations.""",
model="gpt-4o"
)
strategy_consultant = Agent(
name="StrategyConsultant",
system_prompt="""You are a senior strategy consultant from a top-tier consulting firm.
Your expertise includes:
- Strategic planning and execution
- Business model innovation
- Competitive strategy development
- Organizational transformation
- Change management
Synthesize complex information into clear strategic recommendations with implementation timelines.""",
model="gpt-4o"
)
workflow = Workflow()
# Sequential workflow - Research to Analysis to Report Pipeline
result = await workflow.run_sequential(
agents=[market_researcher, data_scientist, business_writer],
task="Analyze the electric vehicle charging infrastructure market opportunity in North America for 2024-2027"
)
print("Sequential Pipeline Result:", result.result)
# Parallel workflow - Multiple expert perspectives on same challenge
result = await workflow.run_parallel(
agents=[financial_analyst, strategy_consultant, data_scientist],
task="Evaluate the investment potential and strategic implications of generative AI adoption in enterprise software companies"
)
print("Expert Panel Analysis:", result.result)
# Graph workflow - Complex dependency analysis
dependencies = "MarketResearcher->DataScientist, MarketResearcher->FinancialAnalyst, DataScientist&FinancialAnalyst->StrategyConsultant, StrategyConsultant->BusinessWriter"
result = await workflow.run_graph(
agents=[market_researcher, data_scientist, financial_analyst, strategy_consultant, business_writer],
dependencies=dependencies,
task="Develop a comprehensive go-to-market strategy for a B2B SaaS startup entering the healthcare analytics space"
)
print("Strategic Analysis Result:", result.result)
# Hybrid workflow - Multi-stage comprehensive business analysis
quality_reviewer = Agent(
name="QualityReviewer",
system_prompt="""You are a senior partner-level consultant specializing in quality assurance and risk management.
Your responsibilities:
- Strategic recommendation validation
- Risk identification and mitigation
- Stakeholder impact assessment
- Implementation feasibility review
- Compliance and regulatory considerations
Ensure all recommendations are practical, well-researched, and aligned with business objectives.""",
model="gpt-4o"
)
stages = [
{
"pattern": "sequential",
"agents": [market_researcher, financial_analyst],
"task": "Conduct market and financial analysis for: {original_task}",
"name": "market_financial_analysis"
},
{
"pattern": "parallel",
"agents": [data_scientist, strategy_consultant],
"task": "Analyze strategic implications and develop data-driven insights based on: {previous_result}",
"name": "strategic_data_analysis"
},
{
"pattern": "graph",
"agents": [business_writer, quality_reviewer, strategy_consultant],
"dependencies": "BusinessWriter->QualityReviewer, StrategyConsultant->QualityReviewer",
"task": "Create final strategic report with quality validation from: {previous_result}",
"name": "report_synthesis_validation"
}
]
result = await workflow.run_hybrid(
task="Develop a digital transformation strategy for a traditional manufacturing company looking to implement IoT and predictive maintenance solutions",
stages=stages
)
print("Comprehensive Strategy Report:", result["final_result"])
asyncio.run(workflow_example())
DSL Syntax
Use intuitive arrow notation for complex dependencies:
# Simple chain
"A->B->C"
# Parallel branches
"A->B, A->C"
# Fan-in pattern
"A->C, B->C"
# Complex dependencies
"A->B, A->C, B&C->D"
For detailed workflow patterns and examples, see Multi-Agent Workflows and Workflow DSL.
๐ Documentation
Core Documentation
- Configuration Reference - Complete YAML configuration guide and examples
- API Reference - Complete API documentation and parameter reference
- Multi-Agent Workflows - Workflow patterns and orchestration examples
- Workflow DSL - Domain-specific language for defining agent dependencies
Examples
- examples/ - Complete usage examples and demos
- config/ - Configuration file templates
- toolkit/ - Custom tool development examples
Architecture Overview
xAgent is built with a modular architecture:
- Core (
xagent/core/) - Agent and session management - Interfaces (
xagent/interfaces/) - CLI, HTTP server, and web interface - Components (
xagent/components/) - Message storage and persistence - Tools (
xagent/tools/) - Built-in and custom tool ecosystem - Multi (
xagent/multi/) - Multi-agent coordination and workflows
Key Features
- ๏ฟฝ Easy to Use - Simple API for quick prototyping
- โก High Performance - Async/await throughout, concurrent tool execution
- ๐ง Extensible - Custom tools, MCP integration, plugin system
- ๐ Multiple Interfaces - CLI, HTTP API, web interface
- ๐พ Persistent - Redis-backed conversation storage
- ๐ค Multi-Agent - Hierarchical agent systems and workflows
- ๐ Observable - Built-in logging and monitoring
**Key Methods:**
- `async chat(user_message, user_id, session_id, **kwargs) -> str | BaseModel`: Main chat interface
- `async __call__(user_message, user_id, session_id, **kwargs) -> str | BaseModel`: Shorthand for chat
- `as_tool(name, description) -> Callable`: Convert agent to tool
**Chat Method Parameters:**
- `user_message`: The user's message (string or Message object)
- `user_id`: User identifier for message storage (default: "default_user")
- `session_id`: Session identifier for message storage (default: "default_session")
- `history_count`: Number of previous messages to include (default: 16)
- `max_iter`: Maximum model call attempts (default: 10)
- `image_source`: Optional image for analysis (URL, path, or base64)
- `output_type`: Pydantic model for structured output
- `stream`: Enable streaming response (default: False)
**Agent Parameters:**
- `name`: Agent identifier (default: "default_agent")
- `system_prompt`: Instructions for the agent behavior
- `model`: OpenAI model to use (default: "gpt-4.1-mini")
- `client`: Custom AsyncOpenAI client instance
- `tools`: List of function tools
- `mcp_servers`: MCP server URLs for dynamic tool loading
- `sub_agents`: List of sub-agent configurations (name, description, server URL)
## ๐ค Contributing
We welcome contributions! Here's how to get started:
### Development Workflow
1. **Fork** the repository
2. **Create** a feature branch: `git checkout -b feature/amazing-feature`
3. **Commit** your changes: `git commit -m 'Add amazing feature'`
4. **Push** to the branch: `git push origin feature/amazing-feature`
5. **Open** a Pull Request
### Development Guidelines
- Follow PEP 8 standards for code style
- Add tests for new features
- Update documentation as needed
- Use type hints throughout
- Follow conventional commit messages
## ๐ License
This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.
## ๐ Acknowledgments
Special thanks to these amazing open source projects:
- **[OpenAI](https://openai.com/)** - GPT models powering our AI
- **[FastAPI](https://fastapi.tiangolo.com/)** - Robust async API framework
- **[Streamlit](https://streamlit.io/)** - Intuitive web interface
- **[Redis](https://redis.io/)** - High-performance data storage
## ๐ Support & Community
| Resource | Purpose |
|----------|---------|
| **[GitHub Issues](https://github.com/ZJCODE/xAgent/issues)** | Bug reports & feature requests |
| **[GitHub Discussions](https://github.com/ZJCODE/xAgent/discussions)** | Community chat & Q&A |
| **Email: zhangjun310@live.com** | Direct support |
---
<div align="center">
**xAgent** - Empowering conversations with AI ๐
[](https://github.com/ZJCODE/xAgent)
[](https://github.com/ZJCODE/xAgent)
*Built with โค๏ธ for the AI community*
</div>
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file myxagent-0.2.9.tar.gz.
File metadata
- Download URL: myxagent-0.2.9.tar.gz
- Upload date:
- Size: 69.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ffe54a8ad147d13855ff520517a20830a7d3a986bc3aa3b6176e90576a987292
|
|
| MD5 |
a481c9a5221eadab688f09ca9a66dd7b
|
|
| BLAKE2b-256 |
0c4bba76f03cf90fe53b07dc05152911653646c598c46b2ff93ecf397a8400bd
|
File details
Details for the file myxagent-0.2.9-py3-none-any.whl.
File metadata
- Download URL: myxagent-0.2.9-py3-none-any.whl
- Upload date:
- Size: 75.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
db4c1df238532e760e46870dcaccedede90593ed26cc9070c4d91b75cec87039
|
|
| MD5 |
625aa6e0914e29c232d22f8f25b23a18
|
|
| BLAKE2b-256 |
b408adcb00214b5f85a1acf3818187fdcb9d073024d1d54aea2df5252ea84a38
|