xAgent - Multi-Modal AI Agent System
A powerful multi-modal AI Agent system with modern architecture
xAgent provides a complete AI assistant experience with text and image processing capabilities, intelligent vocabulary management, and high-performance concurrent tool execution. Built on FastAPI, Streamlit, and Redis for production-ready scalability.
Table of Contents
- Key Features
- Architecture
- Quick Start
- Usage Examples
- HTTP Agent Server
- Development Guide
- API Reference
- Monitoring & Observability
- Contributing
- License
Key Features
Core AI Capabilities
- Multi-Modal Conversations: Engage in rich conversations with support for both text (via models like GPT-4o) and image inputs.
- Persistent Sessions: Leverages Redis to maintain conversation history, ensuring seamless and stateful interactions across sessions.
- Extensible Tool System: Easily integrate custom synchronous or asynchronous functions as tools. The system automatically handles sync-to-async conversion for non-blocking execution.
- Concurrent Tool Execution: Capable of running multiple tools in parallel, significantly improving response times for complex queries.
- Structured Outputs: Define response structure using Pydantic models to get reliable, typed data from the agent.
- Agent as a Tool: A powerful pattern where specialized agents can be converted into tools, allowing a coordinator agent to delegate complex tasks.
- MCP Integration: Dynamically loads and refreshes tools from external sources using the Model Context Protocol (MCP).
Developer-Focused Design
- Modern Async Architecture: Built from the ground up with asyncio for high-performance, non-blocking operations.
- Standalone HTTP Server: Expose agent functionality via a REST API, complete with streaming support for real-time responses. See the HTTP Agent Server section for details.
- Modular and Pluggable: The clear separation of components like Agent, Session, and MessageDB makes the system easy to extend and maintain.
- Ready-to-Use Frontend: Includes a Streamlit-based chat application for immediate interaction and testing.
- Observability: Integrated with Langfuse for detailed tracing and monitoring of agent interactions.
Architecture
Modern Design for High Performance
```
xAgent/
├── xagent/                   # Core async agent framework
│   ├── core/                 # Agent and session management
│   │   ├── agent.py          # Main Agent class with chat
│   │   ├── session.py        # Session management with operations
│   │   └── server.py         # Standalone HTTP Agent Server
│   ├── db/                   # Database layer (Redis)
│   │   └── message.py        # Message persistence
│   ├── schemas/              # Data models and types (Pydantic)
│   │   └── message.py        # Message and ToolCall models
│   ├── tools/                # Tool ecosystem
│   │   ├── __init__.py       # Tool registry (web_search, draw_image)
│   │   ├── openai_tool.py    # OpenAI tool integrations
│   │   └── mcp_demo/         # MCP demo server and client
│   └── utils/                # Utility functions
│       ├── tool_decorator.py # Tool decorators
│       ├── mcp_convertor.py  # MCP client
│       └── image_upload.py   # AWS S3 image upload utility
├── toolkit/                  # Custom tool ecosystem
│   ├── __init__.py           # Toolkit registry
│   ├── tools.py              # Custom tools (char_count)
│   ├── mcp_server.py         # Main MCP server
│   └── vocabulary/           # Vocabulary learning system
├── config/                   # Configuration files
│   └── agent.yaml            # Agent server configuration
├── frontend/                 # Streamlit web interface
│   └── chat_app.py           # Main chat application
├── examples/                 # Usage examples and demos
└── tests/                    # Comprehensive test suite
```
Core Components
| Component | Purpose | Technology |
|---|---|---|
| Agent | Core conversation handler | OpenAI API + AsyncIO |
| Session | Message history management | Redis + Operations |
| MessageDB | Scalable persistence layer | Redis with client |
| Tools | Extensible function ecosystem | Auto sync-to-async conversion |
| MCP | Dynamic tool loading protocol | HTTP client |
Quick Start
Prerequisites
| Requirement | Version | Purpose |
|---|---|---|
| Python | 3.12+ | Core runtime |
| Redis | 7.0+ | Message persistence |
| OpenAI API Key | - | AI model access |
Installation
Clone and Setup
```bash
git clone https://github.com/ZJCODE/xAgent.git
cd xAgent
pip install -r requirements.txt
```
Or install from PyPI:
```bash
pip install myxagent
```
Environment Configuration
```bash
# Copy and edit environment file
cp .env.example .env
```
Required variables:
```bash
OPENAI_API_KEY=your_openai_api_key
```
Optional variables:
```bash
REDIS_URL=your_redis_url_with_password
LANGFUSE_SECRET_KEY=your_langfuse_key
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_HOST=https://cloud.langfuse.com
AWS_ACCESS_KEY_ID=your_aws_access_key_id
AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key
AWS_REGION=us-east-1
BUCKET_NAME=your_bucket_name
```
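A small startup check along these lines can fail fast when the required key is missing and fall back to defaults for optional settings. This is a minimal sketch: `check_env` and its default values are illustrative, not part of xAgent's API.

```python
import os

def check_env() -> dict:
    """Collect xAgent-related environment variables, failing fast if the
    required OpenAI key is missing (illustrative helper, not part of xAgent)."""
    if not os.getenv("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is required")
    return {
        "openai_api_key": os.environ["OPENAI_API_KEY"],
        # Optional settings with fallbacks (defaults chosen for illustration)
        "redis_url": os.getenv("REDIS_URL", "redis://localhost:6379"),
        "langfuse_host": os.getenv("LANGFUSE_HOST", "https://cloud.langfuse.com"),
        "aws_region": os.getenv("AWS_REGION", "us-east-1"),
    }
```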
Running the Application
Quick Start (All Services)
```bash
chmod +x run.sh
./run.sh
```
Manual Start (Individual Services)
```bash
# Terminal 1: Standalone HTTP Agent Server
python xagent/core/server.py --config config/agent.yaml --toolkit toolkit

# Terminal 2: MCP Server
python toolkit/mcp_server.py

# Terminal 3: Frontend
streamlit run frontend/chat_app.py --server.port 8501
```
Access Points
| Service | URL | Description |
|---|---|---|
| Chat Interface | http://localhost:8501 | Main user interface |
| API Docs | http://localhost:8000/docs | Interactive API documentation |
| Health Check | http://localhost:8000/health | Service status monitoring |
| HTTP Agent Server | http://localhost:8010/chat | Standalone agent HTTP API |
Usage Examples
Basic Chat
```python
import asyncio

from xagent.core import Agent, Session
from xagent.tools import web_search

async def main():
    # Create agent with modern architecture
    agent = Agent(
        name="my_assistant",
        system_prompt="You are a helpful AI assistant.",
        model="gpt-4.1-mini",
        tools=[web_search],  # Add web search tool
        stream=False  # Set to True for streaming responses
    )

    # Create session for conversation management
    session = Session(session_id="session456")

    # Chat interaction
    response = await agent.chat("Hello, how are you?", session)
    print(response)

    # Continue conversation with context
    response = await agent.chat("What's the weather like in Hangzhou?", session)
    print(response)

    # Streaming response example
    response = await agent.chat("Hello, how are you?", session, stream=True)
    async for event in response:
        print(event)

asyncio.run(main())
```
Advanced Chat with Redis Persistence
```python
import asyncio

from xagent.core import Agent, Session
from xagent.db import MessageDB

async def chat_with_persistence():
    # Initialize Redis-backed message storage
    message_db = MessageDB()

    # Create agent
    agent = Agent(
        name="persistent_agent",
        model="gpt-4.1-mini",
        tools=[]
    )

    # Create session with Redis persistence
    session = Session(
        user_id="user123",
        session_id="persistent_session",
        message_db=message_db
    )

    # Chat with automatic message persistence
    response = await agent.chat("Remember this: my favorite color is blue", session)
    print(response)

    # Later conversation - context is preserved in Redis
    response = await agent.chat("What's my favorite color?", session)
    print(response)

asyncio.run(chat_with_persistence())
```
Custom Tools (Sync and Async)
```python
import asyncio
import time

import httpx
from xagent.utils.tool_decorator import function_tool
from xagent.core import Agent, Session

# Sync tools - automatically converted to async
@function_tool()
def calculate_square(n: int) -> int:
    """Calculate square of a number (CPU-intensive)."""
    time.sleep(0.1)  # Simulate CPU work
    return n * n

@function_tool()
def format_text(text: str, style: str) -> str:
    """Format text with various styles."""
    if style == "upper":
        return text.upper()
    elif style == "title":
        return text.title()
    return text

# Async tools - used directly for I/O operations
@function_tool()
async def fetch_weather(city: str) -> str:
    """Fetch weather data from API."""
    async with httpx.AsyncClient() as client:
        # Simulate weather API call
        await asyncio.sleep(0.5)
        return f"Weather in {city}: 22°C, Sunny"

async def main():
    # Mix of sync and async tools
    agent = Agent(
        tools=[calculate_square, format_text, fetch_weather],
        model="gpt-4.1-mini"
    )
    session = Session(user_id="user123")

    # Agent handles all tools automatically - sync tools run in thread pool
    response = await agent.chat(
        "Calculate the square of 15, format 'hello world' in title case, and get weather for Tokyo",
        session
    )
    print(response)

asyncio.run(main())
```
MCP Protocol Integration
```python
import asyncio

from xagent.core import Agent, Session

async def mcp_integration_example():
    # Create agent with MCP tools
    agent = Agent(
        tools=[],
        mcp_servers=["http://localhost:8001/mcp/"],  # Auto-refresh MCP tools
        model="gpt-4.1-mini"
    )
    session = Session(user_id="user123")

    # Use MCP tools automatically
    response = await agent.chat("Use the available MCP tools to help me", session)
    print(response)

asyncio.run(mcp_integration_example())
```
Structured Output with Pydantic
```python
import asyncio

from pydantic import BaseModel
from xagent.core import Agent, Session
from xagent.tools import web_search

class WeatherReport(BaseModel):
    location: str
    temperature: int
    condition: str
    humidity: int

class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

async def get_structured_response():
    agent = Agent(model="gpt-4.1-mini", tools=[web_search])
    session = Session(user_id="user123")

    # Request structured output for weather
    weather_data = await agent.chat(
        "what's the weather like in Hangzhou?",
        session,
        output_type=WeatherReport
    )
    print(f"Location: {weather_data.location}")
    print(f"Temperature: {weather_data.temperature}°F")
    print(f"Condition: {weather_data.condition}")
    print(f"Humidity: {weather_data.humidity}%")

    # Request structured output for mathematical reasoning
    reply = await agent.chat(
        "how can I solve 8x + 7 = -23",
        session,
        output_type=MathReasoning
    )
    for index, step in enumerate(reply.steps):
        print(f"Step {index + 1}: {step.explanation} => Output: {step.output}")
    print("Final Answer:", reply.final_answer)

asyncio.run(get_structured_response())
```
Agent as Tool Pattern
```python
import asyncio

from xagent.core import Agent, Session
from xagent.db import MessageDB
from xagent.tools import web_search

async def agent_as_tool_example():
    # Create specialized agents
    researcher_agent = Agent(
        name="research_specialist",
        system_prompt="Research expert. Gather information and provide insights.",
        model="gpt-4.1-mini",
        tools=[web_search]
    )
    writing_agent = Agent(
        name="writing_specialist",
        system_prompt="Professional writer. Create engaging content.",
        model="gpt-4.1-mini"
    )

    # Convert agents to tools
    message_db = MessageDB()
    research_tool = researcher_agent.as_tool(
        name="researcher",
        description="Research topics and provide detailed analysis",
        message_db=message_db
    )
    writing_tool = writing_agent.as_tool(
        name="content_writer",
        description="Write and edit content",
        message_db=message_db
    )

    # Main coordinator agent with specialist tools
    coordinator = Agent(
        name="coordinator",
        tools=[research_tool, writing_tool],
        system_prompt="Coordination agent that delegates to specialists.",
        model="gpt-4.1"
    )
    session = Session(user_id="user123")

    # Complex multi-step task
    response = await coordinator.chat(
        "Research renewable energy benefits and write a brief summary",
        session
    )
    print(response)

asyncio.run(agent_as_tool_example())
```
HTTP Agent Server
xAgent provides a standalone HTTP server that exposes the Agent functionality through REST API endpoints. This allows integration with other systems and services through simple HTTP calls.
Starting the HTTP Server
```bash
# Start with default config
python xagent/core/server.py --config config/agent.yaml --toolkit toolkit

# Server will start on http://localhost:8010 by default
```
After installing the package, you can use the xagent-server command:
```bash
# Start the server using the installed command
xagent-server --config /path/to/your/config.yaml --toolkit /path/to/your/toolkit
```
Programmatic Usage
You can also start the HTTP Agent Server directly from Python:
```python
from xagent.core.server import HTTPAgentServer

# Create the HTTP Agent Server
server = HTTPAgentServer(config_path="config/agent.yaml", toolkit_path="toolkit")

# Run the server
server.run(host="0.0.0.0", port=8010)
```
Configuration
The HTTP server is configured through a YAML file (e.g., config/agent.yaml):
```yaml
agent:
  name: "Agent"
  system_prompt: |
    You are a helpful assistant. Your task is to assist users with their queries and tasks.
  model: "gpt-4.1-mini"
  mcp_servers:
    - "http://localhost:8001/mcp/"
  tools:
    - "web_search"  # built-in web search tool
    - "draw_image"  # built-in image drawing tool
    - "char_count"  # custom tool for counting characters
  use_local_session: false
server:
  host: "0.0.0.0"
  port: 8010
```
API Endpoints
POST /chat
Main chat endpoint for interacting with the AI agent.
Request Body:
```json
{
  "user_id": "string",
  "session_id": "string",
  "user_message": "string",
  "image_source": "string",
  "stream": false
}
```
- image_source: Image URL or base64-encoded image (optional)
- stream: Set to true to enable a streaming response (optional, defaults to false)
Standard Response (stream: false):
```json
{
  "reply": "string"
}
```
Streaming Response (stream: true):
The server streams Server-Sent Events (SSE). Each event is a JSON object.
- Data event: data: {"delta": "some text"}
- Completion event: data: [DONE]
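On the client side, each streamed line can be decoded as follows. This is a minimal sketch assuming the event format above; parse_sse_line is an illustrative helper, not part of xAgent's API.

```python
import json
from typing import Optional

def parse_sse_line(line: str) -> Optional[str]:
    """Extract the text delta from one SSE line, or return None when the
    stream signals completion ([DONE]) or the line carries no data."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload).get("delta")

# Reassemble a streamed reply from raw SSE lines
lines = ['data: {"delta": "Hel"}', 'data: {"delta": "lo"}', "data: [DONE]"]
reply = "".join(d for d in map(parse_sse_line, lines) if d is not None)
# reply == "Hello"
```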
Usage Examples
Basic Chat Request
```bash
curl -X POST "http://localhost:8010/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user123",
    "session_id": "session456",
    "user_message": "Hello, how are you?"
  }'
```
Streaming Chat Request
```bash
curl -X POST "http://localhost:8010/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user123",
    "session_id": "session456",
    "user_message": "Hello, how are you?",
    "stream": true
  }'
```
Chat with Image
```bash
curl -X POST "http://localhost:8010/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user123",
    "session_id": "session456",
    "user_message": "What do you see in this image?",
    "image_source": "https://example.com/image.jpg"
  }'
```
๐ง Development Guide
๐ ๏ธ Creating Tools
Both sync and async functions work seamlessly:
```python
import asyncio
import time

from xagent.utils.tool_decorator import function_tool

# ✅ Sync tool - perfect for CPU-bound operations
@function_tool()
def my_sync_tool(input_text: str) -> str:
    """Process text synchronously (runs in thread pool)."""
    time.sleep(0.1)  # Simulate CPU-intensive work
    return f"Sync processed: {input_text}"

# ✅ Async tool - ideal for I/O-bound operations
@function_tool()
async def my_async_tool(input_text: str) -> str:
    """Process text asynchronously."""
    await asyncio.sleep(0.1)  # Simulate async I/O operation
    return f"Async processed: {input_text}"
```
Tool Development Guidelines
| Use Case | Tool Type | Example |
|---|---|---|
| CPU-bound | Sync functions | Math calculations, data processing |
| I/O-bound | Async functions | API calls, database queries |
| Simple operations | Sync functions | String manipulation, file operations |
| Network requests | Async functions | HTTP requests, WebSocket connections |
⚠️ Note: Recursive functions are not supported as tools due to potential stack overflow issues in async environments.
Automatic Conversion
xAgent's @function_tool() decorator automatically handles sync-to-async conversion:
- Sync functions → run in a thread pool (non-blocking)
- Async functions → run directly on the event loop
- Concurrent execution → all tools execute in parallel when called
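The conversion pattern itself can be sketched in a few lines. This is a simplified illustration of the idea, not xAgent's actual decorator: async functions pass through untouched, while sync functions are offloaded to the default thread pool so the event loop never blocks.

```python
import asyncio
import functools
import inspect
import time

def to_async(func):
    """Wrap a callable so it can always be awaited (illustrative sketch)."""
    if inspect.iscoroutinefunction(func):
        return func  # async functions run directly on the event loop

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        # sync functions are offloaded to the default thread pool
        return await asyncio.to_thread(func, *args, **kwargs)
    return wrapper

@to_async
def slow_square(n: int) -> int:
    time.sleep(0.05)  # blocking work runs in a thread, not the event loop
    return n * n

async def main():
    # Both calls run concurrently because neither blocks the event loop
    return await asyncio.gather(slow_square(3), slow_square(4))

# asyncio.run(main())  ->  [9, 16]
```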
Override Defaults
You can override the default tool name and description using the function_tool decorator:
```python
@function_tool(name="custom_square", description="Calculate the square of a number")
def calculate_square(n: int) -> int:
    return n * n
```
API Reference
Core Classes
Agent
Main AI agent class for handling conversations and tool execution.
```python
Agent(
    name: Optional[str] = None,
    system_prompt: Optional[str] = None,
    model: Optional[str] = None,
    client: Optional[AsyncOpenAI] = None,
    tools: Optional[list] = None,
    mcp_servers: Optional[str | list] = None
)
```
Key Methods:
- async chat(user_message, session, **kwargs) -> str | BaseModel: Main chat interface
- async __call__(user_message, session, **kwargs) -> str | BaseModel: Shorthand for chat
- as_tool(name, description, message_db) -> Callable: Convert agent to tool
Parameters:
- name: Agent identifier (default: "default_agent")
- system_prompt: Instructions for the agent's behavior
- model: OpenAI model to use (default: "gpt-4.1-mini")
- client: Custom AsyncOpenAI client instance
- tools: List of function tools
- mcp_servers: MCP server URLs for dynamic tool loading
Session
Manages conversation history and persistence with operations.
```python
Session(
    user_id: str,
    session_id: Optional[str] = None,
    message_db: Optional[MessageDB] = None
)
```
Key Methods:
- async add_messages(messages: Message | List[Message]) -> None: Store messages
- async get_messages(count: int = 20) -> List[Message]: Retrieve message history
- async clear_session() -> None: Clear conversation history
- async pop_message() -> Optional[Message]: Remove last non-tool message
Features:
- Automatic fallback to in-memory storage if no MessageDB provided
- Redis-backed persistence for production use
- Thread-safe operations
- Efficient message batching
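To make the in-memory fallback concrete, here is a toy class with the same method shapes. It is purely illustrative: the real Session adds Redis persistence, thread safety, and batching, none of which this sketch implements.

```python
import asyncio
from typing import List, Optional

class InMemorySession:
    """Toy stand-in for Session's in-memory fallback (illustrative only)."""
    def __init__(self, user_id: str, session_id: Optional[str] = None):
        self.user_id = user_id
        self.session_id = session_id or "default"
        self._messages: List[dict] = []

    async def add_messages(self, messages) -> None:
        # Accept a single message or a batch, mirroring the documented API
        if isinstance(messages, dict):
            messages = [messages]
        self._messages.extend(messages)

    async def get_messages(self, count: int = 20) -> List[dict]:
        return self._messages[-count:]

    async def clear_session(self) -> None:
        self._messages.clear()

async def demo():
    s = InMemorySession(user_id="user123")
    await s.add_messages({"role": "user", "content": "hi"})
    await s.add_messages([{"role": "assistant", "content": "hello"}])
    return await s.get_messages()

# asyncio.run(demo()) returns both messages in order
```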
MessageDB
Redis-backed message persistence layer.
```python
# Initialize with environment variables or defaults
message_db = MessageDB()

# Usage with session
session = Session(
    user_id="user123",
    message_db=message_db
)
```
Important Considerations
| Aspect | Details |
|---|---|
| Tool functions | Can be sync or async (automatic conversion) |
| Agent interactions | Always use await |
| Context | Run inside an async context, e.g. via asyncio.run() |
| Concurrency | All tools execute in parallel automatically |
Monitoring & Observability
xAgent includes comprehensive observability features:
- Langfuse Integration - Track AI interactions and performance
- Structured Logging - Throughout the entire system
- Health Checks - API monitoring endpoints
- Performance Metrics - Tool execution time and success rates
Contributing
We welcome contributions! Here's how to get started:
Development Workflow
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
Development Guidelines
| Area | Requirements |
|---|---|
| Code Style | Follow PEP 8 standards |
| Testing | Add tests for new features |
| Documentation | Update docs as needed |
| Type Safety | Use type hints throughout |
| Commits | Follow conventional commit messages |
Package Upload
First-time upload:
```bash
pip install build twine
python -m build
twine upload dist/*
```
Subsequent uploads:
```bash
rm -rf dist/ build/ *.egg-info/
python -m build
twine upload dist/*
```
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Special thanks to the amazing open source projects that make xAgent possible:
- OpenAI - GPT models powering our AI
- FastAPI - Robust async API framework
- Streamlit - Intuitive web interface
- Redis - High-performance data storage
- Langfuse - Observability and monitoring
Support & Community
| Resource | Link | Purpose |
|---|---|---|
| Issues | GitHub Issues | Bug reports & feature requests |
| Discussions | GitHub Discussions | Community chat & Q&A |
| Email | zhangjun310@live.com | Direct support |