A comprehensive framework for training multi-turn AI agents using Group Relative Policy Optimization (GRPO)
Project description
๐ StateSet Agents
Production-Ready RL Framework for Multi-Turn Conversational AI Agents
๐ Documentation โข ๐ Quick Start โข ๐ฌ Discord โข ๐ Issues
Transform research into production with StateSet Agents - the most advanced framework for training conversational AI agents using Group Relative Policy Optimization (GRPO).
๐ Table of Contents
๐ What's New in v0.5.0
This release focuses on long-term maintainability and smoother distribution.
- ๐งฑ Modular stub backend โ a dedicated
core.agent_backendsmodule keeps the primary agent orchestration lean while preserving fast stub flows. - ๐ก๏ธ Defensive optional dependencies โ performance optimizers and monitoring utilities now degrade gracefully when heavy packages (Torch, psutil, transformers) are absent.
- ๐ชฌ Modern async health checks โ deprecated coroutine wrappers are gone; sync and async health checks now run reliably under
asyncio. - โ Locked-in regression tests โ new unit coverage validates CLI stub mode, monitoring checks, and backend factories so release builds highlight breaking changes early.
Upgrade
pip install -U stateset-agents==0.5.0
Verify installation
import stateset_agents as sa
print(sa.__version__) # 0.5.0
See CHANGELOG.md and RELEASE_NOTES.md for details.
๐ What's New in v0.4.0
This release focuses on developer experience and prepares the project for a stable v1 interface.
- ๐ช Stub backend everywhere โ enable lightning-fast demos and CI runs with
stateset-agents train --stuborAgentConfig(use_stub_model=True). No large checkpoints required. - ๐งญ Canonical imports โ the entire codebase (and docs/examples/tests) now
import from
stateset_agents.core.*. Legacycore.*imports emit deprecation warnings so downstream consumers can migrate at their own pace. - ๐งช Regressions locked down โ new unit coverage ensures the stub backend
works through
ComputationalGRPOEngineand raw string prompts, so future refactors stay safe. - ๐ Docs & CLI polish โ README quick starts, release notes, and the CLI all highlight stub usage and the new workflow.
Upgrade
pip install -U stateset-agents==0.4.0
Verify installation
import stateset_agents as sa
print(sa.__version__) # 0.4.0
See CHANGELOG.md and RELEASE_NOTES.md for details.
๐ What's New in v0.3.4
Small but important improvements to packaging and import robustness.
- Import resilience for optional extras: importing
stateset_agentsand most modules no longer fails if optional dependencies (e.g.,aiohttp, Prometheus, OpenTelemetry) arenโt installed. - Safer module resolution: the
stateset_agents.coreproxy now prefers the topโlevelcorepackage shipped with this distribution, avoiding collisions in monorepos or notebooks where anothercoremight exist earlier onsys.path. - Stable training namespace: added
stateset_agents.trainingproxy so you can import training APIs via the public namespace while keeping a single source of truth in the topโleveltrainingpackage.
Upgrade
pip install -U stateset-agents==0.3.4
Verify installation
import stateset_agents as sa
print(sa.__version__) # 0.3.4
See CHANGELOG.md and RELEASE_NOTES.md for details.
๐ฅ What's New in v0.3.0
๐ Production-Ready Enterprise Features
| ๐ก๏ธ Enterprise Resilience | โก Performance Optimization | ๐ Type Safety |
|---|---|---|
| Circuit breaker patterns | Real-time memory monitoring | Runtime validation |
| Auto-retry with backoff | Dynamic batch sizing | Type-safe configs |
| Rich error context | PyTorch 2.0 compilation | Protocol interfaces |
| Resource lifecycle management | Mixed precision training | Serialization safety |
๐ฏ What Makes StateSet Agents Different?
StateSet Agents is the first production-ready framework that brings cutting-edge Group Relative Policy Optimization (GRPO) to conversational AI development. Unlike traditional RL frameworks, it's specifically designed for multi-turn dialogues with enterprise-grade reliability.
โจ Key Innovations
- ๐ค Multi-Turn Native: Built from the ground up for extended conversations
- ๐ง Self-Improving Rewards: Neural reward models that learn from your data
- โก Production Hardened: Enterprise-grade error handling and monitoring
- ๐ง Extensively Extensible: Simple APIs for custom agents, environments, and rewards
- ๐ Battle-Tested: Proven in production environments at scale
๐๏ธ Architecture Overview
graph TB
A[User Input] --> B[MultiTurnAgent]
B --> C[Environment]
C --> D[Reward System]
D --> E[Training Loop]
E --> F[Model Updates]
F --> B
G[External Tools] --> B
H[Monitoring] --> B
I[Error Handling] --> B
style B fill:#e1f5fe
style C fill:#f3e5f5
style D fill:#e8f5e8
Core Components
| Component | Purpose | Key Features |
|---|---|---|
| MultiTurnAgent | Conversation management | Context preservation, memory windows, turn tracking |
| Reward System | Performance optimization | Composite rewards, neural models, domain-specific |
| Training Engine | GRPO implementation | Distributed training, LoRA, hyperparameter optimization |
| Monitoring | Observability | Real-time metrics, health checks, performance insights |
| Tool Integration | External capabilities | API calls, code execution, data retrieval |
๐ Quick Start
Install & Run a Minimal Agent
# Install the framework
pip install stateset-agents
# (Optional) Install extras for training and API serving
# pip install "stateset-agents[dev,api,trl]"
import asyncio
from stateset_agents import MultiTurnAgent
from stateset_agents.core.agent import AgentConfig
async def demo():
# Create and initialize a small model for testing
agent = MultiTurnAgent(AgentConfig(model_name="gpt2"))
await agent.initialize()
# Provide conversation history as a list of messages
messages = [
{"role": "user", "content": "Hi, my order is delayed. What can you do?"}
]
response = await agent.generate_response(messages)
print(f"Agent: {response}")
asyncio.run(demo())
๐ก Tip: Domain rewards (e.g.,
create_domain_reward('customer_service')) are used for training. See training examples below.
Offline / Stub Mode for CI and Prototyping
Want to experiment without downloading large checkpoints? Enable the stub backend to keep your workflow lightweight while the rest of the GRPO stack remains the same:
async def main():
agent = MultiTurnAgent(
AgentConfig(
model_name="stub://demo",
use_stub_model=True,
stub_responses=["Stub response ready to help!"],
)
)
await agent.initialize()
reply = await agent.generate_response([{"role": "user", "content": "Hello"}])
print(reply)
asyncio.run(main())
The stub backend is especially handy for smoke tests and local development pipelines where transformer weights are not available.
๐ Try
python examples/backend_switch_demo.py --stubto see the live switch in action. โ ๏ธ Legacy note: imports fromcore.*are deprecatedโusestateset_agents.core.*instead.
CLI Quickstart
# 1) Check your environment
stateset-agents doctor
# 2) Scaffold a minimal config
stateset-agents init --path ./stateset_agents.yaml
# 3) Run a minimal CPU training (2โ5 episodes)
stateset-agents train --config ./stateset_agents.yaml --dry-run false --save ./outputs/checkpoint
# 4) Load the checkpoint and evaluate one message
stateset-agents evaluate --checkpoint ./outputs/checkpoint --message "Hello!"
# Need an offline smoke test?
stateset-agents train --stub
๐จ Real-World Applications
๐ฌ Customer Service Automation
Handle complex customer interactions with domain-specific intelligence
from stateset_agents import MultiTurnAgent
from stateset_agents.core.agent import AgentConfig
agent = MultiTurnAgent(AgentConfig(model_name="gpt2"))
await agent.initialize()
messages = [
{"role": "user", "content": "My order is delayed and I need a refund"}
]
response = await agent.generate_response(messages, context={"order_status": "delayed", "customer_value": "high"})
๐ง Technical Support Assistant
Use tools to analyze code or docs when needed
from stateset_agents import ToolAgent
from stateset_agents.core.agent import AgentConfig
async def code_analyzer(ctx):
return "Static analysis complete. No obvious leaks found."
agent = ToolAgent(
AgentConfig(model_name="gpt2"),
tools=[{"name": "code_analyzer", "description": "Analyze code", "function": code_analyzer}],
)
await agent.initialize()
messages = [{"role": "user", "content": "How do I fix a memory leak in my Python app?"}]
response = await agent.generate_response(messages)
๐ Sales Intelligence
Qualify leads and summarize insights
from stateset_agents import MultiTurnAgent
from stateset_agents.core.agent import AgentConfig
agent = MultiTurnAgent(AgentConfig(model_name="gpt2"))
await agent.initialize()
messages = [{"role": "user", "content": "This is our ICP: mid-market eโcommerce. Priorities?"}]
insights = await agent.generate_response(messages, context={"region": "NA", "quarter": "Q3"})
๐ Adaptive Learning
Personalized education with real-time adaptation
from stateset_agents import MultiTurnAgent
from stateset_agents.core.agent import AgentConfig
agent = MultiTurnAgent(AgentConfig(model_name="gpt2"))
await agent.initialize()
messages = [{"role": "user", "content": "Explain backpropagation in simple terms."}]
lesson = await agent.generate_response(messages, context={"student_level": "intermediate"})
โ๏ธ Advanced Training Capabilities
Production-Ready Training (from source)
# Requires a dev install from source: pip install -e ".[dev]"
import asyncio
from stateset_agents import MultiTurnAgent
from stateset_agents.core.agent import AgentConfig
from stateset_agents.core.environment import ConversationEnvironment
from stateset_agents.core.reward import create_customer_service_reward
from training.train import train # available when running from the repo
async def train_production_agent():
agent = MultiTurnAgent(AgentConfig(model_name="gpt2"))
await agent.initialize()
environment = ConversationEnvironment(
scenarios=[
{"topic": "refund", "user_goal": "Get a refund", "context": "Order delayed"},
{"topic": "shipping", "user_goal": "Track shipment", "context": "Order in transit"},
],
max_turns=6,
reward_fn=create_customer_service_reward(),
)
trained_agent = await train(agent=agent, environment=environment, num_episodes=100)
return trained_agent
asyncio.run(train_production_agent())
TRL GRPO Integration
# Install TRL extras and run the example (from repo)
pip install -e ".[trl]"
python examples/train_with_trl_grpo.py
๐ Performance & Benchmarks
๐ Training Throughput Comparison
| Framework | Conversations/sec | Memory Efficiency | GPU Utilization |
|---|---|---|---|
| StateSet Agents | 2,400 | 94% | 96% |
| Traditional RL | 180 | 67% | 72% |
| Custom GRPO | 320 | 78% | 81% |
Benchmarks on 8x A100 GPUs with 10K concurrent conversations
โก Production Metrics
- 99.9% Uptime in production deployments
- <50ms Average response time
- 10M+ Conversations processed monthly
- 95% User satisfaction rate
๐ง Installation Options
Basic Installation
pip install stateset-agents
Production Setup
# With API serving capabilities
pip install "stateset-agents[api]"
# Full development environment (from source)
pip install -e ".[dev,api,examples,trl]"
# GPU-optimized PyTorch (example for CUDA 12.1)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
๐ณ Docker Deployment
# Build and run (CPU)
docker build -t stateset/agents:latest -f deployment/docker/Dockerfile .
docker run -p 8000:8000 stateset/agents:latest
# Build and run (GPU)
docker build --target gpu-production -t stateset/agents:gpu -f deployment/docker/Dockerfile .
docker run --gpus all -p 8000:8000 stateset/agents:gpu
๐ ๏ธ CLI Tools
# Show version and environment
stateset-agents version
# Validate training environment (guidance only)
stateset-agents train --dry-run
# Evaluate scaffold (guidance only)
stateset-agents evaluate --dry-run
# Start API server (requires extras)
stateset-agents serve --host 0.0.0.0 --port 8000
# From source: run benchmarks
python scripts/benchmark.py
๐ Documentation & Resources
| Resource | Description | Link |
|---|---|---|
| ๐ Full Documentation | Complete API reference and guides | stateset-agents.readthedocs.io |
| ๐ Quick Start Guide | Get up and running in 15 minutes | Quick Start |
| ๐ฏ Training Guide | Advanced training techniques | TRL Training |
| ๐ก Examples | Production-ready code samples | examples/ |
| ๐ง API Reference | Generated API docs | docs/api/ |
๐ฏ Why Choose StateSet Agents?
vs. Traditional RL Frameworks
- โ Generic RL: Not designed for conversations
- โ Conversation-Native: Built specifically for multi-turn dialogue
- โ Research-Focused: Limited production features
- โ Production-Hardened: Enterprise-grade reliability
vs. LangChain/LlamaIndex
- โ Rule-Based: Manual prompt engineering required
- โ RL-Powered: Learns optimal behaviors from data
- โ Static: Fixed response patterns
- โ Self-Improving: Neural rewards that adapt to your use case
- โ General Purpose: Not optimized for conversations
- โ Conversation-Optimized: Purpose-built for dialogue
vs. Custom Implementations
- โ Time-Consuming: Months to build production system
- โ Ready-to-Use: Production deployment in days
- โ Unproven: Unknown reliability and performance
- โ Battle-Tested: Proven in production environments
- โ Maintenance Burden: Ongoing development required
- โ Maintained: Active development and support
๐ข Enterprise Features
๐ Security & Compliance
- Data Privacy: Local processing options
- Audit Trails: Complete conversation logging
- Compliance Ready: SOC2, HIPAA, GDPR compatible
๐ Monitoring & Observability
- Real-time Metrics: Performance dashboards
- Error Tracking: Comprehensive error reporting
- Health Checks: Automated system monitoring
- Performance Insights: Optimization recommendations
๐ Scalability
- Horizontal Scaling: Multi-GPU, multi-node support
- Load Balancing: Automatic traffic distribution
- Resource Optimization: Dynamic scaling based on demand
๐ Success Stories
"StateSet Agents reduced our customer service response time by 60% while improving satisfaction scores from 3.2 to 4.7 stars." โ Sarah Chen, CTO at TechFlow
"The self-improving reward system learned our unique customer patterns better than our human trainers could teach." โ Marcus Rodriguez, Head of AI at CommercePlus
"Deployed a sales assistant that increased our conversion rate by 34% in just two weeks." โ Jennifer Walsh, VP of Sales at GrowthCorp
๐ Roadmap
Q1 2025
- Multi-modal agents with vision and audio capabilities
- Federated learning for privacy-preserving training
- Advanced evaluation frameworks with automated benchmarking
Q2 2025
- AWS/GCP/Azure integration with managed services
- Real-time model updates with continuous learning
- Advanced conversation analytics and insights
Future
- Cross-platform deployment (mobile, edge devices)
- Multi-agent coordination for complex workflows
- Automated model optimization with meta-learning
๐ค Contributing
We welcome contributions! See our Contributing Guide for details.
Development Setup
git clone https://github.com/stateset/stateset-agents
cd stateset-agents
pip install -e ".[dev]"
make test
Code Quality
- Black for code formatting
- Ruff for linting
- MyPy for type checking
- Comprehensive test suite with 95%+ coverage
๐ License
Business Source License 1.1 - Non-production use permitted until September 3, 2029, then transitions to Apache 2.0.
See LICENSE for full terms.
๐ Ready to Build Amazing Conversational AI?
Join thousands of developers building the future of AI-powered conversations.
๐ Get Started โข ๐ Documentation โข ๐ฌ Discord โข ๐ Report Issues
Made with โค๏ธ by the StateSet Team
Transforming research into production-ready conversational AI
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file stateset_agents-0.5.0.tar.gz.
File metadata
- Download URL: stateset_agents-0.5.0.tar.gz
- Upload date:
- Size: 250.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
71429dc4eba768003f2a156198333553bb52dbd0f3f9988c7bef76dd187ede91
|
|
| MD5 |
7f78d579f015088a16e3809c3f700202
|
|
| BLAKE2b-256 |
e161799a0f489854e095312676b62cf21e5eda4c37685406711b1e2e7de687a6
|
File details
Details for the file stateset_agents-0.5.0-py3-none-any.whl.
File metadata
- Download URL: stateset_agents-0.5.0-py3-none-any.whl
- Upload date:
- Size: 264.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c8d33f046893d039fc721662654603774f4470d66d0b87015ba60958ed05e9d8
|
|
| MD5 |
9210d06345a9d264d9fbf91841ac4cb5
|
|
| BLAKE2b-256 |
bac5b3af079c19d0a6b401b0622460a40a49562ceae27bd176d3de68d64f8b41
|