Claude Agent Framework

A production-ready multi-agent orchestration framework built on Claude Agent SDK. Design, compose, and deploy complex AI workflows with pre-built architecture patterns.

Chinese documentation | Best Practices Guide

Overview

Claude Agent Framework is a production-ready orchestration layer for building multi-agent AI systems on top of Claude Agent SDK. It targets complex tasks that need several specialized capabilities (research, analysis, code generation, decision-making) and that a single LLM prompt handles poorly. The framework decomposes such tasks into coordinated workflows in which a lead agent orchestrates specialized subagents, each with a focused prompt, constrained tool access, and an appropriate model. It provides seven pre-built patterns, observability through hook-based tracking, and a small API: init() creates a session and session.run() streams its messages.

Key Features:

7 Pre-built Patterns - Research, Pipeline, Critic-Actor, Specialist Pool, Debate, Reflexion, MapReduce
2-Line Quick Start - Initialize and run with minimal code
Production Plugin System - 9 lifecycle hooks for metrics, cost tracking, retry handling, and custom logic
Advanced Configuration - Pydantic validation, multi-source loading (YAML/env), environment profiles
Performance Tracking - Token usage, cost estimation, memory profiling, multi-format export (JSON/CSV/Prometheus)
Dynamic Agent Registry - Register and modify agents at runtime without code changes
Full Observability - Structured JSONL logging, interactive dashboards, session debugging tools
CLI Enhancement - Metrics viewing, session visualization, HTML report generation
Cost Control - Automatic model selection, budget limits, per-agent cost breakdown
Extensible Architecture - Register custom patterns with a simple decorator

from claude_agent_framework import init

# Run inside an async function; see Quick Start for a complete script.
session = init("research")
async for msg in session.run("Analyze AI market trends"):
    print(msg)

Design Philosophy

Why Multi-Agent?

Complex tasks often require multiple specialized capabilities that a single LLM prompt cannot effectively handle. Consider a research task: it needs web searching, data analysis, and report writing - each requiring different tools, prompts, and even models. A monolithic approach leads to:

Prompt bloat: One prompt trying to do everything becomes unwieldy
Tool overload: Agent has access to tools it shouldn't use at certain stages
Quality degradation: Jack-of-all-trades prompts underperform specialized ones
Cost inefficiency: Using expensive models for simple subtasks

Core Architecture

Claude Agent Framework solves this through agent specialization and orchestration:

User Request
      ↓
Lead Agent (Orchestrator)
      │
      ├── Analyzes task requirements
      ├── Decomposes into subtasks
      ├── Dispatches to specialized subagents
      ├── Coordinates execution flow
      └── Synthesizes final output
            ↓
      Subagents (Specialists)
      │
      ├── Focused prompts for specific tasks
      ├── Minimal tool access (least privilege)
      ├── Cost-effective models where appropriate
      └── Communicate via filesystem (loose coupling)

Design Principles

Principle	Rationale
Separation of Concerns	Lead orchestrates, subagents execute - clear responsibilities
Tool Constraints	Each agent gets only the tools it needs - security and focus
Loose Coupling	Filesystem-based data exchange - agents are independent
Observability	Hook mechanism captures all tool calls - debugging and audit
Cost Optimization	Match model capability to task complexity
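
The Tool Constraints principle can be illustrated with a small stand-alone sketch. AgentSpec and check_tool_call are hypothetical names used only for this illustration and are not part of the framework's API. WebSearch and Read are tool names that appear elsewhere in this README; Write is assumed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical agent description: a name plus the tools it may call."""
    name: str
    allowed_tools: frozenset = field(default_factory=frozenset)


def check_tool_call(agent: AgentSpec, tool: str) -> bool:
    """Return True only if the tool is on the agent's allow-list."""
    return tool in agent.allowed_tools


# Illustrative agents: each one gets only the tools its job needs.
researcher = AgentSpec("researcher", frozenset({"WebSearch", "Read"}))
writer = AgentSpec("report_writer", frozenset({"Read", "Write"}))

print(check_tool_call(researcher, "WebSearch"))  # allowed for the researcher
print(check_tool_call(researcher, "Write"))      # not granted to the researcher
```

Keeping the allow-list on the agent description means a dispatcher can reject a tool call before it runs, which is the least-privilege idea from the table.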

Orchestration Patterns

The framework provides 7 patterns for different workflow needs:

Pattern	Use Case	Flow
Research	Data gathering	Parallel workers → Aggregation
Pipeline	Sequential processing	Stage A → B → C → D
Critic-Actor	Quality iteration	Generate ↔ Evaluate loop
Specialist Pool	Expert routing	Router → Domain experts
Debate	Decision analysis	Pro ↔ Con → Judge
Reflexion	Complex problem solving	Execute → Reflect → Improve
MapReduce	Large-scale processing	Split → Map → Reduce

For implementation details, see Best Practices Guide.

Quick Start

pip install claude-agent-framework
export ANTHROPIC_API_KEY="your-api-key"

from claude_agent_framework import init
import asyncio

async def main():
    session = init("research")
    async for msg in session.run("Analyze AI market trends in 2024"):
        print(msg)

asyncio.run(main())

Available Architectures

Architecture	Use Case	Pattern
research	Deep research tasks	Master-worker with parallel data gathering
pipeline	Code review, content creation	Sequential stage processing
critic_actor	Quality improvement	Generate-evaluate iteration loop
specialist_pool	Technical support	Expert routing and dispatch
debate	Decision support	Pro-con deliberation with judge
reflexion	Complex problem solving	Execute-reflect-improve cycle
mapreduce	Large-scale analysis	Parallel map with aggregation

Production Examples

The framework includes 7 production examples, one per architecture. Each applies its pattern to a concrete business scenario, listed below.

Example Overview

Example	Architecture	Business Scenario	Core Design Pattern
01_competitive_intelligence	Research	SaaS competitive analysis	Parallel data gathering → Synthesis
02_pr_code_review	Pipeline	Automated PR review	Sequential stage gating with quality thresholds
03_marketing_content	Critic-Actor	Marketing copy optimization	Generate → Evaluate → Improve loop
04_it_support	Specialist Pool	IT support routing	Keyword-based expert dispatch with urgency categorization
05_tech_decision	Debate	Technical decision support	Multi-round deliberation with weighted criteria
06_code_debugger	Reflexion	Adaptive debugging	Execute → Reflect → Adapt strategy
07_codebase_analysis	MapReduce	Large codebase analysis	Intelligent chunking → Parallel map → Aggregate

Design Highlights

1. Competitive Intelligence (Research Architecture)

Pattern: Fan-out/Fan-in with parallel worker coordination

Key Design Decisions:

Parallel Dispatch: Multiple researchers analyze different competitors simultaneously
Multi-Channel Aggregation: Official websites, market reports, customer reviews → single unified view
SWOT Generation: Automated strengths/weaknesses/opportunities/threats analysis
Structured Output: JSON/Markdown/PDF reports with consistent formatting

Technical Highlights:

# Parallel researcher dispatch
Lead Agent → [Industry Researcher, Competitor Analyst 1, Competitor Analyst 2, ...] → Report Generator
# Each researcher works independently, results aggregated by lead

Use Case: When you need to quickly gather competitive intelligence across multiple targets with parallel data collection
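
The fan-out/fan-in shape can be sketched with plain asyncio. This is a minimal illustration and not the framework's implementation: research_competitor is a hypothetical stand-in for a researcher subagent, and the competitor names are made up.

```python
import asyncio


async def research_competitor(name: str) -> dict:
    """Stand-in for a researcher subagent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, simulating I/O
    return {"competitor": name, "findings": f"notes on {name}"}


async def gather_intelligence(competitors: list) -> dict:
    """Fan out one task per competitor, then fan in to a single report."""
    results = await asyncio.gather(*(research_competitor(c) for c in competitors))
    return {
        "count": len(results),
        "sections": {r["competitor"]: r["findings"] for r in results},
    }


report = asyncio.run(gather_intelligence(["Acme", "Globex"]))
print(report)
```

asyncio.gather returns results in the order the tasks were submitted, so the aggregated report is deterministic even though the workers run concurrently.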

2. PR Code Review (Pipeline Architecture)

Pattern: Sequential stage processing with quality gates

Key Design Decisions:

5-Stage Pipeline: Architecture → Code Quality → Security → Performance → Test Coverage
Configurable Thresholds: Max complexity (10), min coverage (80%), max file size (500 lines)
Failure Strategies: stop_on_critical (fail fast) vs continue_all (full audit)
Progressive Refinement: Each stage builds on previous stage's findings

Technical Highlights:

# Sequential execution with conditional gating
Stage 1 (Architecture) → [Pass] → Stage 2 (Quality) → [Warning] → Stage 3 (Security) → ...
                                                    ↓ [CRITICAL]
                                                  STOP (if stop_on_critical)

Use Case: When code changes must pass through multiple independent review checkpoints before approval
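
The two failure strategies can be compared in a short stand-alone sketch. The stage functions here are toy checks standing in for reviewer subagents, and run_pipeline is a hypothetical helper, not a framework function.

```python
from typing import Callable, List, Tuple

# Each stage is (name, check); a check returns "pass", "warning", or "critical".
Stage = Tuple[str, Callable[[str], str]]


def run_pipeline(diff: str, stages: List[Stage], stop_on_critical: bool = True) -> List[Tuple[str, str]]:
    """Run stages in order; optionally stop at the first critical finding."""
    findings = []
    for name, check in stages:
        severity = check(diff)
        findings.append((name, severity))
        if severity == "critical" and stop_on_critical:
            break  # fail fast: later stages are skipped
    return findings


# Toy checks for illustration only.
stages = [
    ("architecture", lambda d: "pass"),
    ("security", lambda d: "critical" if "eval(" in d else "pass"),
    ("test_coverage", lambda d: "warning"),
]

print(run_pipeline("x = eval(user_input)", stages, stop_on_critical=True))
print(run_pipeline("x = eval(user_input)", stages, stop_on_critical=False))
```

With stop_on_critical the run ends at the security stage; with continue_all behavior (stop_on_critical=False here) every stage reports, which suits a full audit.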

3. Marketing Content (Critic-Actor Architecture)

Pattern: Iterative refinement through generate-evaluate loops

Key Design Decisions:

Weighted Evaluation: SEO (25%), Engagement (30%), Brand (25%), Accuracy (20%)
Brand Voice Enforcement: Prohibited phrases detection, tone consistency checks
Quality Threshold: Stop when score ≥ 85% or max iterations reached
A/B Variant Generation: Generate multiple angles for the same message

Technical Highlights:

# Iterative improvement loop (pseudocode)
content = Actor.generate()
for iteration in range(max_iterations):
    scores = Critic.evaluate(content)  # Multi-dimensional weighted scoring
    if scores.overall >= threshold:
        break
    content = Actor.improve(content, scores.feedback)

Use Case: When content quality must meet strict brand and engagement standards through iterative refinement
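
The weighted evaluation can be worked through as plain arithmetic. This sketch assumes each dimension is scored on a 0-100 scale and that the 85% threshold applies to the weighted total; the example does not state either point explicitly. The draft scores are made up.

```python
# Weights from the list above; they sum to 1.0.
WEIGHTS = {"seo": 0.25, "engagement": 0.30, "brand": 0.25, "accuracy": 0.20}
THRESHOLD = 85.0  # assumed to apply to the weighted total


def overall_score(scores: dict) -> float:
    """Weighted sum of per-dimension scores (each assumed to be 0-100)."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


def meets_threshold(scores: dict) -> bool:
    """True when the weighted total reaches the quality threshold."""
    return overall_score(scores) >= THRESHOLD


# Hypothetical critic output for one draft.
draft = {"seo": 80, "engagement": 90, "brand": 85, "accuracy": 95}
print(overall_score(draft), meets_threshold(draft))
```

For this draft the total is 0.25*80 + 0.30*90 + 0.25*85 + 0.20*95 = 87.25, so the loop would stop even though the SEO score alone is below 85.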

4. IT Support (Specialist Pool Architecture)

Pattern: Dynamic expert routing with priority-based dispatch

Key Design Decisions:

Urgency Categorization: Critical (1hr SLA), High (4hr), Medium (24hr), Low (72hr)
Keyword-Based Routing: Match issue keywords to specialist expertise domains
Parallel Consultation: Complex issues can trigger multiple specialists (up to 3)
Fallback Mechanism: General IT specialist handles unmatched issues

Technical Highlights:

# Dynamic specialist selection
Issue → Urgency Categorizer → Keyword Matcher → [Network, Database, Security] → Consolidator
                                              ↓ (if no match)
                                          [General IT Specialist]

Use Case: When support issues need intelligent routing to domain experts based on content and urgency
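
Keyword routing with a fallback can be sketched in a few lines. The keyword table and the general_it name are hypothetical; the SLA hours and the cap of three specialists come from the list above. Matching is on whitespace-separated words only, so punctuation handling is left out.

```python
# Hypothetical keyword table; a real deployment would load this from config.
SPECIALISTS = {
    "network": {"vpn", "wifi", "dns"},
    "database": {"sql", "query", "replication"},
    "security": {"phishing", "password", "malware"},
}
SLA_HOURS = {"critical": 1, "high": 4, "medium": 24, "low": 72}
MAX_SPECIALISTS = 3


def route(issue: str) -> list:
    """Return up to MAX_SPECIALISTS matching specialists, best match first."""
    words = set(issue.lower().split())
    hits = {name: len(words & keywords) for name, keywords in SPECIALISTS.items()}
    matched = sorted((n for n, c in hits.items() if c > 0), key=lambda n: -hits[n])
    return matched[:MAX_SPECIALISTS] or ["general_it"]  # fallback when nothing matches


print(route("VPN drops after password reset"))
print(route("My monitor flickers"))
```

The first issue matches both the network and security tables and is sent to both specialists; the second matches nothing and falls back to the general specialist.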

5. Tech Decision (Debate Architecture)

Pattern: Adversarial deliberation with structured argumentation

Key Design Decisions:

3-Round Structure: Opening Arguments → Deep Analysis → Rebuttals
Weighted Criteria: Technical (30%), Implementation (25%), Cost (25%), Risk (20%)
Evidence-Based: Arguments must cite data, industry research, or technical specs
Expert Panel Judgment: Multi-expert evaluation with dissenting opinions allowed

Technical Highlights:

# Structured multi-round debate
Round 1: Proponent.argue() ↔ Opponent.argue()  # Opening positions
Round 2: Proponent.analyze() ↔ Opponent.analyze()  # Evidence-based
Round 3: Proponent.rebuttal() ↔ Opponent.rebuttal()  # Counter-arguments
Final: Judge.evaluate(all_arguments, weighted_criteria)

Use Case: When technical decisions require balanced analysis of tradeoffs with structured deliberation
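
The round structure and the weighted judgment can be sketched as follows. The debaters are plain callables standing in for agents, the 0-10 rating scale is an assumption, and run_debate and judge are hypothetical helpers rather than framework functions.

```python
# Criteria weights from the list above.
CRITERIA = {"technical": 0.30, "implementation": 0.25, "cost": 0.25, "risk": 0.20}
ROUNDS = ["opening", "analysis", "rebuttal"]


def run_debate(proponent, opponent) -> list:
    """Collect one statement per side per round, in order."""
    transcript = []
    for round_name in ROUNDS:
        transcript.append(("proponent", round_name, proponent(round_name)))
        transcript.append(("opponent", round_name, opponent(round_name)))
    return transcript


def judge(ratings: dict) -> str:
    """Pick the side with the higher weighted rating (ties go to the first side)."""
    totals = {side: sum(CRITERIA[c] * r[c] for c in CRITERIA) for side, r in ratings.items()}
    return max(totals, key=totals.get)


transcript = run_debate(lambda r: f"pro {r}", lambda r: f"con {r}")
# Hypothetical per-criterion ratings on an assumed 0-10 scale.
ratings = {
    "proponent": {"technical": 8, "implementation": 6, "cost": 5, "risk": 7},
    "opponent": {"technical": 7, "implementation": 7, "cost": 8, "risk": 6},
}
print(len(transcript), judge(ratings))
```

Here the proponent leads on the technical criterion, but the opponent's stronger cost and implementation ratings give it the higher weighted total (7.05 against 6.55).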

6. Code Debugger (Reflexion Architecture)

Pattern: Self-improving execution through reflection loops

Key Design Decisions:

Strategy Library: Error trace analysis, code inspection, hypothesis testing, dependency check
Adaptive Strategy Selection: Reflector analyzes why previous attempts failed and suggests next approach
Root Cause Taxonomy: Categorize bugs (logic error, race condition, resource leak, etc.)
Prevention Recommendations: Learn from bug patterns to suggest prevention measures

Technical Highlights:

# Execute-reflect-improve loop (pseudocode)
while not root_cause_found and iterations < max_iterations:
    result = Executor.execute(current_strategy)
    reflection = Reflector.analyze(result, history)  # Why failed? What learned?
    history.append({"strategy": current_strategy, "result": result, "reflection": reflection})
    current_strategy = Improver.select_strategy(reflection)  # Adapt approach
    iterations += 1

Use Case: When debugging complex issues requires systematic exploration with learning from failed attempts
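
A runnable miniature of the loop, under stated assumptions: the strategy names come from the Strategy Library above, while reflect, debug, fake_execute, and the hint and root_cause keys are hypothetical and exist only in this sketch.

```python
from typing import Callable, List, Optional

STRATEGIES = ["error_trace_analysis", "code_inspection", "hypothesis_testing", "dependency_check"]


def reflect(result: dict, tried: List[str]) -> Optional[str]:
    """Toy reflector: follow the result's hint, else take the first untried strategy."""
    hint = result.get("hint")
    if hint in STRATEGIES and hint not in tried:
        return hint
    return next((s for s in STRATEGIES if s not in tried), None)


def debug(execute: Callable[[str], dict], max_iterations: int = 4) -> List[dict]:
    """Execute, reflect, and switch strategy until a root cause is found."""
    history, strategy = [], STRATEGIES[0]
    for _ in range(max_iterations):
        result = execute(strategy)
        history.append({"strategy": strategy, "result": result})
        if result.get("root_cause"):
            break
        strategy = reflect(result, [h["strategy"] for h in history])
        if strategy is None:
            break  # strategy library exhausted
    return history


def fake_execute(strategy: str) -> dict:
    """Stand-in executor: the trace points at dependencies, which reveal the bug."""
    if strategy == "error_trace_analysis":
        return {"root_cause": None, "hint": "dependency_check"}
    if strategy == "dependency_check":
        return {"root_cause": "resource leak"}
    return {"root_cause": None}


history = debug(fake_execute)
print([h["strategy"] for h in history])
```

The history list is what lets the reflector avoid repeating a failed strategy; without it the loop would have no memory between attempts.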

7. Codebase Analysis (MapReduce Architecture)

Pattern: Divide-conquer with intelligent chunking and aggregation

Key Design Decisions:

Chunking Strategies: By module, by file type, by size, by git change frequency
Parallel Mapping: Up to 10 concurrent mappers analyzing different chunks
Weighted Scoring: Quality (25%), Security (30%), Maintainability (25%), Coverage (20%)
Issue Aggregation: Deduplication, severity-based prioritization, module health scoring

Technical Highlights:

# Parallel map-reduce workflow
Codebase → Smart Chunker → [Mapper 1, Mapper 2, ..., Mapper N] → Reducer
           (by_module)      (parallel analysis)                   (aggregate, dedupe, prioritize)

Use Case: When analyzing large codebases (500+ files) requires parallel processing with intelligent result aggregation
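
The chunk, map, and reduce steps can be sketched with asyncio and a semaphore that caps concurrency at 10, as in the list above. The mapper is a toy stand-in that flags file names containing "legacy"; chunk_by_module, map_chunk, and analyze are hypothetical names, and the file paths are made up.

```python
import asyncio
from collections import defaultdict

MAX_MAPPERS = 10  # concurrency cap from the list above


def chunk_by_module(paths: list) -> dict:
    """Group file paths by their top-level directory."""
    chunks = defaultdict(list)
    for path in paths:
        chunks[path.split("/", 1)[0]].append(path)
    return dict(chunks)


async def map_chunk(module: str, files: list, sem: asyncio.Semaphore) -> list:
    """Stand-in mapper: reports one issue per file whose name contains 'legacy'."""
    async with sem:  # at most MAX_MAPPERS chunks are analyzed at once
        await asyncio.sleep(0)
        return [(module, "deprecated code") for f in files if "legacy" in f]


async def analyze(paths: list) -> list:
    """Map chunks concurrently, then reduce by flattening and deduplicating."""
    sem = asyncio.Semaphore(MAX_MAPPERS)
    chunks = chunk_by_module(paths)
    mapped = await asyncio.gather(*(map_chunk(m, fs, sem) for m, fs in chunks.items()))
    return sorted({issue for part in mapped for issue in part})


issues = asyncio.run(analyze([
    "api/legacy_auth.py", "api/legacy_views.py", "core/models.py", "core/legacy_db.py",
]))
print(issues)
```

The two legacy files under api/ produce the same (module, issue) pair, so the reducer's set comprehension collapses them into a single finding.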

Common Implementation Patterns

All examples demonstrate these production-ready patterns:

Pattern	Implementation	Benefit
Configuration-Driven	YAML config with validation	Easy customization without code changes
Structured Results	Consistent JSON output format	Programmatic access and integration
Error Handling	Try/catch with graceful degradation	Robust production deployment
Logging	Structured JSONL + human-readable logs	Debugging and audit trail
Testing	Unit + integration + end-to-end tests	Quality assurance and regression prevention

Getting Started

Each example includes:

✅ Complete runnable code with error handling
✅ Configuration files with detailed comments
✅ Custom components and prompt engineering
✅ Unit tests, integration tests, and end-to-end tests
✅ Comprehensive documentation (English + Chinese)
✅ Usage guides and customization instructions

See Production Examples Design Document for detailed implementation specifications.

Running Examples

# Navigate to example directory
cd examples/production/01_competitive_intelligence

# Install dependencies
pip install -e ".[all]"

# Configure
cp config.example.yaml config.yaml
# Edit config.yaml with your settings

# Run
python main.py

Architecture Diagrams

Research Architecture

User Request
      ↓
Lead Agent (Coordinator)
      ├─→ Researcher-1 ─┐
      ├─→ Researcher-2 ─┼─→ Parallel Research
      └─→ Researcher-3 ─┘
             ↓
      Data-Analyst
             ↓
      Report-Writer
             ↓
      Output Files

Pipeline Architecture

Request → Architect → Coder → Reviewer → Tester → Output

Critic-Actor Architecture

content = Actor.generate()
while quality < threshold:
    feedback = Critic.evaluate(content)  # updates quality and approved
    if approved: break
    content = Actor.improve(content, feedback)

Specialist Pool Architecture

User Question → Router → [Code Expert, Data Expert, Security Expert, ...] → Summary

Debate Architecture

Topic → Proponent ↔ Opponent (N rounds) → Judge → Verdict

Reflexion Architecture

while not success:
    result = Executor.execute(strategy)  # sets success when the task is solved
    reflection = Reflector.analyze(result)
    strategy = reflection.improved_strategy

MapReduce Architecture

Task → Splitter → [Mapper-1, Mapper-2, ...] → Reducer → Result

CLI Usage

Running Architectures

# List available architectures
python -m claude_agent_framework.cli --list

# Run with specific architecture
python -m claude_agent_framework.cli --arch research -q "Analyze AI market trends"

# Interactive mode
python -m claude_agent_framework.cli --arch pipeline -i

# Choose model
python -m claude_agent_framework.cli --arch debate -m sonnet -q "Should we use microservices?"

Session Observability (New in v0.4.0)

# View session metrics
claude-agent metrics <session-id>
# Shows: duration, token usage, cost, agent/tool statistics

# Open interactive dashboard
claude-agent view <session-id>
# Opens browser with timeline, tool graphs, performance analysis

# Generate HTML report
claude-agent report <session-id> --output report.html
# Creates comprehensive session report with charts

Python API

Basic Usage

import asyncio
from claude_agent_framework import init

async def main():
    session = init("research")
    async for msg in session.run("Research quantum computing applications"):
        print(msg)

asyncio.run(main())

With Options

from claude_agent_framework import init

session = init(
    "pipeline",
    model="sonnet",      # haiku, sonnet, or opus
    verbose=True,        # Enable debug logging
    log_dir="./logs",    # Custom log directory
)

Single Query

from claude_agent_framework import quick_query
import asyncio

# Quick one-off query
results = asyncio.run(quick_query("Analyze Python trends", architecture="research"))
print(results[-1])

Custom Architecture

from claude_agent_framework import register_architecture, BaseArchitecture

@register_architecture("my_custom")
class MyCustomArchitecture(BaseArchitecture):
    name = "my_custom"
    description = "Custom workflow for my use case"

    def get_agents(self):
        return {...}

    async def execute(self, prompt, tracker=None, transcript=None):
        # Implementation
        ...

Using Plugins (New in v0.4.0)

from claude_agent_framework import init
from claude_agent_framework.plugins.builtin import (
    MetricsCollectorPlugin,
    CostTrackerPlugin,
    RetryHandlerPlugin
)

session = init("research")

# Add metrics tracking
metrics_plugin = MetricsCollectorPlugin()
session.architecture.add_plugin(metrics_plugin)

# Add cost tracking with budget limit
cost_plugin = CostTrackerPlugin(budget_usd=5.0)
session.architecture.add_plugin(cost_plugin)

# Add automatic retry on errors
retry_plugin = RetryHandlerPlugin(max_retries=3)
session.architecture.add_plugin(retry_plugin)

# Run session (inside an async function, e.g. one started with asyncio.run)
async for msg in session.run("Analyze market"):
    print(msg)

# Get metrics
metrics = metrics_plugin.get_metrics()
print(f"Cost: ${metrics.estimated_cost_usd:.4f}")
print(f"Tokens: {metrics.tokens.total_tokens}")
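
The idea behind a budget limit can be sketched independently of the framework. BudgetGuard below is a hypothetical class, not the CostTrackerPlugin shown above, and the per-million-token prices are placeholders rather than actual model rates.

```python
class BudgetExceeded(RuntimeError):
    """Raised when accumulated spend passes the configured budget."""


class BudgetGuard:
    """Hypothetical cost tracker: accumulates spend and enforces a USD budget."""

    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def on_usage(self, input_tokens: int, output_tokens: int,
                 usd_per_mtok_in: float, usd_per_mtok_out: float) -> None:
        """Add the cost of one model call, then check it against the budget."""
        cost = (input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out) / 1_000_000
        self.spent_usd += cost
        if self.spent_usd > self.budget_usd:
            raise BudgetExceeded(f"spent ${self.spent_usd:.4f} of ${self.budget_usd:.2f}")


guard = BudgetGuard(budget_usd=0.01)
# Placeholder prices: $1 per million input tokens, $5 per million output tokens.
guard.on_usage(1000, 500, usd_per_mtok_in=1.0, usd_per_mtok_out=5.0)
guard.on_usage(1000, 500, usd_per_mtok_in=1.0, usd_per_mtok_out=5.0)
print(f"spent so far: ${guard.spent_usd:.4f}")
```

Each call in this example costs $0.0035, so a third identical call would push the total past the $0.01 budget and raise BudgetExceeded.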

Advanced Configuration (New in v0.4.0)

from claude_agent_framework.config import ConfigLoader, ConfigValidator

# Note: each assignment below replaces the previous config.
# Use the loader that fits your setup.

# Load from YAML
config = ConfigLoader.from_yaml("config.yaml")

# Load with environment profile
config = ConfigLoader.load_with_profile("production")

# Override with environment variables
config = ConfigLoader.from_env(prefix="CLAUDE_")

# Validate configuration
errors = ConfigValidator.validate_config(config)
if errors:
    print(f"Configuration errors: {errors}")
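
How prefixed environment variables can be layered over other sources is sketched below with the standard library only. This is not ConfigLoader's implementation: load_env_config and merge_config are hypothetical helpers, and the precedence order (defaults, then environment) is an assumption.

```python
import os
from typing import Dict, Mapping, Optional


def load_env_config(prefix: str = "CLAUDE_", environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect PREFIX_* variables into a dict with lower-cased, un-prefixed keys."""
    environ = os.environ if environ is None else environ
    return {k[len(prefix):].lower(): v for k, v in environ.items() if k.startswith(prefix)}


def merge_config(*layers: Mapping[str, str]) -> Dict[str, str]:
    """Merge layers left to right; later layers override earlier ones."""
    merged: Dict[str, str] = {}
    for layer in layers:
        merged.update(layer)
    return merged


defaults = {"model": "haiku", "log_dir": "./logs"}
fake_env = {"CLAUDE_MODEL": "sonnet", "PATH": "/usr/bin"}  # stand-in for os.environ
config = merge_config(defaults, load_env_config(environ=fake_env))
print(config)
```

Passing the environment as a parameter keeps the helper testable; in real use the default of os.environ applies.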

Dynamic Agent Registration (New in v0.4.0)

from claude_agent_framework import init

session = init("specialist_pool")

# Add new agent at runtime
session.architecture.add_agent(
    name="security_expert",
    description="Cybersecurity specialist",
    tools=["WebSearch", "Read"],
    prompt="You are a cybersecurity expert...",
    model="sonnet"
)

# List agents registered at runtime
agents = session.architecture.list_dynamic_agents()
print(f"Dynamic agents: {agents}")

Output

Each session generates:

logs/session_YYYYMMDD_HHMMSS/transcript.txt - Human-readable conversation log
logs/session_YYYYMMDD_HHMMSS/tool_calls.jsonl - Structured tool call records
files/<architecture>/ - Architecture-specific outputs (reports, charts, etc.)
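
Because tool_calls.jsonl holds one JSON record per line, it can be summarized with the standard library. The record schema is not documented here, so the tool_name key below is an assumption; adjust it to match the actual records. The sample file is created only so the sketch runs on its own.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def count_tool_calls(jsonl_path, key: str = "tool_name") -> Counter:
    """Count records per tool in a JSONL file, skipping blank lines."""
    counts: Counter = Counter()
    with Path(jsonl_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                counts[json.loads(line).get(key, "unknown")] += 1
    return counts


# Write a small sample file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "tool_calls.jsonl"
    sample.write_text(
        '{"tool_name": "WebSearch"}\n{"tool_name": "Read"}\n\n{"tool_name": "WebSearch"}\n',
        encoding="utf-8",
    )
    counts = count_tool_calls(sample)

print(counts.most_common())
```

Reading line by line keeps memory use flat for long sessions, which is the main practical advantage of JSONL over a single JSON array.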

Installation Options

# Basic installation
pip install claude-agent-framework

# With PDF generation support
pip install "claude-agent-framework[pdf]"

# With chart generation support
pip install "claude-agent-framework[charts]"

# With advanced configuration (Pydantic, YAML) - New in v0.4.0
pip install "claude-agent-framework[config]"

# With metrics export (Prometheus) - New in v0.4.0
pip install "claude-agent-framework[metrics]"

# With visualization (Matplotlib, Jinja2) - New in v0.4.0
pip install "claude-agent-framework[viz]"

# Full installation (all features)
pip install "claude-agent-framework[all]"

# Development installation
pip install "claude-agent-framework[dev]"

Project Structure

claude_agent_framework/
├── init.py              # Simplified initialization
├── cli.py               # Command-line interface
├── config/              # Configuration system (v0.4.0)
│   ├── schema.py        # Pydantic validation models
│   ├── loader.py        # Multi-source config loading
│   ├── validator.py     # Configuration validation
│   └── profiles/        # Environment configs (dev/staging/prod)
├── core/                # Core abstractions
│   ├── base.py          # BaseArchitecture class
│   ├── session.py       # AgentSession management
│   └── registry.py      # Architecture registry
├── plugins/             # Plugin system (v0.4.0)
│   ├── base.py          # BasePlugin, PluginManager
│   └── builtin/         # Built-in plugins
│       ├── metrics_collector.py
│       ├── cost_tracker.py
│       └── retry_handler.py
├── metrics/             # Performance tracking (v0.4.0)
│   ├── collector.py     # Metrics collection
│   └── exporter.py      # JSON/CSV/Prometheus export
├── dynamic/             # Dynamic agent registry (v0.4.0)
│   ├── agent_registry.py
│   ├── loader.py
│   └── validator.py
├── observability/       # Observability tools (v0.4.0)
│   ├── logger.py        # Structured logging
│   ├── visualizer.py    # Session visualization
│   └── debugger.py      # Interactive debugging
├── architectures/       # Built-in architectures
│   ├── research/        # Research pattern
│   ├── pipeline/        # Pipeline pattern
│   ├── critic_actor/    # Critic-actor pattern
│   ├── specialist_pool/ # Specialist pool pattern
│   ├── debate/          # Debate pattern
│   ├── reflexion/       # Reflexion pattern
│   └── mapreduce/       # MapReduce pattern
├── utils/               # Utility modules
│   ├── tracker.py       # Hook tracking
│   ├── transcript.py    # Logging
│   └── message_handler.py
├── files/               # Working directory
└── logs/                # Session logs

Development

# Clone and install
git clone https://github.com/your-org/claude-agent-framework
cd claude-agent-framework
pip install -e ".[all]"

# Run tests
make test

# Format code
make format

# Lint
make lint

Makefile Commands

make run              # Run default architecture (research)
make run-research     # Run Research architecture
make run-pipeline     # Run Pipeline architecture
make run-critic       # Run Critic-Actor architecture
make run-specialist   # Run Specialist Pool architecture
make run-debate       # Run Debate architecture
make run-reflexion    # Run Reflexion architecture
make run-mapreduce    # Run MapReduce architecture
make list-archs       # List all architectures
make test             # Run tests
make format           # Format code
make lint             # Lint code

Documentation

Quick Reference

README (Chinese) - Chinese-language documentation
Best Practices Guide - Pattern selection and implementation tips
Best Practices (Chinese) - Best practices guide in Chinese

Requirements

Python 3.10+
Claude Agent SDK
ANTHROPIC_API_KEY environment variable

License

MIT License - see LICENSE for details.

Contributing

Contributions welcome! Please read CONTRIBUTING.md for guidelines.
