Multi-agent orchestration platform over MCP servers
AgentForge-AI: Multi-Agent Orchestration over MCP
Production-grade AI agent platform that decomposes natural-language tasks into specialized agent workflows, with human-in-the-loop approval, structured LLM output, and full observability.
Demo
One command triggers the full pipeline:
agentforge run "Triage all open bugs and alert the team"
What happens:
- Orchestrator decomposes the task via LLM
- TriageAgent fetches open GitHub issues
- LLM classifies severity (critical/high/medium/low)
- Labels applied on GitHub automatically
- Notion triage report generated
- Slack alert sent to #engineering
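The steps above can be sketched as a small async pipeline. This is only an illustration under stated assumptions: `fetch_issues`, `classify`, `post_report`, and `send_alert` are hypothetical stand-ins rather than AgentForge functions, and a keyword rule replaces the LLM call so the sketch runs offline. It shows blocking calls moved off the event loop with `asyncio.to_thread()` and optional integrations failing without aborting the run.

```python
import asyncio

# Hypothetical stand-ins for the real MCP calls; none of these names come from AgentForge.
def fetch_issues():
    return [
        {"number": 1, "title": "Login crashes on submit"},
        {"number": 2, "title": "Typo in footer"},
    ]

def classify(issue):
    # A real run would ask the LLM; a keyword rule keeps the sketch offline.
    return "critical" if "crash" in issue["title"].lower() else "low"

def post_report(labels):
    # Simulate an optional integration being unavailable.
    raise ConnectionError("Notion unreachable")

def send_alert(labels):
    return f"{len(labels)} issues triaged"

async def optional_step(name, func, *args):
    """Run a blocking, non-critical step; report the failure instead of raising."""
    try:
        return await asyncio.to_thread(func, *args)
    except Exception as exc:
        return f"{name} skipped: {exc}"

async def triage_pipeline():
    # Blocking I/O runs in a worker thread so the event loop stays responsive.
    issues = await asyncio.to_thread(fetch_issues)
    labels = {issue["number"]: classify(issue) for issue in issues}
    report = await optional_step("notion", post_report, labels)
    alert = await optional_step("slack", send_alert, labels)
    return labels, report, alert

labels, report, alert = asyncio.run(triage_pipeline())
print(labels, report, alert, sep="\n")
```

The simulated Notion failure is turned into a message and the Slack step still runs, which is the non-fatal behavior described for the optional integrations.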
Architecture
┌──────────────────────────────────────────────────────┐
│                      CLI / User                      │
│                "Triage all open bugs"                │
└──────────────────────────┬───────────────────────────┘
                           │
                  ┌────────▼────────┐
                  │  Orchestrator   │  LLM decomposes task
                  │  (Task Planner) │  into subtasks with
                  │                 │  confidence routing
                  └───┬────┬────┬───┘
                      │    │    │
           ┌──────────┘    │    └──────────┐
           ▼               ▼               ▼
     ┌──────────┐    ┌──────────┐    ┌──────────┐
     │ DevAgent │    │ TriageAg │    │ StandupAg│
     │          │    │          │    │          │
     │ GitHub   │    │ Classify │    │ Activity │
     │ Issues   │    │ + Label  │    │ Summary  │
     └─────┬────┘    └─────┬────┘    └─────┬────┘
           │               │               │
           ▼               ▼               ▼
     ┌──────────────────────────────────────────┐
     │             MCP Server Layer             │
     │   GitHub    │    Notion    │    Slack    │
     │   (REST)    │    (REST)    │    (REST)   │
     └──────────────────────────────────────────┘
Key Components
| Component | Description |
|---|---|
| Orchestrator | LLM-powered task decomposer with confidence-based routing and keyword fallback |
| BaseAgent | Abstract base with approval gates for destructive operations |
| TriageAgent | Fetches GitHub issues → LLM classifies severity → labels on GitHub → Notion report → Slack alert |
| StandupAgent | Fetches GitHub activity → LLM generates standup → posts to Notion + Slack |
| DevAgent | Creates issues, lists repos via GitHub API |
| EvalEngine | Logs predictions to JSONL, computes precision/recall per severity label |
| MCP Servers | GitHub, Notion, Slack, all with a circuit breaker plus retry via tenacity |
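As a rough illustration of confidence-based routing with a keyword fallback, the sketch below trusts the planner's proposed agent only when its confidence clears a threshold. The `Subtask` shape, the `KEYWORD_ROUTES` table, and the default agent are assumptions made for this example; the Orchestrator's actual rules are not shown in this README.

```python
from dataclasses import dataclass

# Hypothetical keyword table; the real fallback rules may differ.
KEYWORD_ROUTES = {
    "triage": "TriageAgent",
    "bug": "TriageAgent",
    "standup": "StandupAgent",
    "issue": "DevAgent",
    "repo": "DevAgent",
}

@dataclass
class Subtask:
    description: str
    agent: str         # agent proposed by the LLM planner
    confidence: float  # planner's self-reported confidence in [0, 1]

def keyword_route(description, default="DevAgent"):
    """Pick the first agent whose keyword appears in the description."""
    text = description.lower()
    for keyword, agent in KEYWORD_ROUTES.items():
        if keyword in text:
            return agent
    return default

def route(subtask, threshold=0.8):
    """Trust the planner above the threshold, otherwise fall back to keywords."""
    if subtask.confidence >= threshold:
        return subtask.agent
    return keyword_route(subtask.description)

confident = route(Subtask("Generate daily standup", "StandupAgent", 0.93))
uncertain = route(Subtask("Triage all open bugs", "DevAgent", 0.41))
print(confident, uncertain)
```

The default threshold of 0.8 mirrors the `confidence_threshold` value in the sample config; a low-confidence plan is overridden by the keyword match.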
Quick Start
1. Install AgentForge
Option A: Install via PyPI (Recommended)
pip install agentforge-ai
Option B: Install from Source
git clone https://github.com/OmRajput17/AgentForge-AI.git
cd AgentForge-AI
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # macOS/Linux
pip install -e .
2. Initialize Config
agentforge init
This creates ~/.agentforge/config.yml. Open it in your editor:
notepad %USERPROFILE%\.agentforge\config.yml # Windows
nano ~/.agentforge/config.yml # macOS/Linux
Full config.yml:
# -- LLM Provider ----------------------------------------
llm:
  provider: groq                    # 'openai' or 'groq'
  model: llama-3.3-70b-versatile    # or 'gpt-4o' for OpenAI
  api_key: ''                       # your API key
# -- MCP Server Connections ------------------------------
mcp_servers:
  github_token: ''                  # GitHub PAT (required)
  github_owner: ''                  # GitHub username
  github_repo: ''                   # target repository
  notion_token: ''                  # Notion integration secret (optional)
  notion_page_id: ''                # Notion page ID for reports (optional)
  slack_token: ''                   # Slack bot token (optional)
  slack_channel: general            # Slack channel for alerts
# -- Behavior --------------------------------------------
auto_approve: false                 # skip approval prompts for destructive ops
confidence_threshold: 0.8           # min confidence for agent routing
max_iterations: 10                  # max subtasks per run
standup_lookback_hours: 24          # how far back to fetch GitHub activity
Supports OpenAI and Groq: switch providers by changing provider and model. No code changes needed.
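To make the provider switch concrete, here is a minimal sketch of a config-driven factory. It is not the project's `get_llm()` implementation: `get_llm_sketch`, `LLMSettings`, and the two stub client classes are hypothetical names used so the example runs without any SDK installed. A real factory would return the provider's chat client instead of a stub.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LLMSettings:
    provider: str
    model: str
    api_key: str

# Stub clients so the sketch runs offline; these are illustrative only.
class StubOpenAIClient:
    def __init__(self, settings):
        self.settings = settings

class StubGroqClient:
    def __init__(self, settings):
        self.settings = settings

PROVIDERS = {"openai": StubOpenAIClient, "groq": StubGroqClient}

def get_llm_sketch(config):
    """Build a client from the `llm` section of an already-parsed config mapping."""
    section = config.get("llm", {})
    provider = str(section.get("provider", "")).lower()
    if provider not in PROVIDERS:
        raise ValueError(
            f"Unsupported provider {provider!r}; expected one of {sorted(PROVIDERS)}"
        )
    settings = LLMSettings(provider, section["model"], section.get("api_key", ""))
    return PROVIDERS[provider](settings)

client = get_llm_sketch({"llm": {"provider": "groq", "model": "llama-3.3-70b-versatile"}})
print(type(client).__name__, client.settings.model)
```

Because agents only receive the returned client, changing `provider` and `model` in the config is enough to swap backends in this sketch.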
3. Run a Task
# Triage bugs: classify, label, report, alert
agentforge run "Triage all open bugs and alert the team"
# Generate daily standup from GitHub activity
agentforge run "Generate daily standup for om"
# Create a GitHub issue via natural language
agentforge run "Create a GitHub issue for the login bug"
4. Check MCP Server Status
agentforge server
GitHub   configured
Notion   configured
Slack    configured
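A status check of this kind could be approximated as follows. The mapping from each server to the config keys it needs is an assumption made for the example, and the real `agentforge server` command may apply different rules or print a different format.

```python
# Assumed mapping from each server to the config keys it needs.
REQUIRED_KEYS = {
    "GitHub": ("github_token", "github_owner", "github_repo"),
    "Notion": ("notion_token", "notion_page_id"),
    "Slack": ("slack_token", "slack_channel"),
}

def server_status(mcp_servers):
    """Return 'configured' or a list of missing keys for each server."""
    status = {}
    for name, keys in REQUIRED_KEYS.items():
        missing = [key for key in keys if not mcp_servers.get(key)]
        status[name] = "configured" if not missing else "missing: " + ", ".join(missing)
    return status

# Placeholder values only; these are not real credentials.
example = {
    "github_token": "ghp_example", "github_owner": "octocat", "github_repo": "demo",
    "notion_token": "", "notion_page_id": "",
    "slack_token": "xoxb-example", "slack_channel": "general",
}
status = server_status(example)
for name, state in status.items():
    print(f"{name:<8}{state}")
```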
Running Tests (No API Keys Required)
All tests are fully mocked, so they run without any API keys or network access.
# Run the full test suite
pytest agentforge/tests/ -v
# Run specific test modules
pytest agentforge/tests/test_schemas.py -v # Pydantic schema validation
pytest agentforge/tests/test_triage_agent.py -v # TriageAgent unit tests
pytest agentforge/tests/test_standup_agent.py -v # StandupAgent unit tests
pytest agentforge/tests/test_dev_agent.py -v # DevAgent unit tests
pytest agentforge/tests/test_mcp_github.py -v # GitHub MCP server tests
# Run with coverage
pytest agentforge/tests/ -v --cov=agentforge --cov-report=term-missing
Test Suite Overview
| Test Module | Tests | What It Covers |
|---|---|---|
| test_schemas.py | 17 | Pydantic validation, severity normalization, wontfix, model_dump |
| test_triage_agent.py | 12 | Classification, fallback, report generation, approval gate, Slack alerts |
| test_standup_agent.py | 7 | Event summarization, standup generation, full workflow |
| test_dev_agent.py | 4 | GitHub issue creation, listing, unknown action, LLM failure |
| test_mcp_github.py | - | GitHub API wrapper with mocked HTTP |
| test_mcp_notion.py | - | Notion API wrapper |
| test_mcp_slack.py | - | Slack API wrapper |
| test_eval_engine.py | - | Prediction logging, precision/recall metrics |
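For readers unfamiliar with fully mocked API tests, the pattern looks roughly like this. `GitHubWrapper` is a hypothetical class written for the example, not the project's `github_server.py`; the only real-world detail relied on is GitHub's `GET /repos/{owner}/{repo}/issues` endpoint with its `state` parameter. The HTTP client is replaced with `unittest.mock.MagicMock`, so nothing touches the network.

```python
from unittest.mock import MagicMock

class GitHubWrapper:
    """Hypothetical thin wrapper; the HTTP client is injected so tests can mock it."""

    def __init__(self, http, owner, repo):
        self.http = http
        self.url = f"https://api.github.com/repos/{owner}/{repo}/issues"

    def list_issues(self):
        response = self.http.get(self.url, params={"state": "open"})
        response.raise_for_status()
        # Keep only the fields the agents need.
        return [{"number": i["number"], "title": i["title"]} for i in response.json()]

def test_list_issues_uses_mocked_http():
    http = MagicMock()
    http.get.return_value.json.return_value = [
        {"number": 7, "title": "Crash on login", "user": {"login": "someone"}}
    ]
    wrapper = GitHubWrapper(http, "octocat", "demo")

    assert wrapper.list_issues() == [{"number": 7, "title": "Crash on login"}]
    http.get.assert_called_once_with(
        "https://api.github.com/repos/octocat/demo/issues", params={"state": "open"}
    )

test_list_issues_uses_mocked_http()
print("mocked test passed")
```

Injecting the client through the constructor is one design choice that makes this kind of test straightforward; patching a module-level client would work as well.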
Production Hardening
What makes this production-ready:
- Human-in-the-Loop Approval: Destructive agents (TriageAgent, DevAgent) require explicit user approval before mutating GitHub. Powered by BaseAgent.run() → ApprovalGate.ask().
- Pydantic Structured Output: No raw json.loads() anywhere. All LLM responses go through with_structured_output(Schema) with field validators that normalize and sanitize data.
- Async-First Architecture: All blocking MCP calls are wrapped in asyncio.to_thread(). The Orchestrator runs parallel subtasks via asyncio.gather().
- Graceful Degradation: Every LLM call has try/except with safe fallback responses. Notion/Slack failures are non-fatal. Missing fields are handled with .get() defaults.
- Pluggable LLM Provider: The get_llm() factory returns OpenAI or Groq based on config. Switch providers without touching any agent code.
- EvalEngine Observability: Every triage prediction is logged to ~/.agentforge/evals.jsonl with confidence scores. Compute precision/recall per severity label across runs.
- Circuit Breaker + Retry: All MCP servers inherit from BaseMCPServer with tenacity retry logic and a thread-safe circuit breaker for resilient API calls.
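The circuit breaker plus retry combination can be sketched with the standard library alone. This is not the code in BaseMCPServer: the class name, thresholds, and the plain retry loop (standing in for tenacity, with no backoff) are assumptions chosen to keep the example self-contained.

```python
import threading
import time

class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._lock = threading.Lock()  # guards the counters across threads
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, retries=2, **kwargs):
        """Run func with up to `retries` extra attempts, tracking failures."""
        with self._lock:
            if self._opened_at is not None:
                if self._clock() - self._opened_at < self.reset_after:
                    raise CircuitOpenError("circuit open; skipping call")
                # Cool-down elapsed: close the breaker and allow a trial call.
                self._opened_at = None
                self._failures = 0
        last_error = None
        for _ in range(retries + 1):
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                last_error = exc
                with self._lock:
                    self._failures += 1
                    if self._failures >= self.failure_threshold:
                        self._opened_at = self._clock()
                        break  # stop retrying once the breaker trips
            else:
                with self._lock:
                    self._failures = 0
                return result
        raise last_error

def flaky():
    raise ConnectionError("simulated API timeout")

breaker = CircuitBreaker(failure_threshold=3, reset_after=60.0)
events = []
try:
    breaker.call(flaky, retries=2)
except ConnectionError:
    events.append("failed after retries")
try:
    breaker.call(flaky)
except CircuitOpenError:
    events.append("rejected by open circuit")
print(events)
```

Once three consecutive failures accumulate, later calls are rejected immediately until the cool-down passes, which protects a struggling API from repeated retries.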
TriageAgent Workflow
┌─── Step 1: Fetch ────┐
│ github.list_issues() │ ← asyncio.to_thread()
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Step 2: Classify     │
│ LLM batch call       │ ← with_structured_output(TriageResponse)
│ (all issues at once) │   try/except → fallback to "low"
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Step 3: EvalEngine   │
│ Log predictions      │ ← confidence scores + run_id
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Step 4: Approval     │
│ "Label 5 issues?"    │ ← ApprovalGate.ask() → human confirms
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Step 5: Apply Labels │
│ github.add_labels()  │ ← asyncio.to_thread(), per-issue resilience
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Step 6: Notion       │
│ Create triage report │ ← Full severity breakdown (non-fatal)
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Step 7: Slack        │
│ Alert #engineering   │ ← Critical/High/Medium/Low counts (non-fatal)
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Step 8: Eval Report  │
│ Print metrics        │ ← Accuracy, precision, recall per label
└──────────────────────┘
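The metrics in the final step can be computed from logged predictions along these lines. The record fields (`predicted`, `actual`, `confidence`, `run_id`) are assumptions for this sketch, in particular the presence of a human-confirmed `actual` label; the real schema of evals.jsonl is not documented here.

```python
import json
from collections import Counter

SEVERITIES = ("critical", "high", "medium", "low")

def to_jsonl(records):
    """Serialize records one JSON object per line, as a JSONL log would store them."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)

def per_label_metrics(records):
    """Compute precision and recall for each severity label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for record in records:
        if record["predicted"] == record["actual"]:
            tp[record["predicted"]] += 1
        else:
            fp[record["predicted"]] += 1  # predicted this label, but it was wrong
            fn[record["actual"]] += 1     # missed the true label
    metrics = {}
    for label in SEVERITIES:
        predicted_total = tp[label] + fp[label]
        actual_total = tp[label] + fn[label]
        metrics[label] = {
            "precision": tp[label] / predicted_total if predicted_total else 0.0,
            "recall": tp[label] / actual_total if actual_total else 0.0,
        }
    return metrics

# Made-up example records, not real evaluation data.
records = [
    {"run_id": "demo", "issue": 1, "predicted": "critical", "actual": "critical", "confidence": 0.92},
    {"run_id": "demo", "issue": 2, "predicted": "critical", "actual": "high", "confidence": 0.61},
    {"run_id": "demo", "issue": 3, "predicted": "low", "actual": "low", "confidence": 0.88},
    {"run_id": "demo", "issue": 4, "predicted": "high", "actual": "high", "confidence": 0.75},
]
log_text = to_jsonl(records)
metrics = per_label_metrics(records)
accuracy = sum(r["predicted"] == r["actual"] for r in records) / len(records)
print(accuracy, metrics)
```

Labels with no predictions or no true instances report 0.0 here; returning `None` for undefined values would be an equally reasonable convention.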
Project Structure
AgentForge/
├── agentforge/
│   ├── agents/
│   │   ├── base.py            # BaseAgent ABC: approval gates, logging
│   │   ├── dev_agent.py       # GitHub operations agent
│   │   ├── triage_agent.py    # Bug triage workflow agent
│   │   ├── standup_agent.py   # Daily standup generator
│   │   └── schemas.py         # Pydantic models for structured LLM output
│   ├── mcp/
│   │   ├── base.py            # BaseMCPServer: circuit breaker + retry
│   │   ├── github_server.py   # GitHub REST API wrapper
│   │   ├── notion_server.py   # Notion API wrapper
│   │   └── slack_server.py    # Slack API wrapper
│   ├── graph/
│   │   └── state.py           # AgentForgeState TypedDict
│   ├── tests/                 # Full test suite (all mocked, no API keys needed)
│   ├── orchestrator.py        # LLM task decomposer + parallel execution
│   ├── eval_engine.py         # Prediction logging + precision/recall
│   ├── approval.py            # Human-in-the-loop approval gate
│   ├── config.py              # YAML config + LLM factory + Pydantic settings
│   ├── logger.py              # Rich console logger with agent colors
│   └── cli.py                 # Typer CLI entrypoint
├── pyproject.toml
├── requirements.txt
└── README.md
Tech Stack
| Technology | Purpose |
|---|---|
| Python 3.11+ | Core runtime |
| LangChain | LLM orchestration with structured output |
| Groq / OpenAI | Pluggable LLM providers via get_llm() factory |
| Pydantic v2 | Schema validation, field normalization |
| asyncio | Async execution, to_thread for blocking MCP calls |
| httpx | HTTP client for MCP server API calls |
| tenacity | Retry logic for API resilience, used alongside the circuit breaker in BaseMCPServer |
| Rich | Beautiful terminal UI, colored logs, approval prompts |
| Typer | CLI framework |
| pytest + pytest-asyncio | Async-aware testing |
License
MIT License. See LICENSE for details.
Built with ❤️ by Om Rajput