Multi-Role AI Task Orchestrator — one task in, multi-role AI collaboration, one conclusion out (Production Ready)
Project description
DevSquad — Multi-Role AI Task Orchestrator
One task → Multi-role AI collaboration → One conclusion
Production Ready | V3.6.1
🚀 V3.6.1: Cybernetics Enhancement Release
DevSquad V3.6.1 adds 5 new cybernetics modules: FeedbackControlLoop for closed-loop feedback control, ExecutionGuard for safe execution with rollback, PerformanceFingerprint for performance baseline tracking, SimilarTaskRecommender for TF-IDF-based task similarity search, and AdaptiveRoleSelector for intelligent role selection based on task characteristics — making multi-agent collaboration more adaptive, self-optimizing, and resilient.
🎯 Quick Start (4 Ways to Use DevSquad)
0️⃣ First Time? Start Here!
# Interactive setup wizard (1-2 minutes)
python scripts/cli.py init
# Then start collaborating!
devsquad dispatch -t "your task description"
1️⃣ Interactive Web Dashboard (Recommended)
# Start Streamlit dashboard with authentication
streamlit run scripts/dashboard.py
# Open http://localhost:8501
# Login with: admin / admin123
2️⃣ REST API Server
# Install dependencies
pip install fastapi uvicorn
# Start API server
uvicorn scripts.api_server:app --host 0.0.0.0 --port 8000 --reload
# Access Swagger UI: http://localhost:8000/docs
# Access ReDoc: http://localhost:8000/redoc
3️⃣ Command Line Interface
# Standard CLI usage
python scripts/cli.py lifecycle build
# Enhanced visual output
python scripts/cli.py lifecycle build --visual --verbose
🏗️ Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ User Access Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Streamlit │ │ FastAPI REST │ │ CLI/Notebook │ │
│ │ Dashboard │ │ API Server │ │ (Existing) │ │
│ │ (Auth+HTTPS) │ │ (Swagger) │ │ │ │
│ └──────┬───────┘ └──────┬───────┘ └──────────────┘ │
└─────────┼───────────────┼───────────────────────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ Business Logic Layer │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │AuthManager │ │AlertManager │ │HistoryMgr │ │
│ │(RBAC Auth) │ │(Multi-Chnl) │ │(SQLite TSDB)│ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ ┌─────────────────────────────────────────────┐ │
│ │ LifecycleProtocol (11-Phase Engine) │ │
│ │ UnifiedGateEngine + CheckpointManager │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Data Persistence Layer │
│ ┌────────────┐ ┌────────────┐ ┌────────────────────────┐ │
│ │ SQLite DB │ │ YAML Config│ │ Checkpoint Files │ │
│ │ (History) │ │ (Deploy) │ │ (Lifecycle State) │ │
│ └────────────┘ └────────────┘ └────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
✨ Key Features (V3.6.1)
⚓ AnchorChecker (NEW)
Milestone anchor verification that ensures critical checkpoints are properly validated before proceeding:
- Anchor Point Definition — Define mandatory validation anchors at key lifecycle milestones
- Cross-Phase Verification — Verify consistency between phase outputs and anchor criteria
- Drift Detection — Detect when project execution drifts from defined anchor points
- Auto-Recovery — Suggest corrective actions when anchor checks fail
🔄 RetrospectiveEngine (NEW)
Independent retrospective mechanism for continuous improvement after each dispatch cycle:
- Post-Dispatch Review — Automatically analyze what went well and what could improve
- Pattern Extraction — Extract reusable patterns from successful collaborations
- Anti-Pattern Detection — Identify recurring issues and suggest process improvements
- Metric Trend Analysis — Track quality metrics across dispatches to spot degradation
📊 FeatureUsageTracker (NEW)
Thread-safe feature invocation counter for data-driven feature optimization:
- Invocation Tracking — Count every feature call (dispatch, anchor_check, retrospective, consensus, etc.)
- Usage Reports — Top features, unused features, low-usage features with markdown export
- Auto-Persist — Periodic JSON persistence every 100 ticks
- 30 Known Features — Pre-registered feature set covering all DevSquad capabilities
🎯 StructuredGoal (NEW)
Structured goal management that decomposes high-level objectives into trackable, verifiable sub-goals:
- Goal Decomposition — Break complex objectives into hierarchical sub-goals with clear criteria
- Progress Tracking — Real-time progress measurement against defined goal structure
- Dependency Mapping — Visualize and manage dependencies between sub-goals
- Completion Verification — Automated verification that goals meet their success criteria
🔀 FallbackBackend (NEW)
Automatic backend failover that ensures LLM availability even when primary backends are down:
- Health Monitoring — Continuous health checks for all configured LLM backends
- Automatic Failover — Seamlessly switch to backup backend when primary fails
- Priority-Based Routing — Configure backend priority order (e.g., OpenAI → Anthropic → Mock)
- Recovery Detection — Automatically restore primary backend when it recovers
🔐 Authentication & Authorization
- Multi-user support with role-based access control (RBAC)
- Three roles: Admin (full access), Operator (execute), Viewer (read-only)
- Secure password hashing with SHA-256
- Session management for Streamlit dashboard
- OAuth2 support (optional, for enterprise deployments)
🌐 REST API (FastAPI)
- 10+ endpoints for complete lifecycle management
- Automatic OpenAPI/Swagger documentation at
/docs - CORS middleware for cross-origin requests
- Request timing and comprehensive logging
- Standardized error responses
Key Endpoints:
Lifecycle:
GET /api/v1/lifecycle/phases → List all 11 phases
POST /api/v1/lifecycle/actions → Execute phase actions
GET /api/v1/lifecycle/status → Current status
Metrics:
GET /api/v1/metrics/current → Real-time metrics
GET /api/v1/metrics/history → Historical data
Gates:
GET /api/v1/gates/status → All gate statuses
POST /api/v1/gates/check → Check specific gate
System:
GET /api/v1/health → Health check
🔔 Alert Notification System
- 4 severity levels: INFO, WARNING, ERROR, CRITICAL
- Multiple channels: Console, Slack, Email, Webhook
- Rate limiting to prevent alert spam (configurable)
- Deduplication within time window
- Alert history tracking and statistics
📊 Historical Data Storage (SQLite)
- Metrics snapshots with time-range queries
- Alert history with acknowledgment tracking
- API request logs with performance metrics
- Lifecycle events audit trail
- Automatic cleanup with configurable retention
📈 Visualization & Monitoring
- Streamlit Dashboard: Real-time monitoring with authentication
- CLI Visual Module: Rich terminal output with colors and icons
- Jupyter Notebook: Interactive 10-section tutorial
- Benchmark Reports: HTML/JSON performance reports
🧩 Layered Sub-Skill Architecture (V3.6.1)
DevSquad provides 6 atomic sub-skills that can be used independently or together. Each sub-skill is a thin wrapper (~50 lines) importing existing core modules — no duplicated logic.
skills/
├── dispatch/ → DispatchSkill — MultiAgentDispatcher (7-role orchestration)
├── intent/ → IntentSkill — IntentWorkflowMapper (6 intents × 3 languages)
├── review/ → ReviewSkill — FiveAxisConsensusEngine (5-axis code review)
├── security/ → SecuritySkill — InputValidator + OperationClassifier + PermissionGuard
├── test/ → TestSkill — TestQualityGuard + test strategy generation
└── retrospective/ → RetroSkill — RetrospectiveEngine + pattern extraction
Sub-Skill Quick Reference
| Skill | Core Method | Wraps | Mock Mode |
|---|---|---|---|
dispatch |
run(task, roles, mode) |
MultiAgentDispatcher | ✅ |
intent |
detect(text, lang) |
IntentWorkflowMapper | ✅ |
review |
review(code) |
FiveAxisConsensusEngine | ✅ |
security |
scan_input(text) |
InputValidator + OpClassifier | ✅ |
test |
generate_strategy(module) |
TestQualityGuard | ✅ |
retrospective |
run_retrospective(results) |
RetrospectiveEngine | ✅ |
Usage Examples
# Direct import (recommended for single skill)
from skills.dispatch.handler import DispatchSkill
result = DispatchSkill().run("Fix login bug", roles=["coder", "tester"])
# Via registry (dynamic discovery)
from skills import get_skill, list_skills
print(list_skills()) # ['dispatch', 'intent', 'review', 'security', 'test', 'retrospective']
skill = get_skill("security")
result = skill.scan_input("DROP TABLE users; --")
All sub-skills work without any API key in Mock mode.
📋 Plan C Architecture (Core Engine)
Unified Lifecycle Architecture - Resolves CLI 6 commands vs 11-phase lifecycle:
CLI View Layer (6 commands) Core Engine (11 phases)
┌─────────────────────┐ ┌──────────────────────────┐
│ spec → P1, P2 │───View ──→│ P1: Requirements │
│ plan → P7 │ Mapping │ P2: Architecture │
│ build → P8 │ │ P3: Technical Design │
│ test → P9 │ │ ... │
│ review → P8,P6 │ │ P10: Deployment │
│ ship → P10 │ │ P11: Operations │
└─────────────────────┘ └──────────────────────────┘
↓ ↓
UnifiedGateEngine CheckpointManager
(Phase + Worker gates) (Lifecycle state persistence)
Core Components:
- ✅ LifecycleProtocol - Abstract interface for unified lifecycle management
- ✅ UnifiedGateEngine - Integrates VerificationGate + Phase transition gates
- ✅ FullLifecycleAdapter - Complete 11-phase lifecycle with dependency resolution
- ✅ Enhanced CheckpointManager - Auto save/restore lifecycle state across sessions
What is DevSquad?
DevSquad transforms a single AI task into a multi-role AI collaboration. It automatically dispatches your task to the right combination of expert roles — architect, product manager, coder, tester, security reviewer, DevOps — orchestrates their parallel collaboration through a shared workspace, resolves conflicts via weighted consensus voting, and delivers a unified structured report.
You: "Design a microservices e-commerce backend"
│
▼
┌─────────────────┐
│ InputValidator ──→ Security check (XSS, SQL injection, prompt injection)
└────────┬────────┘
▼
┌─────────────────┐
│ RoleMatcher ──→ Auto-match: architect + devops + security
└────────┬────────┘
▼
┌──────────┬──────────┬──────────┐
│ Architect │ DevOps │ Security │ ← ThreadPoolExecutor parallel execution
│(Design) │(Infra) │(Threat) │
└────┬──────┴────┬─────┴────┬────┘
└────────────┼───────────┘
▼
┌──────────────────┐
│ Scratchpad │ ← Shared blackboard (real-time sync)
└────────┬─────────┘
▼
┌──────────────────┐
│ Consensus Engine │ ← Weighted vote + veto + escalation
└────────┬─────────┘
▼
┌──────────────────┐
│ Structured Report │ ← Findings + Action Items (H/M/L)
└──────────────────┘
📦 Installation
Prerequisites
- Python 3.9+ (3.9, 3.10, 3.11, 3.12 supported)
- pip or pipenv for package management
Option A: Core Installation (CLI + Dashboard)
git clone https://github.com/your-org/DevSquad.git
cd DevSquad
# Install core package (minimal dependencies)
pip install -e .
# Ready to use!
devsquad dispatch -t "Design user authentication system"
Option B: Full Production Stack (Recommended)
# Clone and install with all production features
git clone https://github.com/your-org/DevSquad.git
cd DevSquad
# Install with API server dependencies
pip install -e ".[api]"
# Or install all optional features
pip install -e ".[all]"
Optional Feature Groups:
# API Server (FastAPI + Uvicorn)
pip install -e ".[api]"
# Visualization (Streamlit + Jupyter)
pip install -e ".[visualization]"
# Alerting (Slack SDK)
pip install -e ".[alerts]"
# Development & Testing
pip install -e ".[dev]"
# Everything combined
pip install -e ".[all]"
Verify Installation
# Check version
devsquad --version
# Expected: devsquad 3.6.1
# Run tests
pytest tests/ -v --tb=short
# Expected: 1500+ passed
3 Ways to Use
1. CLI (Recommended)
# Mock mode (default) — no API key needed
python3 scripts/cli.py dispatch -t "Design user authentication system"
# Real AI output — set environment variables first
export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://api.openai.com/v1" # optional
export OPENAI_MODEL="gpt-4" # optional
python3 scripts/cli.py dispatch -t "Design auth system" --backend openai
# Specify roles (short IDs: arch/pm/test/coder/ui/infra/sec)
python3 scripts/cli.py dispatch -t "Design auth system" -r arch sec --backend openai
# Stream output in real-time
python3 scripts/cli.py dispatch -t "Design auth system" -r arch --backend openai --stream
# Other commands
python3 scripts/cli.py status # System status
python3 scripts/cli.py roles # List available roles
python3 scripts/cli.py --version # Show version (3.6.1)
2. Python API
from scripts.collaboration.dispatcher import MultiAgentDispatcher
# Mock mode (default)
disp = MultiAgentDispatcher()
result = disp.dispatch("Design REST API for user management")
print(result.to_markdown())
disp.shutdown()
# With LLM backend
from scripts.collaboration.llm_backend import create_backend
backend = create_backend("openai", api_key="sk-...", base_url="https://api.openai.com/v1")
disp = MultiAgentDispatcher(llm_backend=backend)
result = disp.dispatch("Design auth system", roles=["architect", "security"])
print(result.summary)
disp.shutdown()
4. Sub-Skills (Lightweight Independent)
# Each sub-skill works independently — no Dispatcher needed
from skills.security.handler import SecuritySkill
risk = SecuritySkill().scan_input("malicious input")
from skills.review.handler import ReviewScore
verdict = ReviewSkill().review(code_snippet)
from skills.intent.handler import IntentSkill
intent = IntentSkill().detect("修复登录漏洞", lang="zh")
3. MCP Server (for Cursor / any MCP client)
pip install mcp
python3 scripts/mcp_server.py # stdio mode
python3 scripts/mcp_server.py --port 8080 # SSE mode
Exposes 6 tools: multiagent_dispatch, multiagent_quick, multiagent_roles,
multiagent_status, multiagent_analyze, multiagent_shutdown.
7 Core Roles
| Role | CLI ID | Aliases | Weight | Best For |
|---|---|---|---|---|
| Architect | arch |
architect |
1.5 | System design, tech stack, performance/security architecture |
| Product Manager | pm |
product-manager |
1.2 | Requirements, user stories, acceptance criteria |
| Security Expert | sec |
security |
1.1 | Threat modeling, vulnerability audit, compliance |
| Tester | test |
tester, qa |
1.0 | Test strategy, quality assurance, edge cases |
| Coder | coder |
solo-coder, dev |
1.0 | Implementation, code review, performance optimization |
| DevOps | infra |
devops |
1.0 | CI/CD, containerization, monitoring, infrastructure |
| UI Designer | ui |
ui-designer |
0.9 | UX flow, interaction design, accessibility |
Auto-match: If no roles specified, the dispatcher automatically matches based on task keywords.
Architecture Overview (53 Core Modules)
DevSquad is built on a layered architecture with clear separation of concerns:
┌─────────────────────────────────────────────────┐
│ CLI / MCP / API │ Entry Points
├─────────────────────────────────────────────────┤
│ MultiAgentDispatcher │ Orchestration
│ ┌────────────┬──────────────┬────────────────┐ │
│ │RoleMatcher │ReportFormatter│InputValidator │ │ Extracted Components
│ └────────────┴──────────────┴────────────────┘ │
│ ┌────────────────────────────────────────────┐ │
│ │ RuleCollector (NL Rule Intercept) │ │ Rule Collection
│ └────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────┤
│ Coordinator │ Task Planning
│ ┌──────────┬───────────┬────────────────────┐ │
│ │ Scratchpad│ Consensus │ BatchScheduler │ │ Collaboration
│ └──────────┴───────────┴────────────────────┘ │
├─────────────────────────────────────────────────┤
│ Worker (per role) │ Execution
│ ┌────────────────────────────────────────────┐ │
│ │ PromptAssembler → LLMBackend → Output │ │
│ └────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────┤
│ LLMBackend: Mock | OpenAI | Anthropic │ LLM Layer
├─────────────────────────────────────────────────┤
│ CheckpointManager | WorkflowEngine | ... │ Infrastructure
└─────────────────────────────────────────────────┘
What's New in V3.6.1 🆕
AnchorChecker System
Milestone anchor verification that ensures critical checkpoints are validated before proceeding:
from scripts.collaboration.anchor_checker import AnchorChecker
checker = AnchorChecker()
checker.define_anchor("architecture_complete", criteria=["API spec defined", "tech stack selected"])
result = checker.check_anchor("architecture_complete", phase_output)
print(f"Anchor passed: {result.passed}")
print(f"Drift detected: {result.drift_score}")
Features:
- Cross-phase consistency verification
- Drift detection with severity scoring
- Auto-recovery suggestions
- Anchor point persistence
RetrospectiveEngine
Independent retrospective mechanism for continuous improvement:
from scripts.collaboration.retrospective_engine import RetrospectiveEngine
engine = RetrospectiveEngine()
report = engine.run_retrospective(dispatch_result)
print(f"Patterns found: {len(report.patterns)}")
print(f"Anti-patterns: {len(report.anti_patterns)}")
print(f"Improvement suggestions: {report.suggestions}")
Features:
- Post-dispatch quality analysis
- Pattern and anti-pattern extraction
- Metric trend tracking
- Actionable improvement suggestions
StructuredGoal
Structured goal management with hierarchical decomposition:
from scripts.collaboration.structured_goal import StructuredGoal
goal = StructuredGoal("Build e-commerce platform")
goal.add_sub_goal("User auth", criteria=["OAuth2 support", "2FA ready"])
goal.add_sub_goal("Product catalog", criteria=["Search", "Filter", "Pagination"])
progress = goal.get_progress()
print(f"Overall: {progress.completion_pct}%")
Features:
- Hierarchical goal decomposition
- Dependency mapping between sub-goals
- Real-time progress tracking
- Automated completion verification
FallbackBackend
Automatic LLM backend failover for high availability:
from scripts.collaboration.llm_backend import FallbackBackend
backend = FallbackBackend(
primary="openai",
fallbacks=["anthropic", "mock"],
health_check_interval=30,
)
result = backend.generate("Design auth system")
# Automatically fails over if primary is down
Features:
- Continuous backend health monitoring
- Seamless automatic failover
- Priority-based routing configuration
- Automatic primary recovery detection
Natural Language Rule Collection
Automatically detect and store user rules from natural language input:
# User says: "记住规则:写代码时必须加注释"
# DevSquad automatically:
# 1. Detects rule-storing intent
# 2. Extracts: trigger="写代码时", action="必须加注释", type="always"
# 3. Sanitizes content (removes dangerous patterns)
# 4. Stores via CarryMem or local JSON fallback
# List stored rules
# User says: "列出规则" → Returns all stored rules
# Delete a rule
# User says: "删除规则 RULE-LOCAL-abc123"
Pipeline: User Input → IntentDetector → RuleExtractor → RuleSanitizer → RuleStorage (CarryMem + local JSON)
Features:
- 11 intent patterns (Chinese + English)
- 4 rule types: always / avoid / prefer / forbid
- Prompt injection protection in rule content
- CarryMem primary + local JSON fallback storage
- Automatic rule injection into Worker prompts
See Integration Guide for detailed usage.
Key Features
Security
- InputValidator: XSS, SQL injection, command injection, HTML injection detection
- Prompt Injection Protection: 21+ patterns (ignore previous instructions, jailbreak, DAN mode, system prompt extraction, etc.)
- API Key Safety: Environment variables only, never CLI arguments or logs
- PermissionGuard: 4-level safety gate (PLAN → DEFAULT → AUTO → BYPASS)
Performance
- ThreadPoolExecutor: Real parallel execution for multi-role dispatch
- LLM Cache: TTL-based LRU cache with disk persistence (60-80% cost reduction)
- LLM Retry: Exponential backoff + circuit breaker + multi-backend fallback
- Streaming Output: Real-time chunk-by-chunk LLM output via
--stream
Reliability
- CheckpointManager: SHA256 integrity, handoff documents, auto-cleanup
- WorkflowEngine: Task-to-workflow auto-split, step execution, resume from checkpoint, 11-phase lifecycle templates (full/backend/frontend/internal_tool/minimal), requirement change management
- TaskCompletionChecker: DispatchResult/ScheduleResult completion tracking
- ConsensusEngine: Weighted voting with veto power and human escalation
Project Lifecycle (11-Phase Model)
DevSquad V3.6.1 defines an 11-phase (4 optional) project lifecycle with clear roles, dependencies, and gate conditions:
P1 → P2 ──┬──→ P3 ──→ P6 ──→ P7 ──→ P8 ──→ P9 ──→ P10 ──→ P11
├──→ P4(∥P3) ──↗
└──→ P5(dep P1+P3) ──↗
| Template | Phases | Use Case |
|---|---|---|
full |
P1-P11 | Complete project |
backend |
No P5 | Backend services |
frontend |
No P4,P6 | Frontend applications |
internal_tool |
No P4,P5,P6,P11 | Internal tools |
minimal |
P1,P3,P7,P8,P9 | Minimum set |
See GUIDE.md §4 for full lifecycle details with gate conditions and requirement change process.
Developer Experience
- Configuration File:
.devsquad.yamlin project root with env var overrides - Quality Control Injection: Auto-inject QC rules (hallucination prevention, overconfidence check, security guard, RACI protocol) into Worker prompts based on
.devsquad.yamlconfig - Docker Support:
docker build -t devsquad . && docker run devsquad dispatch -t "task" - GitHub Actions CI: Python 3.9-3.12 matrix testing
- pip installable:
pip install -e .with optional dependencies
Module Reference (53 Modules)
| Module | File | Purpose |
|---|---|---|
| MultiAgentDispatcher | dispatcher.py |
Unified entry point |
| Coordinator | coordinator.py |
Global orchestration: plan → assign → execute → collect |
| Worker | worker.py |
Role executor with LLM backend integration |
| EnhancedWorker | enhanced_worker.py |
Worker with auto QA (briefing + confidence + retry + memory rules) |
| Scratchpad | scratchpad.py |
Shared blackboard for inter-worker communication |
| ConsensusEngine | consensus.py |
Weighted voting + veto + escalation |
| RoleMatcher | role_matcher.py |
Keyword-based role matching with alias resolution |
| ReportFormatter | report_formatter.py |
Structured/compact/detailed report generation |
| InputValidator | input_validator.py |
Security validation + prompt injection detection |
| AISemanticMatcher | ai_semantic_matcher.py |
LLM-powered semantic role matching |
| CheckpointManager | checkpoint_manager.py |
State persistence + handoff documents |
| WorkflowEngine | workflow_engine.py |
Task-to-workflow auto-split + 11-phase lifecycle templates + requirement change |
| TaskCompletionChecker | task_completion_checker.py |
Completion tracking + progress reporting |
| CodeMapGenerator | code_map_generator.py |
Python AST-based code structure analysis |
| DualLayerContextManager | dual_layer_context.py |
Project-level + task-level context management |
| SkillRegistry | skill_registry.py |
Reusable skill registration + discovery |
| IntentWorkflowMapper | intent_workflow_mapper.py |
User intent → workflow chain mapping (6 intents × 3 languages) |
| OperationClassifier | operation_classifier.py |
Three-tier operation classification (ALWAYS_SAFE/NEEDS_REVIEW/FORBIDDEN) |
| FiveAxisConsensusEngine | five_axis_consensus.py |
Five-axis review consensus with weighted voting |
| FeatureUsageTracker | feature_usage_tracker.py |
Feature usage tracking + reporting + auto-persistence |
| LLMBackend | llm_backend.py |
Mock/OpenAI/Anthropic with streaming support |
| LLMCache | llm_cache.py |
TTL-based LRU cache with disk persistence |
| LLMRetry | llm_retry.py |
Exponential backoff + circuit breaker |
| ConfigManager | config_loader.py |
YAML config + env var overrides |
| PromptAssembler | prompt_assembler.py |
Dynamic prompt assembly + QC rule injection |
| AgentBriefing | agent_briefing.py |
Context-aware task briefing with priority filtering |
| ConfidenceScorer | confidence_score.py |
5-factor response quality assessment |
| PerformanceMonitor | performance_monitor.py |
P95/P99 tracking + CPU/memory monitoring |
| MCEAdapter | mce_adapter.py |
CarryMem integration adapter (optional dependency, supports match_rules + format_rules_as_prompt + add_rule) |
| Protocols | protocols.py |
Interface definitions (CacheProvider, MemoryProvider, etc.) |
| NullProviders | null_providers.py |
Graceful degradation providers |
| PermissionGuard | permission_guard.py |
4-level safety gate |
| MemoryBridge | memory_bridge.py |
Cross-session memory |
| BatchScheduler | batch_scheduler.py |
Batch task scheduling |
| ContextCompressor | context_compressor.py |
Context compression for long tasks |
| RoleTemplateMarket | role_template_market.py |
Role template sharing marketplace |
| Skillifier | skillifier.py |
Auto skill learning from tasks |
| UsageTracker | usage_tracker.py |
Token/cost tracking |
| WarmupManager | warmup_manager.py |
Startup warmup optimization |
| TestQualityGuard | test_quality_guard.py |
Test quality enforcement |
| PromptVariantGenerator | prompt_variant_generator.py |
A/B prompt testing |
| ConfigManager (YAML) | config_manager.py |
Project-level YAML config |
| WorkBuddyClawSource | memory_bridge.py |
WorkBuddy read-only bridge |
| Models | models.py |
Shared data models and type definitions |
| LLMCacheAsync | llm_cache_async.py |
Async LLM cache for concurrent workloads |
| LLMRetryAsync | llm_retry_async.py |
Async LLM retry with backoff |
| IntegrationExample | integration_example.py |
DevSquad integration example code |
| AsyncIntegrationExample | async_integration_example.py |
Async DevSquad integration example |
| AnchorChecker | anchor_checker.py |
Milestone anchor verification + drift detection + auto-recovery |
| RetrospectiveEngine | retrospective.py |
Independent post-dispatch retrospective + pattern extraction + anti-pattern detection |
| FeatureUsageTracker | feature_usage_tracker.py |
Feature invocation counter + usage reports + auto-persist |
| FallbackBackend | llm_backend.py |
Automatic LLM backend failover with health monitoring |
Configuration
Create .devsquad.yaml in your project root:
quality_control:
enabled: true
strict_mode: true
min_quality_score: 85
ai_quality_control:
enabled: true
hallucination_check:
enabled: true
require_traceable_references: true
overconfidence_check:
enabled: true
require_alternatives_min: 2
ai_security_guard:
enabled: true
permission_level: "DEFAULT"
ai_team_collaboration:
enabled: true
raci:
mode: "strict"
llm:
backend: openai
base_url: "" # Set via LLM_BASE_URL env var
model: "" # Set via LLM_MODEL env var
timeout: 120
log_level: WARNING
Or use environment variables (higher priority):
export DEVSQUAD_LLM_BACKEND=openai
export DEVSQUAD_BASE_URL=https://api.openai.com/v1
export DEVSQUAD_MODEL=gpt-4
export OPENAI_API_KEY=sk-...
Environment Variables
| Variable | Purpose | Default |
|---|---|---|
OPENAI_API_KEY |
OpenAI API key | None (required for OpenAI backend) |
OPENAI_BASE_URL |
OpenAI-compatible base URL | None |
OPENAI_MODEL |
Model name | gpt-4 |
ANTHROPIC_API_KEY |
Anthropic API key | None (required for Anthropic backend) |
ANTHROPIC_MODEL |
Model name | claude-sonnet-4-20250514 |
DEVSQUAD_LLM_BACKEND |
Default backend type | mock |
DEVSQUAD_LOG_LEVEL |
Logging level | WARNING |
Running Tests
# Core tests (748+ tests all passing)
python3 -m pytest scripts/collaboration/core_test.py \
scripts/collaboration/role_mapping_test.py \
scripts/collaboration/upstream_test.py \
scripts/collaboration/mce_adapter_test.py \
tests/ test_v35_integration.py \
tests/test_anti_rationalization.py \
tests/test_verification_gate.py \
tests/test_intent_workflow_mapper.py \
tests/test_cli_lifecycle.py -v
# Quick smoke test
python3 scripts/cli.py --version # 3.6.1
python3 scripts/cli.py status # System ready
python3 scripts/cli.py roles # List 7 roles
# Lifecycle commands (NEW in v3.4.1)
python3 scripts/cli.py spec -t "User authentication system"
python3 scripts/cli.py build -t "Implement login API"
python3 scripts/cli.py test -t "Run all unit tests"
python3 scripts/cli.py review -t "Check PR #123"
python3 scripts/cli.py ship -t "Deploy to production"
Documentation
| Document | Description |
|---|---|
| QUICK_START_EN.md | ⚡ Quick start guide (English, 5 minutes) |
| REFERENCE_GUIDE_EN.md | 📖 Complete reference guide (English) |
| QUICK_START_JP.md | ⚡ クイックスタートガイド (日本語, 5分) |
| REFERENCE_GUIDE_JP.md | 📖 完全リファレンスガイド (日本語) |
| GUIDE.md | Complete user guide (Chinese) |
| GUIDE_EN.md | |
| GUIDE_JP.md | |
| INSTALL.md | Installation guide (Unix + Windows) |
| EXAMPLES.md | Real-world usage examples |
| SKILL.md | Skill manual (EN/CN/JP) |
| CLAUDE.md | Claude Code project instructions |
| CHANGELOG.md | Version history |
| README-CN.md | 中文说明 |
| README-JP.md | 日本語説明 |
🆕 Quick Start (Recommended for New Users)
New to DevSquad? Start here:
# 1. Run the interactive demo (3 scenarios, < 15 seconds)
python examples/quick_demo.py
# 2. Read the quick start guide
# English: docs/i18n/QUICK_START_EN.md
# Japanese: docs/i18n/QUICK_START_JP.md
# 3. Your first dispatch
python3 scripts/cli.py dispatch -t "Design user authentication system"
☸️ Kubernetes Deployment
# Deploy with Helm
helm install devsquad ./helm/devsquad
# Port forward
kubectl port-forward svc/devsquad-api 8000:8000
See helm/devsquad/README.md for full documentation.
Cross-Platform Compatibility
DevSquad is not TRAE-exclusive. It supports 6 integration methods:
| Platform | Integration | Setup Difficulty | Key Features Available |
|---|---|---|---|
| TRAE IDE | Native Skill (skill-manifest.yaml) |
⭐ Zero config | Full: Dispatcher + Dashboard + CLI |
| Claude Code | MCP Server / Python import | ⭐ Low | 6 MCP tools or direct API |
| Cursor | MCP Server (stdio mode) |
⭐ Low | Same as Claude Code |
| OpenClaw / WorkBuddy Claw | WorkBuddyClawSource bridge |
Auto | Read-only memory bridge |
| Any MCP Client | stdio / SSE dual mode | ⭐ Low | 6 tools, configurable port |
| Pure Python | pip install -e . |
⭐ Low | CLI + API + Skills + REST |
| Docker | docker build & run |
⭐ Low | Isolated container with all features |
Quick Start per Platform
# === TRAE IDE ===
# Just use it — zero configuration
# === Claude Code / Cursor (MCP) ===
# Add to .claude/mcp.json or .cursor/mcp.json:
# {"mcpServers": {"devsquad": {"command": "python", "args": ["/path/to/mcp_server.py"]}}}
# === Pure Python ===
pip install -e "/path/to/DevSquad[all]"
devsquad dispatch -t "task description"
# === REST API ===
uvicorn scripts.api_server:app --port 8000 # → http://localhost:8000/docs
# === Docker ===
docker build -t devsquad . && docker run -it devsquad dispatch -t "test"
Version History
| Date | Version | Highlights |
|---|---|---|
| 2026-05-17 | V3.6.1 | 🔄 Cybernetics Enhancement — 5 new modules (FeedbackControlLoop/ExecutionGuard/PerformanceFingerprint/SimilarTaskRecommender/AdaptiveRoleSelector) with feedback loops, execution guards, TF-IDF similarity search, and adaptive role selection. Inspired by upstream TraeMultiAgentSkill v2.5's cybernetics architecture. |
| 2026-05-16 | V3.6.0 | 🧩 Layered Sub-Skill Architecture — 6 atomic sub-skills (dispatch/intent/review/security/test/retrospective) with lazy-loading registry via importlib, each ~50 lines wrapping existing core modules, no duplicated logic. All sub-skills work in Mock mode without API keys. Plus: Cross-platform compatibility docs updated for Claude Code/Cursor/OpenClaw/Pure Python/Docker/MCP. |
| 2026-05-13 | V3.6.0 | ⚓ AnchorChecker (milestone anchor verification + drift detection), RetrospectiveEngine (independent retrospective + pattern extraction), StructuredGoal (structured goal decomposition + progress tracking), FallbackBackend (automatic LLM failover + health monitoring), FeatureUsageTracker (feature usage tracking + reporting + auto-persistence), 7 module integrations (IntentWorkflowMapper/AISemanticMatcher/DualLayerContextManager/OperationClassifier/SkillRegistry/FiveAxisConsensusEngine/NullProviders), 1548+ tests, 48 core modules |
| 2026-05-05 | V3.5.0 | 📋 Enhancement Sprint — Code walkthrough enhancement, documentation consistency checks, Karpathy principles, project understanding (AgentBriefing), CLI lifecycle commands, structured output, 748+ tests |
| 2026-05-03 | V3.4.1 | 🚀 Agent Skills Quality Framework (P0) — AntiRationalizationEngine + VerificationGate + IntentWorkflowMapper + CLI Lifecycle Commands (spec/plan/build/test/review/ship) + 167 new tests + Google Agent Skills integration + 49 core modules |
| 2026-05-02 | V3.4.0 | 🆕 11-Phase Project Lifecycle (full/backend/frontend/internal_tool/minimal templates), requirement change management, gate mechanism with gap reporting, 748+ tests passing, WorkflowEngine lifecycle support |
| 2026-05-01 | V3.4.0 | AgentBriefing (context-aware task briefing), ConfidenceScore (5-factor quality assessment), EnhancedWorker (auto quality assurance with retry + memory_provider rule injection), Protocol interface system (match_rules/format_rules_as_prompt), CarryMem v0.2.8+ integration, comprehensive documentation |
| 2026-04-27 | V3.4.0 | Real LLM backend (OpenAI/Anthropic/Mock), ThreadPoolExecutor parallel execution, InputValidator + prompt injection protection, CheckpointManager, WorkflowEngine, TaskCompletionChecker, AISemanticMatcher, streaming output, Docker, GitHub Actions CI, config file, CodeMapGenerator, DualLayerContext, SkillRegistry, CarryMem integration, 234 unit tests |
| 2026-04-17 | V3.2 | E2E Demo, MCE Adapter, Dispatcher UX |
| 2026-04-16 | V3.0 | Complete redesign — Coordinator/Worker/Scratchpad architecture |
License
MIT License — see LICENSE for details.
Links
| Link | URL |
|---|---|
| GitHub (This Repo) | https://github.com/lulin70/DevSquad |
| Original / Upstream | https://github.com/weiransoft/TraeMultiAgentSkill |
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file devsquad-3.6.1.tar.gz.
File metadata
- Download URL: devsquad-3.6.1.tar.gz
- Upload date:
- Size: 571.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c25705f19570c2119fa8becc3be3304e0380adee8032f76a8b22809d2a6e2990
|
|
| MD5 |
f53d4a04467e3d37700d970c1ad997b2
|
|
| BLAKE2b-256 |
d342c494d43cc395b6ad76e2ca47896014e95e0e59a5ac4f08e65b37e71be244
|
File details
Details for the file devsquad-3.6.1-py3-none-any.whl.
File metadata
- Download URL: devsquad-3.6.1-py3-none-any.whl
- Upload date:
- Size: 424.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d89fd4341c89fc6bcec5b5519d7ddc51d3a6a245b1b19bd985d4e02944b73d35
|
|
| MD5 |
baa424644778711baf984a68ded98bfa
|
|
| BLAKE2b-256 |
904d560b011493cf75693990eda257eb6f615bea954bf40ceded5b754047ddbd
|