My Dev Team 🚀
An autonomous, LangGraph-powered AI development agency. My Dev Team takes raw project requirements and processes them through a multi-agent workflow (Product Manager, System Architect, Developers, and QA) to incrementally build, test, and deliver production-ready code.
Unlike third-party SaaS platforms, My Dev Team is a local-first orchestrator. Your workspace, SQLite state database, and review trails live 100% on your machine. You can run the entire crew locally for free using Ollama for zero data egress, or connect to cloud APIs (OpenAI, Groq) knowing your proprietary codebase is never stored on an external platform's servers.
Core Features
- Multi-Agent Architecture: Specialized AI agents handle distinct phases of the software development lifecycle.
- Local-First & Privacy-Focused: You own your data. The orchestrator, memory checkpointer, and file system execute strictly on your local hardware. Your code and requirements never sit on a third-party dashboard.
- Semantic Model Routing: Automatically routes tasks to the most cost-effective or capable LLMs based on the task type (reasoning, coding, or fast-utility).
- Strict Test-Driven Development (TDD): Testing is never an afterthought. Tasks are generated with embedded testing criteria, and the Developer writes unit tests alongside implementation code for immediate QA validation.
- State Recovery & Resiliency: Powered by asynchronous SQLite checkpointing. If an API rate limit is hit or a workflow is interrupted, you can resume the exact thread without losing a single token of progress.
- Telemetry & Cost Tracking: Automatically tallies prompt and completion tokens across the entire workflow. Calculates exact USD costs dynamically using LiteLLM's live pricing registry, printing a detailed receipt at the end of every run.
- Incremental Development: The System Architect breaks down requirements into a manageable backlog of strictly formatted JSON tasks.
- Self-Healing Code: The Developer, Reviewer, and QA Engineer agents continuously loop until unit tests pass and code meets specifications.
- Structured Outputs: Powered by Pydantic and LangChain, ensuring zero "Markdown spillage" and robust state management.
- Extensible: Easily add custom tools like HumanInTheLoop or WorkspaceSaver.
- Cost & Token Optimization Analyzer: Built-in telemetry tracks API costs down to the fraction of a cent and generates a diagnostic report at the end of every run, actively warning you if agents are stuck in loops or suffering from context bloat.
AI Agents
- Product Manager: Analyzes requirements, asks clarifying questions, and writes detailed Technical Specifications.
- System Architect: Breaks specifications down into a cohesive backlog of developer tasks.
- Senior Developer: Incrementally writes code and unit tests for the current task.
- Code Reviewer: Analyzes the generated code for security, style, and logic issues.
- QA Engineer: Evaluates code against task requirements using either LLM-based mental simulation or execution via a secure Docker sandbox.
- Final QA Engineer: Performs a full-repository integration test once all tasks are complete.
- Reporter: Generates a comprehensive final Markdown report for stakeholders.
Getting Started
Prerequisites
- Python 3.10+
- API keys set in your environment (e.g., OPENAI_API_KEY, GROQ_API_KEY), OR a local instance of Ollama running for free local models.
Optional Dependencies:
- Docker Engine required only if you intend to use the Sandboxed QA code execution features.
- Streamlit required only to launch the web dashboard. You can install it separately or run pip install my-dev-team[ui].
Installation
Installing into a virtual environment is highly recommended.
You can install the package directly via pip:
pip install my-dev-team
For local development, clone the repository and run pip install -e .
Usage Guide
Preparing Your Project File
The crew requires a text file outlining your project requirements. By default, it looks for a specific header format to extract the project name and thread ID.
Create a file named project.txt:
```text
Subject: NEW PROJECT: Web Scraper CLI

I need a Python command-line tool that scrapes articles from a given URL.
It should extract the title, author, and main body text, and save the output as a JSON file.

Requirements:
- Use BeautifulSoup4 for parsing.
- Include a `--url` argument and an `--output` argument.
- Write unit tests for the parsing logic.
```
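The header parsing is internal to the package, but as a rough illustration of the convention, a hypothetical extractor for the `Subject:` line might look like this (the function name and thread-ID format shown are assumptions based on the thread IDs seen elsewhere in this README, not the package's actual API):

```python
import re
from datetime import datetime

def parse_project_header(text: str) -> tuple[str, str]:
    """Extract the project name from a 'Subject: NEW PROJECT: ...' header
    and derive a thread ID of the form <slug>_<timestamp>."""
    match = re.search(r"^Subject:\s*NEW PROJECT:\s*(.+)$", text, re.MULTILINE)
    name = match.group(1).strip() if match else "untitled_project"
    # Slugify the name: lowercase, non-alphanumerics collapsed to underscores.
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    thread_id = f"{slug}_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
    return name, thread_id
```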
Command Line Interface
The fastest way to use the framework is via the terminal command included in the package.
devteam project.txt
Web Interface (Dashboard)
In addition to the terminal CLI, My Dev Team includes a fully interactive web dashboard powered by Streamlit. This is perfect for users who want visual control over the autonomous agents.
Make sure you have Streamlit installed (pip install streamlit), and simply run:
devteam-ui
Advanced CLI Options
You can easily switch between cloud providers and local models, and adjust rate limits based on your API tier:
```shell
# Run entirely locally for free using Ollama, with no rate limit!
devteam project.txt --provider ollama

# Run using OpenAI's flagship models, limited to 15 requests per minute
devteam project.txt --provider openai --rpm 15

# Resume an interrupted run exactly where it left off
devteam --resume web_scraper_cli_20260312_083500
```
Available Arguments:
- project_file: (Optional if resuming) Path to your project requirements text file.
- --provider: Choose the LLM backend. Options: groq, ollama (default), openai.
- --timeout: Maximum wait time for LLM responses; adjust for slower local models.
- --rpm: API requests per minute. Set to 0 to disable rate limiting (default: 0).
- --resume: Resume a specific thread ID (e.g., my_app_20260312_083500).
- --history: Print the timeline of checkpoints for the thread and exit.
- --checkpoint: Specific checkpoint ID to rewind to.
Note: Ensure you have the corresponding API keys (e.g., GROQ_API_KEY, OPENAI_API_KEY) set in your .env file, or ensure your local Ollama instance is running.
Dashboard Features
- Launch Projects: Upload your project requirements text file directly through your browser and select your LLM provider.
- Granular Timeline: View a deeply nested, chronological history of your AI crew's execution, cleanly displaying subgraph agent handoffs.
- Visual Time Travel: Easily resume paused workflows, or inject human-in-the-loop feedback by targeting specific graph checkpoints directly from the UI dropdowns.
Architecture
Multi-Agent Workflow
My Dev Team operates as a cyclic, self-healing state machine. Instead of a simple linear pipeline, agents pass context back and forth, iterating on code until it meets strict quality standards.
```mermaid
stateDiagram-v2
    pm : Product Manager
    human : Human in the Loop
    architect : System Architect
    officer : Project Officer
    dev : Senior Developer
    reviewer : Code Reviewer
    qa : QA Engineer
    final_qa : Final QA Engineer
    reporter : Reporter

    [*] --> pm
    pm --> human
    human --> pm
    pm --> architect
    architect --> officer
    officer --> dev
    dev --> reviewer
    reviewer --> dev
    reviewer --> qa
    qa --> dev
    qa --> officer
    officer --> final_qa
    final_qa --> reporter
    reporter --> [*]
```
How the routing works:
- Requirements Gathering: The Product Manager loops with a Human to refine requirements before development begins.
- Task Orchestration: The System Architect designs the system, and the Project Officer orchestrates the task backlog, routing individual tickets to the Senior Developer.
- The Refinement Loop: The Senior Developer, Code Reviewer, and QA Engineer agents operate in a strict self-healing loop. Code is repeatedly analyzed and tested; if bugs or style issues are found, the state routes directly back to the Senior Developer for revisions.
- Final Delivery: Once the Project Officer confirms all tasks are complete, the Final QA Engineer runs full-repository integration tests before the Reporter generates the final documentation.
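In LangGraph terms, the refinement loop boils down to conditional edges evaluated over the shared state. The sketch below shows what such routers could look like; the state fields (review_approved, tests_passed) are illustrative assumptions, not the package's actual state schema:

```python
from typing import TypedDict

class CrewState(TypedDict):
    review_approved: bool  # set by the Code Reviewer
    tests_passed: bool     # set by the QA Engineer

def route_after_review(state: CrewState) -> str:
    """Send code back to the developer until the reviewer approves it."""
    return "qa" if state["review_approved"] else "dev"

def route_after_qa(state: CrewState) -> str:
    """Failed tests loop back to the developer; passing work returns to
    the Project Officer, who picks the next task or closes the backlog."""
    return "officer" if state["tests_passed"] else "dev"
```

In the real graph these functions would be registered with add_conditional_edges on the reviewer and qa nodes.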
Intelligent Model Routing (LLM Factory)
My Dev Team doesn't just use one model for everything. It uses an advanced Semantic Routing architecture via LLMFactory.
Instead of hardcoding a specific model (like gpt-5.3-codex), each agent requests a specific capability category and temperature. The Factory evaluates your chosen --provider and dynamically spins up the most cost-effective, capable model for that exact task.
The Categories:
- reasoning: For the System Architect and Product Manager. Maps to deep-thinking models.
- code-generator: For the Senior Developer. Maps to strict, syntax-heavy models.
- code-analyzer: For the QA and Reviewer agents. Maps to deep-context evaluation models.
- fast-utility: For the Reporter. Maps to blazing-fast, ultra-cheap models for simple text summarization.
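Conceptually, the factory is a lookup from (provider, category) to a concrete model plus the agent's requested temperature. A toy stand-in for that routing logic (the class, method, and model names here are placeholders, not the real LLMFactory API or llms.yaml contents):

```python
# Hypothetical routing table; the real mapping lives in config/llms.yaml.
ROUTING = {
    "ollama": {
        "reasoning": "local-reasoning-model",
        "code-generator": "local-coding-model",
        "code-analyzer": "local-analysis-model",
        "fast-utility": "local-small-model",
    },
}

class SimpleLLMFactory:
    """Toy stand-in for the package's LLMFactory category routing."""
    def __init__(self, provider: str):
        self.provider = provider

    def resolve(self, category: str, temperature: float = 0.0) -> dict:
        # Agents ask for a capability category; the factory picks the model.
        return {"model": ROUTING[self.provider][category], "temperature": temperature}
```

The point of the indirection: agents never name models, so swapping providers (or upgrading a model) never touches agent code.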
Centralized Configuration
Code and configuration are strictly separated to make the framework maintainable and extensible.
- Model Routing (config/llms.yaml): All provider definitions (Groq, OpenAI, Ollama) and model routing logic (reasoning, coding, fast-utility) are centralized in a single YAML file, making it trivial to update models as new ones are released.
- Agent Prompts (config/agents/**): Every agent's persona, system instructions, and constraints are stored as clean Markdown files with YAML frontmatter. No massive, hardcoded prompt strings cluttering the Python logic!
- Sandbox Environments (config/sandbox.yaml): Docker base images and test execution commands for various runtimes (Python, Node.js) are completely decoupled. You can add support for entirely new programming languages simply by defining the image and test command in YAML, without touching the core Python engine.
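The README does not reproduce the file, but a config/sandbox.yaml in this spirit might look like the sketch below. The keys, image tags, and commands are illustrative assumptions, shown only to convey how a new runtime (here Go) could be added without Python changes:

```yaml
# Illustrative sketch only; consult the packaged config/sandbox.yaml for the real schema.
runtimes:
  python:
    image: python:3.12-slim
    test_command: pytest -q
  node:
    image: node:22-slim
    test_command: npm test
  go:                       # hypothetical new runtime, added purely in YAML
    image: golang:1.23
    test_command: go test ./...
```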
Sandboxed QA Execution
When the Docker sandbox is enabled, the QA Engineer agent does not rely on LLM "guesswork" or mental simulation to test code. It executes the generated code for real.
- Zero Hallucinations: The QA node mounts the active workspace into a temporary directory and runs the actual test suite (e.g., pytest, npm test). It reads the exact stdout/stderr tracebacks to accurately report bugs back to the Developer.
- Ephemeral Isolation: Code is executed securely using the Docker SDK. Containers are strictly isolated, resource-limited (CPU/RAM), and immediately destroyed after the test run, ensuring your host machine is never at risk.
- Universal Runtime Auto-Detection: The sandbox dynamically inspects the workspace or takes explicit direction from the System Architect to pull the correct Docker image (Python, Node.js, etc.) on the fly.
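The sandbox implementation itself is internal to the package, but the isolation guarantees above map directly onto docker-py's containers.run options. A minimal sketch, assuming a hypothetical helper that assembles those options (limits and paths are illustrative defaults, not the package's real values):

```python
def build_sandbox_run_kwargs(workspace: str, image: str, test_command: str) -> dict:
    """Assemble docker-py `containers.run` kwargs for an ephemeral,
    resource-limited test container, mirroring the isolation rules above."""
    return {
        "image": image,
        "command": test_command,
        "working_dir": "/workspace",
        "volumes": {workspace: {"bind": "/workspace", "mode": "rw"}},
        "network_disabled": True,    # no network access from the sandbox
        "mem_limit": "512m",         # cap RAM
        "nano_cpus": 1_000_000_000,  # cap at one CPU
        "remove": True,              # destroy the container after the run
        "detach": False,             # block and capture stdout/stderr
    }

# With docker-py this would be invoked roughly as:
#   client = docker.from_env()
#   logs = client.containers.run(**build_sandbox_run_kwargs(
#       "/tmp/ws", "python:3.12-slim", "pytest -q"))
```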
Telemetry & Optimization
Running multi-agent systems can get expensive quickly if models get stuck in loops or context windows grow out of control. My Dev Team includes a built-in TelemetryTracker that monitors every single LLM call.
At the end of every workflow, the framework prints a granular receipt and an optimization diagnostic report:
```text
========================================
📊 TELEMETRY & COST REPORT
========================================
Total API Requests: 12
Prompt Tokens:      45,200
Completion Tokens:  3,100
Total Tokens:       48,300
----------------------------------------
Estimated Cost:     $0.0145
========================================

========================================
🔍 TOKEN OPTIMIZATION DIAGNOSTICS
========================================
⚠️ Thrashing Detected: `qa` was called 8 times. The agent might be stuck in a failure loop.
📈 Context Bloat: `reviewer` input grew by 3.2x (Started: 1200, Ended: 3840).
========================================
```
This allows you to easily identify architectural token leaks, pinpoint which specific agent is struggling, and adjust your llms.yaml or prompt templates accordingly!
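Both diagnostics reduce to simple per-agent bookkeeping over the recorded LLM calls. A minimal re-implementation of the two checks (the thresholds and call-record shape are assumptions, not the TelemetryTracker's actual internals):

```python
def diagnose(calls: list[dict], max_calls: int = 5, bloat_factor: float = 2.0) -> list[str]:
    """Flag thrashing (one agent called too often) and context bloat
    (an agent's prompt size growing sharply across the run).
    Each call record: {"agent": str, "prompt_tokens": int}."""
    by_agent: dict[str, list[int]] = {}
    for call in calls:
        by_agent.setdefault(call["agent"], []).append(call["prompt_tokens"])

    warnings = []
    for agent, sizes in by_agent.items():
        if len(sizes) > max_calls:
            warnings.append(
                f"⚠️ Thrashing Detected: `{agent}` was called {len(sizes)} times."
            )
        if sizes[0] and sizes[-1] >= sizes[0] * bloat_factor:
            warnings.append(
                f"📈 Context Bloat: `{agent}` input grew by "
                f"{sizes[-1] / sizes[0]:.1f}x (Started: {sizes[0]}, Ended: {sizes[-1]})."
            )
    return warnings
```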
Usage (Python API)
If you want to integrate the crew into your own application, customize the LLM Factory's routing table, or override specific agent behaviors, use the clean Python API:
```python
import asyncio
import aiosqlite
from pathlib import Path

from dotenv import load_dotenv
from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver

from devteam import VirtualCrew, ProjectManager, LLMFactory
from devteam.agents import (
    ProductManager, SystemArchitect, SeniorDeveloper, CodeReviewer,
    QAEngineer, FinalQAEngineer, Reporter,
)
from devteam.extensions import HumanInTheLoop, WorkspaceSaver
from devteam.tools import DockerSandbox
from devteam.utils import RateLimiter, TelemetryTracker

load_dotenv()


def my_agents() -> dict:
    """Initialize agents using built-in prompt templates."""
    return {
        'pm': ProductManager.from_config('pm', 'product-manager.md'),
        'architect': SystemArchitect.from_config('architect', 'system-architect.md'),
        'developer': SeniorDeveloper.from_config('developer', 'senior-developer.md'),
        'reviewer': CodeReviewer.from_config('reviewer', 'code-reviewer.md'),
        'qa': QAEngineer.from_config('qa', 'qa-engineer-sandbox.md').with_sandbox(DockerSandbox()),
        'final_qa': FinalQAEngineer.from_config('final_qa', 'final-qa-engineer.md'),
        # Example: forcing the reporter to use a more creative reasoning model
        'reporter': Reporter.from_config('reporter', 'reporter.md',
                                         model_category='reasoning', temperature=0.7),
    }


def my_extensions(project_folder: Path) -> list:
    """Add extensions like saving files to disk or requiring human approval."""
    return [
        WorkspaceSaver(workspace_dir=project_folder),
        HumanInTheLoop(),
    ]


def build_crew(project_folder: Path, llm_factory: LLMFactory,
               checkpointer: AsyncSqliteSaver, rpm: int = 0) -> VirtualCrew:
    return VirtualCrew(
        manager=ProjectManager(),
        agents=my_agents(),
        llm_factory=llm_factory,
        extensions=my_extensions(project_folder),
        checkpointer=checkpointer,
        rate_limiter=RateLimiter(requests_per_minute=rpm) if rpm > 0 else None,
    )


async def main():
    requirements = "Build a simple Python calculator CLI with basic arithmetic."
    thread_id = 'calc_run_01'
    workspace = Path('./workspaces/calculator_app')
    workspace.mkdir(parents=True, exist_ok=True)
    db_path = workspace / 'state.db'

    telemetry = TelemetryTracker()
    llm_factory = LLMFactory(provider='groq', callbacks=[telemetry])

    try:
        async with aiosqlite.connect(db_path) as conn:
            checkpointer = AsyncSqliteSaver(conn)
            crew = build_crew(workspace, llm_factory, checkpointer, rpm=30)

            print("🚀 Starting the AI Dev Team...")
            final_state = await crew.execute(
                thread_id=thread_id,
                requirements=requirements,
            )

            if final_state.abort_requested:
                print("❌ Workflow aborted by user or validation failure.")
            elif final_state.success:
                print("🎉 Project completed successfully!")
                print(f"Total Revisions: {final_state.total_revisions}")
                if final_state.final_report:
                    print(final_state.final_report)
            else:
                print("🚨 Release failed: Integration bugs found!")
                for bug in final_state.integration_bugs:
                    print(f"  - {bug}")
    except KeyboardInterrupt:
        print("\n\n🛑 Workflow interrupted by user (Ctrl+C).")
        print("💡 You can resume this exact state later by running:")
        print(f"   devteam --resume {thread_id}")
    finally:
        telemetry.print_receipt()


if __name__ == "__main__":
    asyncio.run(main())
```
Customizing the Crew
My Dev Team is completely configuration-driven. You don't need to write Python code to change how the agents are built or which prompts they use.
The entire agent roster is defined in config/crew.yaml. The framework uses dynamic module reflection to read this file and build the LangGraph workflow on the fly.
Example config/small_crew.yaml:
```yaml
agents:
  pm:
    class: ProductManager
    config: product-manager.md
  developer:
    class: SeniorDeveloper
    config: senior-developer.md
```
You can rewrite the previous example as follows:
```python
from devteam.utils import build_agents_from_config


def my_extensions(project_folder: Path) -> list:
    return [
        WorkspaceSaver(workspace_dir=project_folder),
        HumanInTheLoop(),
    ]


def build_crew(project_folder: Path, llm_factory: LLMFactory,
               checkpointer: AsyncSqliteSaver, rpm: int = 0) -> VirtualCrew:
    """Instantiate the agents from YAML and return the crew instance."""
    return VirtualCrew(
        manager=ProjectManager(),
        agents=build_agents_from_config('small_crew.yaml'),
        extensions=my_extensions(project_folder),
        llm_factory=llm_factory,
        checkpointer=checkpointer,
        rate_limiter=RateLimiter(requests_per_minute=rpm) if rpm > 0 else None,
    )
```
Contributing
Pull requests are welcome. For major changes, please open an issue first...
License
Distributed under the Apache 2.0 license. See LICENSE for more information.