Autonomous agent loop that turns prompts into shipped PRs.
Project description
ColonyOS
The fully autonomous AI pipeline that builds itself.
Installation · Quickstart · How It Works · CLI · Config · Slack · Dashboard · Architecture
ColonyOS is an autonomous software engineering pipeline. Give it a feature description — or let its built-in CEO agent decide what to build — and it writes a PRD, implements the code with tests, runs parallel multi-persona code reviews, fixes issues, and opens a pull request. No human in the loop.
Under the hood it orchestrates Claude agent sessions via the Claude Agent SDK with full codebase awareness. Point it at any repo and let it work.
ColonyOS builds itself. Every feature, fix, and review in this repo was proposed, implemented, and shipped by ColonyOS agents running on their own codebase.
Installation
Prerequisites
| Dependency | Why | Check |
|---|---|---|
| Python 3.11+ | Runtime | python3 --version |
| Claude Code CLI | Agent execution engine | claude --version |
| Git | Branch/commit operations | git --version |
| GitHub CLI | PR creation, issue fetching | gh auth status |
Don't have Claude Code CLI yet? See Setting up Claude Code below.
Install ColonyOS
Pick one:
# Recommended — isolated install via pipx
pipx install colonyos
# Alternative — standard pip
pip install colonyos
# One-liner for CI or fresh machines
curl -sSL https://raw.githubusercontent.com/rangelak/ColonyOS/main/install.sh | sh
# Homebrew (macOS / Linux)
brew tap colonyos/tap && brew install colonyos
Optional extras
# Slack integration (Socket Mode listener)
pip install "colonyos[slack]"
# Web dashboard (local FastAPI + React UI)
pip install "colonyos[ui]"
# Development (tests, pre-commit hooks, dashboard)
pip install "colonyos[dev]"
Verify your environment
colonyos doctor
This checks Python version, Claude Code CLI authentication, Git, GitHub CLI, and optional dependencies. Fix anything it flags before continuing.
Quickstart
1. Initialize a project
cd your-project/
colonyos init
The interactive wizard asks for your project name, description, tech stack, and lets you pick a persona pack (e.g. "Startup Team", "Enterprise Backend", "Frontend/Design") or define custom expert personas.
For zero-prompt setup:
colonyos init --quick --name "MyApp" --description "B2B analytics" --stack "Python/FastAPI"
2. Run the pipeline
Directed mode — you choose what to build:
colonyos run "Add a health check endpoint"
From a GitHub issue:
colonyos run --issue 42
Autonomous mode — the CEO agent decides what to build:
colonyos auto
Long-running autonomous loop — walk away for the day:
colonyos auto --loop 50 --max-hours 24 --max-budget 500 --no-confirm
That's it. ColonyOS generates a PRD, implements the code with tests, runs multi-persona code review, and opens a pull request.
How It Works
ColonyOS runs a multi-phase pipeline where each phase is an isolated Claude agent session with its own instruction template and budget cap.
flowchart LR
CEO["🧠 CEO\nPropose"] --> Plan["📋 Plan\nPRD + Tasks"]
Plan --> Impl["⚡ Implement\nCode + Tests"]
Impl --> Verify["✅ Verify\nRun Tests"]
Verify --> Review["🔍 Review / Fix\nLoop"]
Review --> Decision{"🎯 Decision\nGate"}
Decision -->|GO| Deliver["🚀 Deliver\nPush + PR"]
Decision -->|NO-GO| Stop["⛔ Stopped"]
| Phase | What happens |
|---|---|
| CEO (auto mode only) | Analyzes the project, its history (reads CHANGELOG.md), and strategic direction. Proposes the single highest-impact feature to build next. |
| Plan | Explores the codebase, generates a PRD with clarifying Q&A from your defined personas (running as parallel subagents), and produces a task breakdown. |
| Implement | Creates a feature branch, writes tests first, then implements each task. Commits incrementally. |
| Verify (optional) | Runs your test command (e.g. pytest, npm test). Failed tests trigger implement retries with failure context before the expensive review phase. |
| Review / Fix Loop | Reviewer personas run independent, parallel, read-only reviews. If any request changes, a Staff+ fix agent addresses findings, then reviewers re-run. Repeats up to max_fix_iterations. |
| Decision Gate | Reads all review artifacts and makes a GO / NO-GO verdict. NO-GO halts the pipeline. |
| Deliver | Pushes the branch, opens a PR linking back to the PRD, and updates CHANGELOG.md. |
| Learn | Extracts patterns from review artifacts and persists them to .colonyos/learnings.md for future runs. |
Review / Fix loop detail
flowchart TD
Start["Start Review Round"] --> Parallel["Parallel Persona Reviews\n(read-only, concurrent)"]
Parallel --> Check{"All\napprove?"}
Check -->|Yes| Gate["Decision Gate"]
Check -->|No| Fix["Fix Agent\nAddress Findings"]
Fix --> Budget{"Budget &\niterations\nremaining?"}
Budget -->|Yes| Parallel
Budget -->|No| Gate
CLI Reference
Getting started
| Command | Description |
|---|---|
colonyos doctor |
Check all prerequisites and environment health |
colonyos init |
Interactive project setup + persona workshop |
colonyos init --quick |
Zero-prompt setup with defaults |
colonyos init --personas |
Re-run just the persona workshop |
colonyos status |
Show recent runs, loop summaries, and cost breakdown |
Running the pipeline
| Command | Description |
|---|---|
colonyos run "prompt" |
Directed mode — plan, implement, review, deliver |
colonyos run "prompt" --plan-only |
Stop after PRD + tasks (no code) |
colonyos run --from-prd PATH |
Skip planning, implement an existing PRD |
colonyos run --issue NUMBER |
Use a GitHub issue as the feature prompt |
colonyos run --resume RUN_ID |
Resume a failed run from the last successful phase |
colonyos run --offline |
Skip remote git checks (preflight) |
colonyos run --force |
Bypass preflight warnings |
Autonomous mode
| Command | Description |
|---|---|
colonyos auto |
CEO proposes a feature and the pipeline builds it |
colonyos auto --loop N |
Run N autonomous cycles back-to-back |
colonyos auto --max-hours H |
Stop loop after H wall-clock hours |
colonyos auto --max-budget USD |
Stop loop after USD aggregate spend |
colonyos auto --no-confirm |
Skip human approval of CEO proposals |
colonyos auto --propose-only |
CEO proposes but does not execute |
colonyos auto --resume-loop |
Resume the most recent interrupted loop |
Code review
| Command | Description |
|---|---|
colonyos review BRANCH |
Standalone multi-persona review on any branch |
colonyos review --base BRANCH |
Base branch to diff against (default: main) |
colonyos review --no-fix |
Review only, skip the fix loop |
colonyos review --decide |
Run the decision gate after reviews |
Execution queue
| Command | Description |
|---|---|
colonyos queue add "p1" "p2" --issue 42 |
Enqueue prompts and/or GitHub issues |
colonyos queue start |
Process pending items sequentially |
colonyos queue start --max-cost N |
Aggregate USD cap for the queue |
colonyos queue start --max-hours N |
Wall-clock cap for the queue |
colonyos queue status |
Show queue state |
colonyos queue clear |
Remove all pending items |
colonyos queue unpause |
Unpause the queue after a circuit breaker trip |
CI fix
| Command | Description |
|---|---|
colonyos ci-fix PR_NUMBER |
Fetch CI failure logs and auto-fix the code |
colonyos ci-fix PR --wait |
Fix, then wait for CI re-run to pass |
colonyos ci-fix PR --max-retries N |
Retry the fix-push-wait cycle up to N times |
Analytics & inspection
| Command | Description |
|---|---|
colonyos stats |
Aggregate analytics dashboard (cost, duration, failures) |
colonyos stats --last N |
Limit to the N most recent runs |
colonyos stats --phase NAME |
Drill into a specific phase |
colonyos show RUN_ID |
Detailed single-run inspection |
colonyos show RUN_ID --json |
Machine-readable JSON output |
colonyos show RUN_ID --phase NAME |
Inspect a specific phase within a run |
Codebase maintenance
| Command | Description |
|---|---|
colonyos cleanup branches |
List merged colonyos/ branches (dry-run) |
colonyos cleanup branches --execute |
Delete merged branches |
colonyos cleanup branches --include-remote |
Also prune from origin |
colonyos cleanup artifacts |
List stale run artifacts beyond retention |
colonyos cleanup artifacts --execute |
Delete stale artifacts |
colonyos cleanup artifacts --retention-days N |
Override retention period (default: 30) |
colonyos cleanup scan |
Find large files, long functions, dead code |
colonyos cleanup scan --ai |
AI-powered qualitative analysis |
Slack & Dashboard
| Command | Description |
|---|---|
colonyos watch |
Watch Slack channels and trigger runs from messages |
colonyos watch --dry-run |
Log triggers without executing |
colonyos watch --max-hours N |
Wall-clock limit for the watcher |
colonyos watch --max-budget N |
Aggregate USD spend limit |
colonyos ui |
Launch the local web dashboard |
colonyos ui --port N |
Custom port (default: 7400) |
colonyos ui --no-open |
Don't auto-open the browser |
Global flags
| Flag | Applies to | Description |
|---|---|---|
-v / --verbose |
run, auto |
Stream agent text alongside tool activity |
-q / --quiet |
run, auto |
Suppress the streaming UI |
Configuration Reference
Configuration lives at .colonyos/config.yaml, created by colonyos init.
Project & personas
project:
name: "MyApp"
description: "B2B analytics platform"
stack: "Python/FastAPI, React, PostgreSQL"
personas:
- role: "Senior Backend Engineer"
expertise: "API design, database modeling, performance"
perspective: "Thinks about scalability and data integrity"
reviewer: true # participates in code reviews (default: false)
- role: "Product Lead"
expertise: "User research, prioritization"
perspective: "Thinks about user value and shipping incrementally"
# reviewer defaults to false — plan-phase only
Personas are the expert panel that reviews your code and asks clarifying questions during planning. Mark reviewer: true on personas you want in the review loop.
Model selection
model: opus # global default: opus | sonnet | haiku
# Per-phase overrides — route mechanical phases to cheaper models
phase_models:
plan: sonnet
implement: opus
review: opus
fix: sonnet
deliver: haiku
ceo: opus
verify: haiku
learn: haiku
Using per-phase overrides can reduce costs by 50–70% while keeping opus for deep reasoning tasks.
Budget & safety caps
budget:
per_phase: 5.00 # USD per Claude agent session
per_run: 15.00 # USD total cap for a full pipeline run
max_duration_hours: 8.0 # wall-clock cap for autonomous loops
max_total_usd: 500.0 # aggregate spend cap for autonomous loops
Pipeline phases
phases:
plan: true
implement: true
review: true # parallel per-persona reviews + fix loop
deliver: true # set false to skip PR creation
max_fix_iterations: 2 # review/fix cycles before decision gate
auto_approve: true # skip human confirmation in autonomous mode
Verification gate
verification:
verify_command: "pytest --tb=short -q" # auto-detected by `colonyos init`
max_verify_retries: 2
verify_timeout: 120 # seconds
Runs your test suite between implement and review. Failed tests trigger implement retries with failure context — catching bugs before the expensive review phase.
CI fix
ci_fix:
enabled: false # set true to auto-fix CI failures in deliver phase
max_retries: 2 # retry fix-push-wait cycle
wait_timeout: 600 # seconds to wait for CI re-run
log_char_cap: 12000 # truncate CI logs sent to the fix agent
Cross-run learnings
learnings:
enabled: true
max_entries: 100
After each run, the pipeline extracts patterns from review artifacts and persists them to .colonyos/learnings.md. Future implement and fix phases see these learnings as context, enabling the pipeline to self-improve across iterations.
Codebase cleanup
cleanup:
branch_retention_days: 0 # 0 = prune all merged colonyos/ branches
artifact_retention_days: 30 # run logs older than this are stale
scan_max_lines: 500 # flag files exceeding this line count
scan_max_functions: 20 # flag files with more functions than this
Branch & directory naming
branch_prefix: "colonyos/"
prds_dir: "cOS_prds"
tasks_dir: "cOS_tasks"
reviews_dir: "cOS_reviews"
proposals_dir: "cOS_proposals"
Slack Integration
ColonyOS can listen to Slack channels and trigger pipeline runs from messages — your team can request features or report bugs directly in Slack without context-switching.
1. Create a Slack app
The fastest way is with the included manifest — it pre-configures all scopes, events, and Socket Mode in one step:
- Go to api.slack.com/apps and click Create New App → From an app manifest.
- Select your workspace.
- Choose YAML and paste the contents of
slack-app-manifest.yamlfrom this repo. - Click Create.
Manual setup (without manifest)
- Go to api.slack.com/apps → Create New App → From scratch.
- OAuth & Permissions — add these Bot Token Scopes:
app_mentions:read— detect@ColonyOSmentionschannels:history— read channel messageschat:write— post progress updatesreactions:read— detect emoji triggers (if using reaction mode)reactions:write— acknowledge messages with reactions
- Socket Mode — enable Socket Mode and generate an App-Level Token with the
connections:writescope. - Event Subscriptions — subscribe to bot events:
app_mention,message.channels,reaction_added.
2. Install the app and grab tokens
- In your Slack app settings, go to Settings → Socket Mode → generate an App-Level Token with
connections:writescope. Copy thexapp-...token. - Go to Install App → Install to Workspace and authorize. Copy the Bot User OAuth Token (
xoxb-...).
3. Set environment variables
export COLONYOS_SLACK_BOT_TOKEN="xoxb-your-bot-token"
export COLONYOS_SLACK_APP_TOKEN="xapp-your-app-level-token"
4. Configure ColonyOS
Add a slack section to .colonyos/config.yaml:
slack:
enabled: true
channels:
- bugs # channel names (with or without #) or IDs
- feature-requests
trigger_mode: mention # mention | reaction | slash_command
auto_approve: false # require human approval before executing
max_runs_per_hour: 3 # rate limit
allowed_user_ids: [] # empty = allow all workspace members
daily_budget_usd: 50.0 # optional daily spend cap
max_queue_depth: 20 # max pending items from Slack
max_consecutive_failures: 3 # pause watcher after N consecutive failures
5. Start the watcher
# Install the Slack extra if you haven't
pip install "colonyos[slack]"
# Start watching
colonyos watch
The watcher runs as a long-lived process using Slack Bolt SDK with Socket Mode (no public URL required). When someone mentions @ColonyOS fix the login bug in a configured channel, the watcher sanitizes the input, triggers a pipeline run, and posts threaded progress updates back to the Slack thread.
Trigger modes:
| Mode | How it works |
|---|---|
mention |
@ColonyOS <prompt> in any configured channel |
reaction |
Add a specific emoji reaction to any message |
slash_command |
/colonyos <prompt> (requires slash command setup in Slack app config) |
Web Dashboard
ColonyOS includes a local web dashboard for monitoring runs, viewing costs, and managing configuration.
pip install "colonyos[ui]"
colonyos ui
This starts a FastAPI server at http://localhost:7400 serving a React + Tailwind SPA. The dashboard provides:
- Run history with phase-by-phase timelines
- Cost trend charts across runs
- Inline configuration editing (config, personas)
- Run launching from the browser
- Artifact previews (PRDs, reviews, proposals)
The dashboard is localhost-only and supports bearer token authentication for write operations.
Output Structure
ColonyOS creates cOS_-prefixed directories in your repo as a timestamped audit trail:
your-repo/
├── CHANGELOG.md # auto-updated by deliver phase
├── cOS_prds/
│ └── 20260317_172530_prd_stripe_billing.md # generated PRDs
├── cOS_tasks/
│ └── 20260317_172530_tasks_stripe_billing.md # task breakdowns
├── cOS_reviews/
│ ├── decisions/
│ │ └── 20260317_decision_stripe_billing.md # GO/NO-GO verdicts
│ └── reviews/
│ ├── linus_torvalds/
│ │ └── 20260317_review_stripe_billing.md
│ └── staff_security_engineer/
│ └── 20260317_review_stripe_billing.md
├── cOS_proposals/
│ └── 20260317_proposal_ceo_proposal.md # CEO proposals
└── .colonyos/
├── config.yaml # project configuration
├── learnings.md # cross-run learnings
└── runs/ # run logs (gitignored)
└── run-20260317-abc123.json
The CEO reads CHANGELOG.md before proposing new features to avoid duplicating past work. Run logs (costs, durations, session IDs) and loop state go to .colonyos/runs/, which is gitignored.
Architecture
src/colonyos/
├── cli.py # Click CLI — all commands, REPL mode, welcome banner
├── init.py # Interactive setup wizard, persona packs, --quick mode
├── orchestrator.py # Phase chaining: CEO → Plan → Implement → Verify → Review → Deliver → Learn
├── agent.py # Claude Agent SDK wrapper, parallel execution support
├── config.py # .colonyos/config.yaml loader + validation
├── models.py # Persona, PhaseResult, RunLog, LoopState, QueueItem
├── naming.py # Deterministic timestamped filenames, slug generation
├── persona_packs.py # Prebuilt persona packs (startup, backend, fullstack, opensource)
├── ui.py # Rich streaming terminal UI, phase progress display
├── stats.py # Aggregate analytics computation + Rich rendering
├── show.py # Single-run inspector (data layer + render layer)
├── doctor.py # Prerequisite validation (Python, Claude, Git, GitHub CLI)
├── github.py # GitHub issue fetching, PR helpers
├── ci.py # CI failure detection, log retrieval, fix agent
├── cleanup.py # Branch pruning, artifact cleanup, structural scan
├── slack.py # Slack Bolt listener, dedup, threaded replies
├── server.py # FastAPI API for the web dashboard
├── learnings.py # Cross-run learning extraction + persistence
├── sanitize.py # Input sanitization for untrusted content (Slack, issues)
├── instructions/ # Markdown templates passed as system prompts
│ ├── ceo.md # Autonomous feature proposal
│ ├── plan.md # PRD + task generation with persona Q&A
│ ├── implement.md # Test-first implementation
│ ├── review.md # Per-persona structured review with VERDICT output
│ ├── fix.md # Staff+ engineer fix agent
│ ├── decision.md # GO/NO-GO decision gate
│ ├── deliver.md # Branch push + PR creation
│ ├── verify_fix.md # Test failure fix instructions
│ ├── ci_fix.md # CI failure fix instructions
│ ├── learn.md # Learning extraction
│ ├── cleanup_scan.md # Structural analysis
│ ├── review_standalone.md
│ ├── fix_standalone.md
│ └── decision_standalone.md
└── web_dist/ # Pre-built Vite SPA (React + Tailwind)
Instruction templates are shipped with the package. Override any of them by placing a file with the same name in .colonyos/instructions/ in your repo.
Security Model
ColonyOS runs Claude Code sessions with bypassPermissions enabled — the agent has full read/write/execute access within your repository. This is by design: the agent needs to create branches, write code, run tests, and push to GitHub.
What this means for you:
- Only run ColonyOS on repos where you trust the agent to modify files.
- Use budget caps (
per_run,max_total_usd) to limit blast radius. - Review generated PRs before merging, just as you would for any contributor.
- Slack integration sanitizes all incoming messages to mitigate prompt injection, but treat it as defense-in-depth, not a guarantee.
- Long-running loops with
auto_approve: trueamplify the scope of autonomous action — set conservative budget and time caps.
Development
git clone https://github.com/rangelak/ColonyOS.git
cd ColonyOS
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest
The [dev] extra installs pytest, pre-commit, and the dashboard dependencies. A pre-commit hook runs pytest before every commit to prevent regressions.
Releasing
ColonyOS uses tag-based automated releases. The version is derived from git tags via setuptools-scm — there is no hardcoded version string.
git tag v0.2.0
git push origin v0.2.0
# CI automatically: runs tests → builds → publishes to PyPI → creates GitHub Release
Setting up Claude Code
ColonyOS requires the Claude Code CLI as its execution engine. If you don't have it yet:
1. Install Node.js (required by Claude Code):
# macOS
brew install node
# Linux (Debian/Ubuntu)
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
2. Install Claude Code CLI:
npm install -g @anthropic-ai/claude-code
3. Authenticate:
claude
# Follow the prompts to connect your Anthropic account
4. Install GitHub CLI (needed for PR creation and issue fetching):
# macOS
brew install gh
# Linux (Debian/Ubuntu)
sudo apt install gh
5. Authenticate GitHub CLI:
gh auth login
6. Verify everything:
claude --version
gh auth status
colonyos doctor
License
MIT — Rangel Milushev
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file colonyos-0.2.1.tar.gz.
File metadata
- Download URL: colonyos-0.2.1.tar.gz
- Upload date:
- Size: 1.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4ff5fc7433e8b1be746534cf79af02957d8bd4d8189d65372afc80bb8f56a93a
|
|
| MD5 |
910776885384b81b6a2b3dd6fb272d3d
|
|
| BLAKE2b-256 |
befc49d15ef3d82a9f6eedaaf9c4711afd64b3070ab564b19ba272747403e772
|
Provenance
The following attestation bundles were made for colonyos-0.2.1.tar.gz:
Publisher:
release.yml on rangelak/ColonyOS
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
colonyos-0.2.1.tar.gz -
Subject digest:
4ff5fc7433e8b1be746534cf79af02957d8bd4d8189d65372afc80bb8f56a93a - Sigstore transparency entry: 1137747135
- Sigstore integration time:
-
Permalink:
rangelak/ColonyOS@9b3de5c4d747cdc2e2a44ebd56c1fbca27279185 -
Branch / Tag:
refs/tags/v0.2.1 - Owner: https://github.com/rangelak
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@9b3de5c4d747cdc2e2a44ebd56c1fbca27279185 -
Trigger Event:
push
-
Statement type:
File details
Details for the file colonyos-0.2.1-py3-none-any.whl.
File metadata
- Download URL: colonyos-0.2.1-py3-none-any.whl
- Upload date:
- Size: 141.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
77740a238df1b39c4d7d5e9b5756617dd375092daca1cac65adce0df1cdb5848
|
|
| MD5 |
3c786296eb71737d1a2fab70fa4b8650
|
|
| BLAKE2b-256 |
33022b0ba24a92b22a4c467183491b3d2ab5f30b8e4231ff75e871a699f22ed0
|
Provenance
The following attestation bundles were made for colonyos-0.2.1-py3-none-any.whl:
Publisher:
release.yml on rangelak/ColonyOS
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
colonyos-0.2.1-py3-none-any.whl -
Subject digest:
77740a238df1b39c4d7d5e9b5756617dd375092daca1cac65adce0df1cdb5848 - Sigstore transparency entry: 1137747189
- Sigstore integration time:
-
Permalink:
rangelak/ColonyOS@9b3de5c4d747cdc2e2a44ebd56c1fbca27279185 -
Branch / Tag:
refs/tags/v0.2.1 - Owner: https://github.com/rangelak
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@9b3de5c4d747cdc2e2a44ebd56c1fbca27279185 -
Trigger Event:
push
-
Statement type: