Multi-agent orchestration for CLI coding agents
Multi-agent orchestration for agentic engineering.
One command. Multiple coding agents. Your codebase moves forward while you sleep.
pipx install bernstein # or: uv tool install bernstein
bernstein -g "Add JWT auth with refresh tokens, tests, and API docs"
Bernstein takes a goal, breaks it into tasks, assigns them to AI coding agents running in parallel, verifies the output, and commits the results. You come back to working code, passing tests, and a clean git history.
No framework to learn. No vendor lock-in. If you have Claude Code, Codex CLI, Gemini CLI, or Qwen installed, Bernstein uses them. Mix models in the same run — cheap free-tier agents for boilerplate, heavy models for architecture. Switch providers without rewriting anything. Agents spawn, work, exit. No context drift. No babysitting.
The orchestrator is deterministic Python -- zero LLM tokens on coordination. A janitor verifies every result: tests pass, files exist, no regressions.
Agentic engineering is the practice of orchestrating AI agents to write code while humans own architecture and quality (Karpathy, 2026). Most tools make you a conductor -- one agent, synchronous, pair-programming. Bernstein makes you an orchestrator -- multiple agents, parallel, asynchronous (Osmani). No vibe coding. Deterministic scheduling, verified output, portable across providers.
> [!TIP]
> Run bernstein --headless for CI pipelines and overnight runs. Add --evolve for continuous self-improvement.
Quick start
# 1. Install
pipx install bernstein # or: uv tool install bernstein
# 2. Init (auto-detects project type, creates bernstein.yaml)
cd your-project
bernstein init
# 3. Run -- set a goal inline or edit bernstein.yaml first
bernstein -g "Add rate limiting and improve test coverage"
bernstein # reads from bernstein.yaml
See examples/quickstart/ for a ready-to-run example with a Flask app and pre-configured bernstein.yaml.
The Bernstein Way — architecture tenets and default workflow
All CLI commands
bernstein stop # graceful shutdown
bernstein ps # show running agent processes
bernstein cancel <task_id> # cancel a task
bernstein cost # show cost summary
bernstein live # open live TUI dashboard
bernstein dashboard # open web dashboard in browser
bernstein doctor # health check: deps, keys, ports
bernstein plugins # list active plugins
bernstein trace <task_id> # step-by-step agent decision trace
bernstein replay <trace_id> # re-run a task from its trace
bernstein init # initialize project
bernstein evolve review # list evolution proposals
bernstein evolve approve <id> # approve a proposal
bernstein benchmark run # run golden benchmark suite
bernstein agents sync # pull latest agent catalog
bernstein agents list # list available agents
bernstein agents validate # check catalog health
bernstein workspace # show multi-repo workspace status
bernstein plan # show task backlog
bernstein logs # tail agent log output
bernstein demo # zero-to-running demo
bernstein ideate # run creative evolution pipeline
bernstein retro # generate retrospective report
Agent catalogs
Hire specialist agents from Agency (100+ agents, default) or plug in your own:
# bernstein.yaml
catalogs:
- name: agency
type: agency
source: https://github.com/msitarzewski/agency-agents
priority: 100
- name: my-team
type: generic
path: ./our-agents/
priority: 50
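When the same role exists in several catalogs, the higher priority value wins — the config above puts agency (priority 100) ahead of my-team (priority 50). A minimal sketch of that lookup, assuming illustrative data structures (these names are not Bernstein's internals):

```python
def resolve_agent(role, catalogs):
    """Return the first agent matching `role`, searching catalogs by descending priority."""
    for catalog in sorted(catalogs, key=lambda c: c["priority"], reverse=True):
        agent = catalog["agents"].get(role)
        if agent is not None:
            return agent
    return None  # caller falls back to built-in roles

# Mirrors the bernstein.yaml example above: agency (priority 100) shadows my-team (50).
catalogs = [
    {"name": "my-team", "priority": 50, "agents": {"backend": "our-backend-agent"}},
    {"name": "agency", "priority": 100, "agents": {"backend": "agency-backend", "qa": "agency-qa"}},
]
```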
Self-evolution
Leave it running. It gets better.
bernstein --evolve --max-cycles 10 --budget 5.00
Analyzes metrics, proposes changes to prompts and routing rules, sandboxes them, and auto-applies what passes. Critical files are SHA-locked. Circuit breaker halts on test regression. Risk-stratified: L0 auto-apply, L1 sandbox-first, L2 human review, L3 blocked.
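The risk gate described above (L0 auto-apply, L1 sandbox-first, L2 human review, L3 blocked) could be modeled as a simple lookup — a hypothetical sketch, not Bernstein's actual API:

```python
# Maps evolution-proposal risk levels to the actions listed in the README.
ACTIONS = {0: "auto-apply", 1: "sandbox", 2: "human-review", 3: "blocked"}

def gate(proposal_risk: int) -> str:
    """Return the action for a proposal's risk level; reject unknown levels."""
    if proposal_risk not in ACTIONS:
        raise ValueError(f"unknown risk level: {proposal_risk}")
    return ACTIONS[proposal_risk]
```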
Supported agents
| Agent | Provider | Models (Mar 2026) | CLI flag | Install |
|---|---|---|---|---|
| Aider | OpenAI / Anthropic / any | any | --cli aider | pip install aider-chat |
| Amp | Sourcegraph | opus 4.6, gpt-5.4 | --cli amp | brew install amp |
| Claude Code | Anthropic | opus 4.6, sonnet 4.6, haiku 4.5 | --cli claude | npm install -g @anthropic-ai/claude-code |
| Codex CLI | OpenAI | gpt-5.4, o3, o4-mini | --cli codex | npm install -g @openai/codex |
| Gemini CLI | Google | gemini-3-pro, 3-flash | --cli gemini | npm install -g @google/gemini-cli |
| Qwen | Alibaba / OpenRouter | qwen3-coder, qwen-max | --cli qwen | npm install -g qwen-code |
| Any CLI agent | Yours | pass-through | --cli generic | Provide --cli-command and --prompt-flag |
Mix and match in a single run — the orchestrator doesn't care which agent handles which task:
# Claude on architecture, Codex on tests, Gemini on docs
bernstein -g "Refactor auth module, add tests, update API docs" \
--cli auto # default: auto-detects installed agents
# override per task via bernstein.yaml roles config
Why this matters: every other agentic coding framework (OpenAI Agents SDK, Google ADK, Anthropic Agent SDK) ties your orchestration to one provider. Bernstein doesn't. Your prompts, task graphs, and agent roles are portable. Swap providers without touching your workflow.
See docs/adapters.html for a feature matrix and the "bring your own agent" guide.
Specialist roles
manager backend frontend qa security architect devops reviewer docs ml-engineer prompt-engineer retrieval vp analyst resolver visionary
Tasks default to backend if no role is specified. The orchestrator checks agent catalogs for a specialized match before falling back to built-in roles.
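That resolution order — catalog specialist first, then built-in role, defaulting to backend — could be sketched like this (illustrative only; the function name and return shape are assumptions):

```python
# Subset of the built-in roles listed above, for illustration.
BUILTIN_ROLES = {"manager", "backend", "frontend", "qa", "security",
                 "architect", "devops", "reviewer", "docs"}

def resolve_role(requested, catalog_roles):
    """Pick (source, role): catalog specialist wins, then built-in, else backend."""
    role = requested or "backend"      # tasks with no role default to backend
    if role in catalog_roles:
        return ("catalog", role)       # specialized catalog match takes precedence
    if role in BUILTIN_ROLES:
        return ("builtin", role)
    return ("builtin", "backend")      # unknown role falls back to the default
```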
Task server API
# Create a task
curl -X POST http://127.0.0.1:8052/tasks \
-H "Content-Type: application/json" \
-d '{"title": "Add rate limiting", "role": "backend", "priority": 1}'
# List / status
curl http://127.0.0.1:8052/tasks?status=open
curl http://127.0.0.1:8052/status
Any tool, CI pipeline, Slack bot, or custom UI can create tasks and read status.
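From Python, the same endpoints can be hit with the standard library alone. A minimal client sketch — the payload shape follows the curl example above; nothing is assumed about the response beyond it being JSON:

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8052"

def build_task_payload(title, role="backend", priority=1):
    """Payload shape taken from the curl example above."""
    return {"title": title, "role": role, "priority": priority}

def create_task(title, **kwargs):
    """POST a task to a running Bernstein task server and return its JSON reply."""
    data = json.dumps(build_task_payload(title, **kwargs)).encode()
    req = urllib.request.Request(
        f"{BASE}/tasks", data=data,
        headers={"Content-Type": "application/json"}, method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # requires the task server to be up
        return json.load(resp)
```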
How it compares
| | Bernstein | CrewAI | AutoGen | LangGraph | Ruflo |
|---|---|---|---|---|---|
| Scheduling | Deterministic code | LLM-based | LLM-based | Graph | LLM-based |
| Agent lifetime | Short (minutes) | Long-running | Long-running | Long-running | Long-running |
| Verification | Built-in janitor | Manual | Manual | Manual | Manual |
| Self-evolution | Yes (risk-gated) | No | No | No | Yes |
| CLI agents | Claude/Codex/Gemini/Qwen | API-only | API-only | API-only | Claude-only |
| Model lock-in | None | Soft (LiteLLM) | Soft (LiteLLM) | Soft (LiteLLM) | Claude-only |
| Agent catalogs | Yes (Agency + custom) | No | No | No | No |
CrewAI, AutoGen, and LangGraph work with any model via API wrappers — but they require you to write Python code to orchestrate. Ruflo uses self-evolution but ties you to Claude. Bernstein works with installed CLI agents (no API key plumbing, no SDK) and doesn't care which provider you use.
Full comparison pages → detailed feature matrices, benchmark data, and "when to use X instead" guides for Conductor, Dorothy, Parallel Code, Crystal, Stoneforge, GitHub Agent HQ, and single-agent workflows.
Observability
bernstein ps # which agents are running, what role, which model
bernstein doctor # pre-flight check: Python, CLI tools, API keys, ports
bernstein trace T-042 # step-by-step view of what agent did and why
Agents are visible in Activity Monitor / ps as bernstein: <role> [<session>] — no more hunting for mystery Python processes.
Prometheus metrics at /metrics — wire up Grafana, set alerts, monitor cost.
Extensibility
Pluggy-based plugin system. Hook into any lifecycle event:
from bernstein.plugins import hookimpl

class SlackNotifier:
    @hookimpl
    def on_task_completed(self, task_id, role, result_summary):
        # `slack` stands in for your own client (e.g. a slack_sdk WebClient
        # wrapper) configured elsewhere in the plugin.
        slack.post(f"#{role} finished {task_id}: {result_summary}")
Install via entry points (pip install bernstein-plugin-slack) or local config in bernstein.yaml.
GitHub App integration
Install a GitHub App on your repository to automatically convert GitHub events into Bernstein tasks. Issues become backlog items, PR review comments become fix tasks, and pushes trigger QA verification.
bernstein github setup # print setup instructions
bernstein github test-webhook # verify configuration
Set GITHUB_WEBHOOK_SECRET and point webhooks at POST /webhooks/github. See deploy/github-app/README.md for step-by-step setup.
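GitHub signs every webhook delivery with the shared secret via the standard X-Hub-Signature-256 header; verifying it is a plain HMAC comparison. This sketch shows GitHub's documented scheme, not a Bernstein-specific API:

```python
import hashlib
import hmac

def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Compare GitHub's X-Hub-Signature-256 header against a locally computed HMAC."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature via timing.
    return hmac.compare_digest(expected, signature_header)
```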
Comparisons
- Bernstein vs. GitHub Agent HQ — open-source alternative to GitHub's multi-agent system
- Full comparison index — Conductor, Crystal, Stoneforge, single-agent baseline, and more
- Benchmark data — 1.78× faster, 23% lower cost vs. single-agent baseline
Origin
Built during a 47-hour sprint: 12 AI agents on a single laptop, 737 tickets closed (15.7/hour), 826 commits. Full write-up. Every design decision here is a direct response to those findings.
Support Bernstein
Love Bernstein? Support the project by becoming a sponsor. GitHub Sponsors and Open Collective let you give back with zero friction — every contribution helps us ship faster.
Sponsorship Tiers
| Tier | Amount | Benefits |
|---|---|---|
| Supporter | $5/mo | Your name in the supporters list |
| Priority Support | $25/mo | Priority response to your GitHub issues |
| Featured | $100/mo | Your logo in the README + priority support |
| Advocate | $500/mo | Logo + monthly consulting call + priority support |
Sponsor Now
- GitHub Sponsors — support via GitHub, integrated billing
- Open Collective — transparent spending, receipt for companies
All sponsorship proceeds fund development, infrastructure, and open-source sustainability.
Contributing
PRs welcome. CONTRIBUTING.md | Issues
License
File details
Details for the file bernstein-1.0.0.tar.gz.
File metadata
- Download URL: bernstein-1.0.0.tar.gz
- Upload date:
- Size: 9.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6f78a3156887bdd53c7e2e1d41af91b16f1734c4755359ffbad6df79be25978e |
| MD5 | b23adf5deb769e85cc095a1f42d85c54 |
| BLAKE2b-256 | a9cd96596c260fc91c34922ce74dcc67c7d90404c4c37687332d795a9fd2341f |
Provenance
The following attestation bundles were made for bernstein-1.0.0.tar.gz:
Publisher: publish.yml on chernistry/bernstein
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bernstein-1.0.0.tar.gz
- Subject digest: 6f78a3156887bdd53c7e2e1d41af91b16f1734c4755359ffbad6df79be25978e
- Sigstore transparency entry: 1193555904
- Permalink: chernistry/bernstein@382bfe511752510530aaf8e2939e44efcc098bcc
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/chernistry
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@382bfe511752510530aaf8e2939e44efcc098bcc
- Trigger Event: push
File details
Details for the file bernstein-1.0.0-py3-none-any.whl.
File metadata
- Download URL: bernstein-1.0.0-py3-none-any.whl
- Upload date:
- Size: 833.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c2e60a6d4e536e7183d9b9aac9756a22cf43b56d70613d4f864ba552cf610220 |
| MD5 | 840e1d8aff85f5d86988bfd48f011784 |
| BLAKE2b-256 | 87925cf8154d7e6b247c81ea0b729f080edd3914912cb47b038bbc4c481b3086 |
Provenance
The following attestation bundles were made for bernstein-1.0.0-py3-none-any.whl:
Publisher: publish.yml on chernistry/bernstein
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bernstein-1.0.0-py3-none-any.whl
- Subject digest: c2e60a6d4e536e7183d9b9aac9756a22cf43b56d70613d4f864ba552cf610220
- Sigstore transparency entry: 1193555939
- Permalink: chernistry/bernstein@382bfe511752510530aaf8e2939e44efcc098bcc
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/chernistry
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@382bfe511752510530aaf8e2939e44efcc098bcc
- Trigger Event: push