Claude Code plugins for autonomous development workflows
stellars-claude-code-plugins
AI coding agents generate impressive code but cut corners when left unsupervised - skipping tests, losing context between iterations, shipping shallow fixes that pass benchmarks without addressing root causes. The longer an agent runs autonomously, the more these failures compound.
This project provides a shared YAML-driven orchestration engine that pulls agents through structured phases with independent quality gates at every boundary. Instead of relying on the agent's self-discipline, the engine enforces research before implementation, hypothesis tracking across iterations, and multi-agent review before any code ships.
[!TIP] Each plugin provides only YAML configuration files. The shared orchestration engine in stellars_claude_code_plugins handles all execution logic - FSM transitions, gate validation, multi-agent coordination, and state management.
[!NOTE] Read the full article on the approach: Your AI Agent Will Cut Corners. Here's How to Stop It.
What it solves
- Shallow fixes - forces research and hypothesis before implementation
- Scope creep - plan locks scope, review catches deviations
- Lost context - hypothesis catalogue and failure context persist across iterations
- Unchecked quality - two independent gates (readback + gatekeeper) per phase
- No accountability - every phase records agents, outputs, and verdicts in YAML audit logs
- Benchmark gaming - guardian agent checks for benchmark-specific tuning vs genuine improvement
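The two-gate idea in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the engine's actual code: `GateResult`, `run_phase_gates`, and the audit record fields are hypothetical names, and the two stand-in functions only mimic what the readback and gatekeeper agents might decide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    gate: str      # hypothetical field: which gate produced the verdict
    passed: bool
    reason: str


# A gate is modelled as any callable that inspects a phase output and returns a verdict.
Gate = Callable[[str], GateResult]


def run_phase_gates(phase: str, output: str, gates: list[Gate], audit_log: list[dict]) -> bool:
    """Run every gate independently; the phase passes only if all gates pass."""
    results = [gate(output) for gate in gates]
    # Record every verdict, pass or fail, so the decision can be audited later.
    audit_log.append({
        "phase": phase,
        "verdicts": [{"gate": r.gate, "passed": r.passed, "reason": r.reason} for r in results],
    })
    return all(r.passed for r in results)


# Stand-in gates for illustration only (assumed behaviour, not the real agents).
def readback(output: str) -> GateResult:
    ok = bool(output.strip())
    return GateResult("readback", ok, "output present" if ok else "empty output")


def gatekeeper(output: str) -> GateResult:
    ok = "TODO" not in output
    return GateResult("gatekeeper", ok, "no open TODOs" if ok else "unfinished work")


audit_log: list[dict] = []
plan_ok = run_phase_gates("PLAN", "Improve error handling in the API layer", [readback, gatekeeper], audit_log)
impl_ok = run_phase_gates("IMPLEMENT", "TODO: write the handler", [readback, gatekeeper], audit_log)
```

Keeping the gates as independent callables means one gate cannot mask another's failure, and the audit entry is written whether or not the phase passes.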
Plugins
| Plugin | Skills | Description |
|---|---|---|
| auto-build-claw | 3 | Autonomous build iteration orchestrator with multi-agent review |
| devils-advocate | 5 | Critical document analysis with persona-driven risk scoring |
| datascience | 3 | Notebook structure, rich output styling, copier scaffolding, compliance fixes |
auto-build-claw
Runs structured multi-iteration development cycles where each iteration passes through a full phase lifecycle with quality gates. A program defines what to build, a benchmark measures progress, and the engine enforces the workflow until the objective is met or iterations are exhausted.
Skills: auto-build-claw (orchestrator), program-writer, benchmark-writer
Workflow types
| Type | Phases | Use when |
|---|---|---|
| full | RESEARCH -> HYPOTHESIS -> PLAN -> IMPLEMENT -> TEST -> REVIEW -> RECORD -> NEXT | Feature work, improvements |
| fast | PLAN -> IMPLEMENT -> TEST -> REVIEW -> RECORD -> NEXT | Clear objective, no exploration needed |
| gc | PLAN -> IMPLEMENT -> TEST -> RECORD -> NEXT | Cleanup, refactoring |
| hotfix | IMPLEMENT -> TEST -> RECORD | Targeted bug fix |
| planning | RESEARCH -> PLAN -> RECORD -> NEXT | Work breakdown (auto-chains before full) |
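The workflow types above map naturally onto a lookup of phase sequences. The sketch below is illustrative only: the `WORKFLOWS` dictionary and the `next_phase` helper are hypothetical and do not reflect the actual schema of workflow.yaml or the engine's FSM.

```python
from typing import Optional

# Phase sequences copied from the workflow types table; the dict layout is an assumption.
WORKFLOWS: dict[str, list[str]] = {
    "full": ["RESEARCH", "HYPOTHESIS", "PLAN", "IMPLEMENT", "TEST", "REVIEW", "RECORD", "NEXT"],
    "fast": ["PLAN", "IMPLEMENT", "TEST", "REVIEW", "RECORD", "NEXT"],
    "gc": ["PLAN", "IMPLEMENT", "TEST", "RECORD", "NEXT"],
    "hotfix": ["IMPLEMENT", "TEST", "RECORD"],
    "planning": ["RESEARCH", "PLAN", "RECORD", "NEXT"],
}


def next_phase(workflow: str, current: Optional[str]) -> Optional[str]:
    """Return the phase after `current`, the first phase if `current` is None, or None at the end."""
    phases = WORKFLOWS[workflow]
    if current is None:
        return phases[0]
    index = phases.index(current)  # raises ValueError for a phase outside this workflow
    return phases[index + 1] if index + 1 < len(phases) else None


print(next_phase("full", None))
print(next_phase("hotfix", "TEST"))
```

Because a phase outside the chosen workflow raises an error, a hotfix run in this sketch cannot wander into REVIEW or RESEARCH by accident.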
Usage
```
# Describe what you want - the plugin handles the rest
/auto-build-claw improve error handling in the API layer
```
The plugin writes PROGRAM.md and BENCHMARK.md from your prompt, asks you to approve, then runs the orchestrator autonomously.
See auto-build-claw/README.md for the full phase lifecycle, agent architecture, and configuration details.
devils-advocate
Systematically critiques documents from the perspective of their toughest audience. Builds a devil persona, harvests verifiable facts, generates a risk-scored concern catalogue, and iterates corrections until residual risk is acceptable.
Skills: setup (build persona + fact repository), evaluate (concern catalogue + baseline scorecard), improve (decide how to address concerns), iterate (apply corrections or re-score), run (full workflow end-to-end)
Risk scoring uses a Fibonacci scale (1-8) for likelihood and impact, producing risk scores from 1-64. Each concern is scored 0-100% on how well the document addresses it, and the residual risk (what remains unaddressed) drives iteration priority.
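As a worked illustration of this arithmetic, the sketch below assumes that risk is likelihood multiplied by impact (consistent with the stated 1-64 range) and that residual risk is the risk scaled by the unaddressed fraction. The exact formula is documented in devils-advocate/README.md and may differ; the function names and the two sample concerns are hypothetical.

```python
# Assumed Fibonacci values on the 1-8 scale.
FIBONACCI_SCALE = (1, 2, 3, 5, 8)


def risk_score(likelihood: int, impact: int) -> int:
    """Risk as likelihood x impact (assumed formula), giving values from 1 to 64."""
    if likelihood not in FIBONACCI_SCALE or impact not in FIBONACCI_SCALE:
        raise ValueError(f"likelihood and impact must be one of {FIBONACCI_SCALE}")
    return likelihood * impact


def residual_risk(likelihood: int, impact: int, addressed_pct: float) -> float:
    """Risk that remains after the document addresses `addressed_pct` percent of the concern."""
    if not 0 <= addressed_pct <= 100:
        raise ValueError("addressed_pct must be between 0 and 100")
    return risk_score(likelihood, impact) * (1 - addressed_pct / 100)


# Made-up concerns: (title, likelihood, impact, percent addressed).
concerns = [
    ("Missing rollback plan", 3, 5, 50.0),
    ("Unverified cost estimate", 5, 8, 25.0),
]

# Highest residual risk first: that is the concern to work on next.
ranked = sorted(concerns, key=lambda c: residual_risk(c[1], c[2], c[3]), reverse=True)
for title, likelihood, impact, addressed in ranked:
    print(f"{title}: residual {residual_risk(likelihood, impact, addressed):.1f}")
```

Under these assumptions, a high-risk concern that is barely addressed outranks a moderate concern that is half addressed, which is the ordering the iteration step needs.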
Usage
```
# Full end-to-end workflow
/devils-advocate:run

# Step by step
/devils-advocate:setup      # Build persona, harvest facts
/devils-advocate:evaluate   # Generate concerns + baseline scorecard
/devils-advocate:iterate    # Apply corrections, re-score (repeat)
```
See devils-advocate/README.md for scoring formula details, artefact format, and the full concern catalogue methodology.
datascience
Enforces data science project standards derived from production notebook workflows. Three skills auto-trigger when working with notebooks, datasets, or rich output. Six commands fix existing code or scaffold new projects.
Usage
```
# Create a new project from copier template
/datascience:new-project

# Fix an existing notebook to comply with standards
/datascience:fix-notebook notebooks/01-kj-analysis.py

# Apply rich styling fixes (wrong colors, multiple prints)
/datascience:apply-style notebooks/02-kj-train.py

# Port legacy project to copier-data-science template
/datascience:fix-project
```
See datascience/README.md for the full list of standards enforced.
Install
```bash
pip install stellars-claude-code-plugins
```
As a Claude Code plugin marketplace:
```
/plugin marketplace add stellarshenson/claude-code-plugins
```
Architecture
```
stellars_claude_code_plugins/          # Shared engine (pip installable)
  engine/
    fsm.py                             # Phase lifecycle state machine
    model.py                           # Typed YAML model loader + validator
    orchestrator.py                    # Complete orchestration engine
  resources/
    workflow.yaml                      # Default iteration types and phase sequences
    phases.yaml                        # Default phase templates, agents, gates
    app.yaml                           # Default display text and CLI config
auto-build-claw/                       # Plugin: autonomous build iterations
  .claude-plugin/plugin.json           # Plugin registration
  skills/
    auto-build-claw/SKILL.md           # Orchestrator skill definition
    program-writer/SKILL.md            # Program definition skill
    benchmark-writer/SKILL.md          # Benchmark definition skill
devils-advocate/                       # Plugin: critical document analysis
  .claude-plugin/plugin.json           # Plugin registration
  skills/
    setup/SKILL.md                     # Build persona + fact repository
    evaluate/SKILL.md                  # Concern catalogue + scorecard
    improve/SKILL.md                   # Decide how to address concerns
    iterate/SKILL.md                   # Apply corrections, re-score
    run/SKILL.md                       # Full workflow end-to-end
datascience/                           # Plugin: data science standards
  .claude-plugin/plugin.json           # Plugin registration
  skills/
    datascience/SKILL.md               # Project conventions (auto-triggered)
    notebook-standards/SKILL.md        # Notebook structure (auto-triggered)
    rich-output/SKILL.md               # Rich styling patterns (auto-triggered)
  commands/
    new-project.md                     # Scaffold from copier template
    notebook.md                        # Create structured notebook
    review.md                          # Compliance review
    apply-style.md                     # Apply rich output styling fixes
    fix-notebook.md                    # Restructure notebook to standards
    fix-project.md                     # Port/update project to copier template
.claude-plugin/marketplace.json        # Plugin marketplace registry
```
Building a new plugin
Plugins are pure configuration - no Python code required. Create a directory with skills and register it in the marketplace:
```
my-plugin/
  .claude-plugin/plugin.json   # Plugin registration and skill triggers
  skills/
    my-skill/SKILL.md          # Skill definition with description and instructions
```
The plugin.json registers your skills with Claude Code, defining when they trigger and what tools they have access to. Each SKILL.md contains the instructions Claude follows when the skill is invoked. The shared orchestration engine (pip install stellars-claude-code-plugins) provides the orchestrate CLI command that handles state management, FSM transitions, gate execution, and audit logging.
Register your plugin in the marketplace by adding an entry to .claude-plugin/marketplace.json.
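A registration step like this can be sketched as a small Python helper. The field names used here (`plugins`, `name`, `source`, `description`) are assumptions about the marketplace file's layout and are not taken from this project; check the existing .claude-plugin/marketplace.json for the actual schema before relying on them.

```python
import json


def register_plugin(marketplace: dict, name: str, source: str, description: str) -> dict:
    """Return a copy of the marketplace data with the plugin added, skipping duplicates by name."""
    plugins = list(marketplace.get("plugins", []))
    if not any(p.get("name") == name for p in plugins):
        # Entry layout is hypothetical; adapt the keys to the real marketplace schema.
        plugins.append({"name": name, "source": source, "description": description})
    return {**marketplace, "plugins": plugins}


# In practice the dict would be loaded from .claude-plugin/marketplace.json with json.load().
marketplace = {"name": "example-marketplace", "plugins": []}
updated = register_plugin(marketplace, "my-plugin", "./my-plugin", "Example plugin")
print(json.dumps(updated, indent=2))
```

The helper returns a new dictionary instead of mutating its input, so a failed or duplicate registration leaves the loaded data untouched.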
Development
```bash
make install   # create venv, install deps, editable install
make test      # run 212 tests
make lint      # ruff format + check
make format    # auto-fix formatting
make build     # clean, test, bump version, build wheel
make publish   # build + twine upload to PyPI
```
License
MIT License