Claude Code plugins for autonomous development workflows

stellars-claude-code-plugins

stellars-claude-code-plugins marketplace overview - 6 plugins grouped by category

Your AI agent will cut corners. This is the forcing function.

You ask Claude to "improve error handling." Claude says "Fixed it." Two files changed, no tests run, edge cases broken. Or it ships an SVG infographic with overlapping text and contrast failures. Or it hands over a document that a tough reviewer would tear apart.

This marketplace makes Claude work like a disciplined engineer instead. Each plugin enforces a specific discipline: research before implement, validate before ship, ground every claim, audit every iteration.

# Force Claude through research, plan, test, review, and audit before claiming done
/autobuild:run improve error handling in the API layer

#  -> writes PROGRAM.md (objective + scope)
#  -> writes BENCHMARK.md (measurable score)
#  -> asks for your approval
#  -> implements
#  -> runs tests
#  -> reviews against the benchmark
#  -> records evidence in YAML audit log

# Install the marketplace, then the autobuild plugin
/plugin marketplace add stellarshenson/claude-code-plugins
/plugin install autobuild@stellarshenson-marketplace

Read the long-form articles: "Your AI Agent Will Cut Corners. Here's How to Stop It" and "Stop Fixing Your AI's SVGs". For real examples (60+ production SVGs, 4 worked devils-advocate analyses, 3 autobuild iteration trajectories, and a grounding result with a cross-validation accuracy of 1.0), see showcase/.

The full marketplace - six disciplines

autobuild is the spearhead. The same forcing-function logic powers five more plugins, each enforcing a different kind of discipline on Claude. Install them individually or as a bundle.

Plugin | What it solves
autobuild | Executes code and artefact builds toward an objective with iterations driven by a calculated outcome benchmark - enforces structured phases with multi-agent review
devils-advocate | Produces high-quality documents for a specific audience using a scientific, measured, iterative approach - quantified critique with Fibonacci risk scoring and per-iteration residual measurement
svg-infographics | Produces high-quality standardised SVG infographics - grid-first design, theme-driven styling, dark/light mode, 5 routing modes (straight/L/L-chamfer/spline/manifold) with A* auto-routing, callout placement solver, chart generation, and 6 automated checkers
datascience | Produces high-quality data science projects and notebooks following consistent standards - scaffolds projects from copier templates, enforces notebook structure, applies rich output styling, and supports prompt engineering techniques
document-processing | Processes documents according to user requests with grounding in source materials - source tracing, compliance checking, PDF automation
journal | Produces a work journal marking key changes, implementations, and decisions - append-only audit trail with continuous numbering, archiving, and deterministic journal-tools CLI for validation, sorting, and word-count enforcement

autobuild

autobuild 8-phase lifecycle: research, hypothesis, plan, implement, test, review, record, next

Runs structured multi-iteration development cycles where each iteration passes through a full phase lifecycle with quality gates. A program defines what to build, a benchmark measures progress, and the engine enforces the workflow until the objective is met or iterations are exhausted.

Problems it addresses:

  • Shallow fixes - forces research and hypothesis before implementation
  • Scope creep - plan locks scope, review catches deviations
  • Lost context - hypothesis catalogue and failure context persist across iterations
  • Unchecked quality - two independent gates (readback + gatekeeper) per phase
  • No accountability - every phase records agents, outputs, and verdicts in YAML audit logs
  • Benchmark gaming - guardian agent checks for benchmark-specific tuning vs genuine improvement

Skills: autobuild (orchestrator), program-writer, benchmark-writer

Workflow types

Type | Phases | Use when
full | RESEARCH → HYPOTHESIS → PLAN → IMPLEMENT → TEST → REVIEW → RECORD → NEXT | Feature work, improvements
fast | PLAN → IMPLEMENT → TEST → REVIEW → RECORD → NEXT | Clear objective, no exploration needed
gc | PLAN → IMPLEMENT → TEST → RECORD → NEXT | Cleanup, refactoring
hotfix | IMPLEMENT → TEST → RECORD | Targeted bug fix
planning | RESEARCH → PLAN → RECORD → NEXT | Work breakdown (auto-chains before full)

Usage

# Describe what you want - the plugin handles the rest
/autobuild improve error handling in the API layer

The plugin writes PROGRAM.md and BENCHMARK.md from your prompt, asks you to approve, then runs the orchestrator autonomously.

See autobuild/README.md for the full phase lifecycle, agent architecture, and configuration details.

devils-advocate

devils-advocate Fibonacci risk matrix and sample concerns iterating to resolved

Systematically critiques documents from the perspective of their toughest audience. Builds a devil persona, harvests verifiable facts, generates a risk-scored concern catalogue, and iterates corrections until residual risk is acceptable.

Skills: setup (build persona + fact repository), evaluate (concern catalogue + baseline scorecard), iterate (apply corrections or re-score), run (full workflow end-to-end)

Risk scoring uses a Fibonacci scale (1-8) for likelihood and impact, producing risk scores from 1-64. Each concern is scored 0-100% on how well the document addresses it, and the residual risk (what remains unaddressed) drives iteration priority.
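
A small worked example may make the scoring concrete. It assumes that risk is likelihood times impact (which matches the stated 1-64 range) and that residual risk is the risk multiplied by the unaddressed fraction. The second formula is an assumption made for illustration; the plugin's exact formula is documented in devils-advocate/README.md. The Concern class and by_priority helper are hypothetical names.

```python
from dataclasses import dataclass

FIBONACCI_SCALE = (1, 2, 3, 5, 8)


@dataclass
class Concern:
    name: str
    likelihood: int   # one of FIBONACCI_SCALE
    impact: int       # one of FIBONACCI_SCALE
    addressed: float  # 0.0-1.0: how well the document addresses the concern

    def __post_init__(self) -> None:
        if self.likelihood not in FIBONACCI_SCALE or self.impact not in FIBONACCI_SCALE:
            raise ValueError("likelihood and impact must be on the Fibonacci scale")
        if not 0.0 <= self.addressed <= 1.0:
            raise ValueError("addressed must be between 0.0 and 1.0")

    @property
    def risk(self) -> int:
        # 1x1 = 1 at the low end, 8x8 = 64 at the high end.
        return self.likelihood * self.impact

    @property
    def residual(self) -> float:
        # ASSUMPTION: residual = risk x unaddressed fraction.
        return self.risk * (1.0 - self.addressed)


def by_priority(concerns: list[Concern]) -> list[Concern]:
    """Order concerns so the largest residual risk is handled first."""
    return sorted(concerns, key=lambda c: c.residual, reverse=True)


if __name__ == "__main__":
    concerns = [
        Concern("missing rollback plan", likelihood=8, impact=8, addressed=0.75),
        Concern("unsupported cost claim", likelihood=5, impact=8, addressed=0.25),
        Concern("unexplained jargon", likelihood=2, impact=3, addressed=0.0),
    ]
    for c in by_priority(concerns):
        print(f"{c.name}: risk={c.risk}, residual={c.residual}")
```

Under these assumptions the cost claim (risk 40, residual 30.0) outranks the rollback plan (risk 64, residual 16.0), because most of the larger risk has already been addressed.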

Usage

# Full end-to-end workflow
/devils-advocate:run

# Step by step
/devils-advocate:setup       # Build persona, harvest facts
/devils-advocate:evaluate    # Generate concerns + baseline scorecard
/devils-advocate:iterate     # Apply corrections, re-score (repeat)

See devils-advocate/README.md for scoring formula details, artefact format, and the full concern catalogue methodology.

svg-infographics

svg-infographics 6-phase workflow and 8 shipped CLI tools (validators + calculators)

Creates production-quality SVG infographics with a mandatory 6-phase workflow (research, grid, scaffold, content, finishing, validation). Every coordinate is Python-calculated, every colour traces to an approved theme swatch, and six validation tools check overlaps, WCAG contrast, alignment, connector quality, CSS compliance, and pairwise connector collisions before delivery.
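
The WCAG contrast check rests on a published formula: convert each sRGB colour to relative luminance, then take the ratio of the lighter to the darker luminance with a 0.05 offset on each. The sketch below implements that standard formula in plain Python. It is not the plugin's checker, and the function names are hypothetical; which thresholds the shipped validator applies is not stated here.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel / 255.0
    # 0.04045 is the sRGB threshold; older WCAG text prints 0.03928, which
    # gives identical results for 8-bit channel values.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour given as '#rrggbb'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical colours) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    print(round(contrast_ratio("#000000", "#ffffff"), 2))  # 21.0
    # WCAG AA asks for at least 4.5:1 for normal-size text.
    print(contrast_ratio("#767676", "#ffffff") >= 4.5)
```

Because the ratio is symmetric in foreground and background, the same check applies unchanged to the light and dark variants of a theme.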

Connectors support five routing modes (straight, l, l-chamfer, spline, manifold), with grid A* auto-routing around obstacles, container-scoped routing within specific shapes, straight-line collapse for near-aligned endpoints, and stem preservation that guarantees clean cardinal segments behind arrowheads. Callouts are placed by a greedy solver with leader and leaderless modes. Charts are generated with pygal, using dual light/dark palettes and a WCAG contrast audit.

Boolean / margin operations on path shapes (boolean calculator): a headless counterpart to Inkscape's Path menu, offering union, intersection, difference, and xor (Exclusion), plus one-step buffer (Inset / Outset), cutout (cut-with-margin: subtract B inflated by N units from A), and outline (closed annulus of width N around a shape's boundary). The cutout-with-margin and outline-as-band ops are not exposed as one-button operations by Inkscape, Illustrator, Affinity, Figma, Sketch, or CorelDRAW - bundling them as primitives is the main agentic value-add. The calculator operates on polygons only, via shapely; Bezier / Arc inputs are flattened to polylines, and the lossy round-trip is surfaced as a CURVE-FLATTENED warning through the gate. It supports --replace-id ID for in-place rewrite of a named element's d= attribute.

Stop-and-think warning-ack gate: every producer tool (calc_connector, charts, drawio_shapes, empty-space, finalize) blocks its primary output whenever any warning fires. The caller must acknowledge each warning explicitly with --ack-warning TOKEN=reason - one flag per warning, terse reasoning required, no bulk override. Tokens are deterministic per invocation so reruns reproduce them. Forces a conscious per-finding decision instead of letting warnings scroll past unread.
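
The gate logic described above can be sketched in a few lines. This is a hedged illustration of the idea, not the shipped tools' code. The token format (a short SHA-256 prefix), the function names, and the example warning text are all assumptions; only the TOKEN=reason shape of the acknowledgement comes from the description.

```python
import hashlib


def warning_token(tool: str, message: str) -> str:
    """Derive a stable token so the same warning gets the same token on a rerun.

    ASSUMPTION: the real token format is not documented here; a hash prefix is
    one simple way to make tokens deterministic.
    """
    digest = hashlib.sha256(f"{tool}:{message}".encode("utf-8")).hexdigest()
    return f"W-{digest[:8]}"


def parse_acks(ack_args: list[str]) -> dict[str, str]:
    """Parse TOKEN=reason strings; an empty reason is rejected."""
    acks: dict[str, str] = {}
    for arg in ack_args:
        token, separator, reason = arg.partition("=")
        if not separator or not reason.strip():
            raise ValueError(f"acknowledgement must be TOKEN=reason, got {arg!r}")
        acks[token] = reason.strip()
    return acks


def unacknowledged(tool: str, warnings: list[str], ack_args: list[str]) -> list[tuple[str, str]]:
    """Return (token, message) pairs that still block the primary output."""
    acks = parse_acks(ack_args)
    pending = []
    for message in warnings:
        token = warning_token(tool, message)
        if token not in acks:
            pending.append((token, message))
    return pending


if __name__ == "__main__":
    warnings = ["CURVE-FLATTENED: path 'logo' was flattened to a polyline"]
    blocked = unacknowledged("boolean", warnings, [])
    print("blocked by:", blocked)
    token = blocked[0][0]
    print("after ack:", unacknowledged("boolean", warnings, [f"{token}=polyline is fine at this scale"]))
```

There is deliberately no "acknowledge all" path in the sketch: each warning needs its own token and its own reason, which is the property the gate is meant to enforce.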

Skills: svg-designer (fork-context design agent with tool palette, 6-phase workflow, design rules, validation gates), theme (palette approval + swatch generation)

Usage

# Create infographic(s) with full workflow
/svg-infographics:create card grid showing 4 platform modules

# Generate theme swatch for approval
/svg-infographics:theme corporate blue palette

# Run validation on existing SVGs
/svg-infographics:validate docs/images/*.svg

# Fix issues in existing SVGs (layout / style / contrast / connectors / all)
/svg-infographics:fix docs/images/overview.svg style
/svg-infographics:fix docs/images/overview.svg layout

# Additive decoration pass on existing SVGs
/svg-infographics:beautify docs/images/overview.svg medium

Includes 60+ production SVG examples, 13 CLI tools (6 validators + 7 calculators including the boolean / margin ops), and theme swatches. See svg-infographics/README.md for the capability groups and workflow details.

datascience

datascience project scaffold and notebook section pipeline (header, GPU, imports, config, data, model, eval)

Enforces data science project standards derived from production notebook workflows. Five skills auto-trigger when working with notebooks, datasets, rich output, prompts, or progress bars. Nine commands fix existing code, scaffold new projects, and apply prompt engineering techniques.

Skills: datascience (project conventions), notebook-standards (section order, GPU-first), rich-output (semantic colors), prompt-engineering (7 research-backed techniques), progressbars (tqdm/rich)

Usage

# Create a new project from copier template
/datascience:new-project

# Fix an existing notebook to comply with standards
/datascience:fix-notebook notebooks/01-kj-analysis.py

# Apply rich styling fixes (wrong colors, multiple prints)
/datascience:apply-style notebooks/02-kj-train.py

# Add or fix progress bars (choose tqdm or rich)
/datascience:apply-progressbar notebooks/02-kj-train.py

# Apply prompt engineering technique (CoT, CoD, ToT, few-shot, etc.)
/datascience:apply-prompt-technique

# Full psychological prompting stack for hard problems
/datascience:challenge

# Port legacy project to copier-data-science template
/datascience:fix-project

See datascience/README.md for the full list of standards enforced.

journal

journal append-only timeline with archive and continuous numbering

Project journal management with an append-only entry format, continuous numbering, and automatic archiving. Auto-triggers on journal-related phrases (see below) or after substantive work, maintaining a consistent audit trail in .claude/JOURNAL.md. Includes a deterministic journal-tools CLI for validation, sorting, and word-count enforcement. The three pure-string subcommands (check, sort, archive) run with no generative AI in the loop; standardize instead orchestrates a focused claude -p subprocess per offending entry to repair word-count drift on the entries that check flagged.

Skill: journal (auto-triggered by the phrases below or after finishing substantive work)

Auto-trigger phrases

Command | Triggers on
/journal:update | "update journal", "add journal entry", "add entry", "log this", "journal this", "record this in the journal"
/journal:create | "create journal", "init journal", "start journal", "new journal" (refuses if file already exists)
/journal:archive | "archive journal", "prune journal", "compact journal" (auto-suggests when >40 entries)
/journal:standardize | "standardize journal", "fix journal entry tiers", "repair journal" (run after journal-tools check reports word-count warnings)

Clear split: create = scaffold-from-empty one-time, update = every write after that (append new entry or extend the last one), archive = runs the CLI archiver, standardize = ACP-driven word-count repair (oversized Standard → mark Extended or condense; oversized Extended → condense; spurious marker → drop).

Usage

# Add a new entry — use this for 99% of journal writes
/journal:update added retry logic to API client

# Initialise a fresh journal (only when JOURNAL.md does not yet exist)
/journal:create backfill from this session

# Archive older entries (keeps last 20 in main, appends rest to JOURNAL_ARCHIVE.md)
/journal:archive

# Validate format, numbering, and word counts (deterministic CLI)
journal-tools check .claude/JOURNAL.md

# Re-number entries sequentially (fixes gaps or reorders)
journal-tools sort .claude/JOURNAL.md --dry-run

# Repair word-count drift via an ACP `claude -p` subprocess per offender
/journal:standardize    # chains: list -> per-entry prompt -> apply decision

Two word-count tiers: Standard (~70-120 words, the default) and Extended (~250-350 words, ONLY when the user explicitly asks or the work is an architectural decision / platform migration / multi-iteration debug). The checker emits warnings (not errors) when entries exceed the standard target or the extended max — length is a nudge, never a block.
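
The tier rule can be sketched as a warning-only check. The limits below take the upper end of each stated range (120 and 350 words) as the cut-off. That choice, the whitespace-based word count, and the check_entry name are assumptions for illustration; journal-tools check may count and mark tiers differently.

```python
# ASSUMPTION: the upper bounds of the stated ranges act as the warning limits.
STANDARD_TARGET_MAX = 120
EXTENDED_MAX = 350


def check_entry(text: str, extended: bool = False) -> list[str]:
    """Return warnings (never errors) for an entry that is over its tier's limit."""
    words = len(text.split())  # simple whitespace word count
    limit = EXTENDED_MAX if extended else STANDARD_TARGET_MAX
    tier = "extended" if extended else "standard"
    if words > limit:
        return [f"{tier} entry has {words} words (limit {limit})"]
    return []


if __name__ == "__main__":
    entry = "word " * 150
    print(check_entry(entry))                 # one warning for a standard entry
    print(check_entry(entry, extended=True))  # no warning once marked extended
```

Returning a list of warnings rather than raising keeps the behaviour in line with the rule that length is a nudge and never blocks a write.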

See journal/README.md for entry format, CLI tools, and archiving rules.

document-processing

document-processing 3-stage flow: sources, grounding, compliant cited output

Structured document processing with source grounding and quality control. Takes input documents through a verified workflow (analyze, draft, ground, uniformize) and produces outputs where every factual claim is traceable to source material.

Skills (each pairs with a same-named command):

  • process - builds a deliverable from sources (4-phase workflow)
  • grounding - the one verification flow; runs the CLI on a single claim, one document, or a batch via source_map.yaml; no compliance checks
  • validate - grounding plus tone/style/length/format compliance
  • update - updates an existing output, with a mandatory CLI-grounding closing pass
  • pdf - toolkit for extract / merge / split / forms / OCR / batch

Grounding is delegated, not duplicated: validate, process's verify phase, and update's closing step all call the grounding skill.

CLI: ships the document-processing command with three-layer lexical grounding (regex + Levenshtein + BM25) plus an optional fourth semantic layer (multilingual-e5 + FAISS). Every hit returns line, column, paragraph, page, and a context snippet, so the agent can cite without rereading the source. This saves tokens: a measured 64-86% reduction versus batched generative grounding on real sources. The semantic layer is opt-in via pip install 'stellars-claude-code-plugins[semantic]' followed by document-processing setup.
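
To show how layered lexical grounding can work in principle, here is a minimal standard-library sketch that tries an exact regex match first, then Levenshtein similarity per line, then a BM25 score per paragraph. It is not the package's implementation. The function names, the per-line and per-paragraph granularity, the BM25 parameters, and the way thresholds are applied are all assumptions; only the default threshold values echo the CLI flags shown in the usage section.

```python
import math
import re
from collections import Counter


def exact_hit(claim: str, source: str):
    """Layer 1: case-insensitive regex match that tolerates whitespace differences."""
    pattern = r"\s+".join(re.escape(word) for word in claim.split())
    match = re.search(pattern, source, flags=re.IGNORECASE)
    if match is None:
        return None
    line = source.count("\n", 0, match.start()) + 1
    column = match.start() - (source.rfind("\n", 0, match.start()) + 1) + 1
    return {"layer": "exact", "line": line, "column": column}


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def fuzzy_hit(claim: str, source: str, threshold: float = 0.85):
    """Layer 2: best Levenshtein similarity of the claim against each source line."""
    best = None
    for number, raw in enumerate(source.splitlines(), start=1):
        line = raw.strip()
        longest = max(len(claim), len(line)) or 1
        similarity = 1.0 - levenshtein(claim.lower(), line.lower()) / longest
        if similarity >= threshold and (best is None or similarity > best["similarity"]):
            best = {"layer": "fuzzy", "line": number, "similarity": similarity}
    return best


def bm25_hit(claim: str, source: str, threshold: float = 0.5, k1: float = 1.5, b: float = 0.75):
    """Layer 3: Okapi BM25 score of the claim's terms against each paragraph."""
    docs = [re.findall(r"\w+", p.lower()) for p in source.split("\n\n") if p.strip()]
    if not docs:
        return None
    avg_len = sum(len(d) for d in docs) / len(docs)
    doc_freq = Counter(term for d in docs for term in set(d))
    best = None
    for index, doc in enumerate(docs, start=1):
        counts = Counter(doc)
        score = 0.0
        for term in set(re.findall(r"\w+", claim.lower())):
            if term in counts:
                idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
                tf = counts[term]
                score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        if score >= threshold and (best is None or score > best["score"]):
            best = {"layer": "bm25", "paragraph": index, "score": score}
    return best


def ground(claim: str, source: str):
    """Try the cheapest layer first and return the first hit, or None."""
    return exact_hit(claim, source) or fuzzy_hit(claim, source) or bm25_hit(claim, source)
```

Ordering the layers from strictest to loosest means a verbatim quote is located with an exact line and column, while a paraphrase still resolves to the paragraph it most likely came from.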

Native source format support (Release F+): .txt, .md, .rst, .pdf (text), .docx, .odt, .rtf, .html extracted directly via pypdf / python-docx / odfpy / striprtf. Scanned PDFs go through a deterministic fallback chain: same-stem sibling lookup (.ocr.txt > .txt > .docx > ...) → optional auto-OCR via [ocr] extras (pytesseract + pdf2image + system tesseract; agent supplies --ocr-lang) → vision-OCR by Claude via the Read tool with <stem>.ocr.txt save convention. Auto-OCR results are quality-banded (good / candidate / failed) with a deterministic stop-and-think gate that surfaces per-source warnings the agent must ack with reasoning before grounding consumes the text.

Data-science calibrated: the classifier was tuned via a six-iteration autobuild cycle with a composite benchmark score and 3-fold cross-validation on three held-out academic papers (Liu 2023, Ye 2024, Han 2024 - 14 labelled claims each). Final CV mean accuracy 1.0 with zero overfit gap. 29 tunable parameters exposed in config.yaml, documented per field, overridable via .stellars-plugins/config.yaml. Full program definition, benchmark, hypothesis + falsifiers, forensic report, CV results, and corpus data archived under references/grounding-optimisation/.

Usage

# Build a deliverable from input documents
/document-processing:process synthesize expert opinions into position paper

# Update existing output with new source material (re-grounds the changed content)
/document-processing:update add new hearing transcript to timeline

# Validate a document against rules and against its sources
/document-processing:validate

# Bare grounding - single claim, one document, or a batch via source_map.yaml
/document-processing:grounding

# First-run: interactive opt-in prompt for optional semantic grounding
document-processing setup

# Direct CLI: ground a single claim (all four layers when semantic enabled)
document-processing ground \
  --claim "Kubernetes runs on 12 nodes" \
  --source docs/source.md \
  --threshold 0.85 --bm25-threshold 0.5 --semantic-threshold 0.85 --json

# Batch ground N claims from JSON, force semantic on for this call
document-processing batch-ground \
  --claims validation/claims.json \
  --source docs/source.md \
  --output validation/grounding-report.md \
  --semantic on

See document-processing/README.md for the grounding methodology, folder structure, and PDF processing details.

Install

The library ships the deterministic CLIs that every plugin depends on — install it alongside the plugin marketplace. Without the library the skills fall back to manual work and lose all automation.

pip install stellars-claude-code-plugins

Provides these binaries:

Binary | Used by
orchestrate | autobuild
svg-infographics | svg-infographics, devils-advocate (visuals)
render-png | svg-infographics (Playwright-based SVG → PNG)
journal-tools | journal (check / sort / archive / standardize)
document-processing | document-processing (ground / batch-ground, three-layer grounding)

As a Claude Code plugin marketplace:

/plugin marketplace add stellarshenson/claude-code-plugins

Building a new plugin

Plugins are pure configuration - no Python code required. Create a directory with skills and register it in the marketplace:

my-plugin/
  .claude-plugin/plugin.json           # Plugin registration and skill triggers
  skills/
    my-skill/SKILL.md                  # Skill definition with description and instructions

The plugin.json registers your skills with Claude Code, defining when they trigger and what tools they have access to. Each SKILL.md contains the instructions Claude follows when the skill is invoked. The shared orchestration engine (pip install stellars-claude-code-plugins) provides the orchestrate CLI command that handles state management, FSM transitions, gate execution, and audit logging.

Register your plugin in the marketplace by adding an entry to .claude-plugin/marketplace.json.
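
For a quick start, the directory layout above can be generated with a short script. This is a sketch under stated assumptions: the manifest fields (name, description, version) and the SKILL.md front matter (name, description) are a minimal guess at what is needed, the scaffold_plugin helper is hypothetical, and the trigger and tool settings mentioned above are not modelled. Check an existing plugin in this repository for the fields it actually uses.

```python
import json
from pathlib import Path


def scaffold_plugin(root: Path, plugin_name: str, skill_name: str) -> Path:
    """Create the minimal plugin directory layout under `root` and return its path."""
    plugin_dir = root / plugin_name
    manifest_dir = plugin_dir / ".claude-plugin"
    skill_dir = plugin_dir / "skills" / skill_name
    manifest_dir.mkdir(parents=True, exist_ok=True)
    skill_dir.mkdir(parents=True, exist_ok=True)

    # ASSUMPTION: a minimal manifest; real plugins may need more fields.
    manifest = {
        "name": plugin_name,
        "description": "Describe the discipline this plugin enforces",
        "version": "0.1.0",
    }
    (manifest_dir / "plugin.json").write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")

    # ASSUMPTION: front matter with name and description, then free-form instructions.
    skill_text = (
        "---\n"
        f"name: {skill_name}\n"
        "description: Describe when this skill should trigger\n"
        "---\n\n"
        "Instructions Claude follows when the skill is invoked.\n"
    )
    (skill_dir / "SKILL.md").write_text(skill_text, encoding="utf-8")
    return plugin_dir


if __name__ == "__main__":
    created = scaffold_plugin(Path("."), "my-plugin", "my-skill")
    print(f"created {created}")
```

The script only creates files; the marketplace entry in .claude-plugin/marketplace.json still has to be added by hand.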

Development

make install          # create venv, install deps, editable install
make test             # run tests
make lint             # ruff format + check
make format           # auto-fix formatting
make build            # clean, test, bump version, build wheel
make publish          # build + twine upload to PyPI

License

MIT License
