Runtime-agnostic, document-first orchestration for AI-driven software delivery.

ai_driven_dev_v2

Runtime-agnostic orchestration for document-first AI software delivery.

Status: an implemented local orchestration system with active architecture, contracts, adapters, validators, harness/eval tooling, and an installable Python CLI. Known gaps are documented either as roadmap work or as explicit manual live/release evidence prerequisites, not as hidden bootstrap assumptions.

What this project is

ai_driven_dev_v2 (AIDD) is a stage-based workflow system for governed AI-assisted software work.

It rebuilds the useful parts of ai_driven_dev so they are not coupled to a single runtime. The project keeps:

  • explicit workflow stages,
  • durable Markdown artifacts,
  • validator gates,
  • self-repair after invalid stage outputs,
  • user interview loops,
  • native runtime log visibility,
  • harness and eval support from the beginning.

The canonical stage flow is:

idea -> research -> plan -> review-spec -> tasklist -> implement -> review -> qa

Why AIDD exists

Most agentic coding systems become tightly bound to one host runtime, one prompt surface, or one plugin API. That makes them harder to port, harder to debug, and harder to evaluate.

AIDD separates:

  • core workflow semantics from runtime integration,
  • document contracts from model formatting habits,
  • operator experience from any one runtime CLI,
  • harness/eval from ad hoc prompt experimentation.

What makes AIDD different

  • Runtime-agnostic core
    The core never assumes Claude Code, Codex, OpenCode, or any other runtime-specific API.

  • Markdown-first stage IO
    Stages read and write human-reviewable Markdown documents. Validation happens after generation.

  • Validation and self-repair
    Invalid outputs do not silently pass. The system validates, writes a repair brief, and reruns within a bounded budget.

  • Interview-aware execution
    If a stage needs clarification, the runtime can ask the user through the CLI and/or durable questions.md / answers.md files.

  • Native runtime log visibility
    The CLI is designed to stream raw runtime logs in a form that stays as close as possible to the runtime's own UX.

  • Harness and eval built in
    Deterministic scenarios, manual live E2E audits, graders, and log analysis are part of the product architecture.

Primary user stories

The project is anchored in these outcomes:

  • an operator can run the same governed flow on different runtimes;
  • a team can inspect and edit stage artifacts as Markdown files;
  • invalid stage outputs are repaired before the workflow advances;
  • the system asks the user clarifying questions when the task is underspecified;
  • a maintainer can add a new runtime adapter without rewriting the core;
  • an evaluator can run deterministic and live E2E scenarios with log analysis.

See docs/product/user-stories.md for the full set.

Runtime support (current)

Workflow and stage execution today:

  • aidd run and aidd stage run both support the runtimes generic-cli, claude-code, codex, and opencode.

Runtime probes in aidd doctor:

  • generic-cli
  • claude-code
  • codex
  • opencode

Unsupported runtime handling:

  • aidd run and aidd stage run fail fast with non-zero exit and unsupported-runtime classification when the runtime id is unknown.
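
A rough sketch of the fail-fast behavior, assuming a simple set lookup before any stage work begins. `resolve_runtime`, `UnsupportedRuntimeError`, and the exit code 2 are hypothetical; the text above only guarantees a non-zero exit and an unsupported-runtime classification.

```python
# Runtime ids listed above as supported today.
SUPPORTED_RUNTIMES = frozenset({"generic-cli", "claude-code", "codex", "opencode"})


class UnsupportedRuntimeError(ValueError):
    """Raised before any stage work starts when the runtime id is unknown."""

    classification = "unsupported-runtime"


def resolve_runtime(runtime_id: str) -> str:
    if runtime_id not in SUPPORTED_RUNTIMES:
        raise UnsupportedRuntimeError(f"unknown runtime id: {runtime_id!r}")
    return runtime_id


def main(runtime_id: str) -> int:
    """Return a process exit code: 0 on success, non-zero on an unknown runtime."""
    try:
        resolve_runtime(runtime_id)
    except UnsupportedRuntimeError as error:
        print(f"{error.classification}: {error}")
        return 2  # hypothetical value; any non-zero code matches the description
    return 0


print(main("codex"))    # 0
print(main("pi-mono"))  # prints the classification line, then 2
```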

Future bridge target:

  • pi-mono

Architecture in one sentence

operator CLI -> AIDD core -> adapter -> runtime -> workspace documents
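
One way to picture the core-to-adapter boundary in that chain is a small structural interface that the core calls without knowing which runtime sits behind it. Everything in this sketch (`RuntimeAdapter`, `EchoAdapter`, `execute_stage`, the `output.md` file name) is a hypothetical illustration; the actual protocol is defined in docs/architecture/adapter-protocol.md.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class RuntimeAdapter(Protocol):
    """Hypothetical adapter surface: the core only depends on this shape."""

    runtime_id: str

    def run_stage(self, stage: str, prompt: str) -> str: ...


class EchoAdapter:
    """Toy adapter that returns Markdown without calling any real runtime."""

    runtime_id = "echo"

    def run_stage(self, stage: str, prompt: str) -> str:
        return f"# {stage}\n\n{prompt}\n"


def execute_stage(adapter: RuntimeAdapter, stage: str, prompt: str, workspace: Path) -> Path:
    """Core-side step: ask the adapter for output and persist it as a Markdown document."""
    output = workspace / stage / "output.md"
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(adapter.run_stage(stage, prompt), encoding="utf-8")
    return output


with tempfile.TemporaryDirectory() as tmp:
    path = execute_stage(EchoAdapter(), "plan", "Draft a plan.", Path(tmp))
    text = path.read_text(encoding="utf-8")

print(text)
```

Because `execute_stage` depends only on the protocol, swapping in another adapter would not require changes on the core side.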

The key architecture documents are:

  • docs/architecture/target-architecture.md
  • docs/architecture/adapter-protocol.md
  • docs/architecture/document-contracts.md
  • docs/architecture/runtime-matrix.md
  • docs/architecture/eval-harness-integration.md
  • docs/architecture/distribution-and-development.md
  • docs/architecture/operator-frontend.md
  • docs/architecture/project-set-workspace.md

What is in this repository today

This repository includes:

  • root product and contributor documentation,
  • a wave/epic/slice/local-task roadmap and active backlog queue,
  • stage and document contracts,
  • stage prompt packs,
  • .agents/skills/ for Codex-style development workflows,
  • deterministic and live scenario manifests,
  • CI and release workflows,
  • an installable Python package and CLI,
  • runtime adapters, validators, core orchestration, run inspection, harnesses, and eval reports.

The following parts are still intentionally in progress:

  • live interview parity on installed public-repository scenarios,
  • broader installed live lane coverage beyond the first canonical scenario.

Those gaps are deliberate current scope boundaries, not absent foundations.

Live E2E remains available as a manual external-audit system, but it is no longer part of CI or release gating.

Installation from source

Prerequisites

  • Python 3.12+
  • uv
  • provider CLIs you want to run or probe, such as Claude Code, Codex, or OpenCode
  • provider authentication already configured outside AIDD
  • optional AIDD-compatible wrapper commands for advanced adapter-flags mode

Bootstrap the repo locally

uv sync --extra dev
uv run aidd --help
uv run aidd doctor
uv run --extra dev pytest -q

Create a starter workspace

uv run aidd init --work-item WI-001

This creates a local .aidd/ workspace tree with stage directories and placeholder artifacts.

Supported Local Operator Path

The product operator path starts from a local project root. Install or run AIDD locally, then enter the target project directory before creating workflow state.

From an installed command:

cd /path/to/local-project
aidd doctor --config /path/to/aidd.example.toml
aidd init --work-item WI-001 --root .aidd
aidd run --work-item WI-001 --runtime generic-cli --root .aidd --config /path/to/aidd.example.toml
aidd ui --work-item WI-001 --root .aidd --config /path/to/aidd.example.toml

From a source checkout without installing globally, replace aidd with uv tool run --from /path/to/ai_driven_dev_v2 aidd.

Inspect local workflow evidence with either the UI or the CLI:

aidd run show --work-item WI-001 --root .aidd
aidd run logs --work-item WI-001 --stage plan --root .aidd
aidd run artifacts --work-item WI-001 --stage plan --root .aidd

The .aidd/ directory stays inside the local project root. Treat it as project-local operator state and do not commit it unless a separate repository policy explicitly says so.

aidd init --github-issue <url> is out of product scope. Public GitHub repositories are live E2E targets and support/reporting evidence sources only, not a product intake path.

Planned distribution channels

The intended release channels are:

  • PyPI for pipx install ai-driven-dev-v2
  • uv tool install ai-driven-dev-v2
  • container images such as ghcr.io/grinrus/ai-driven-dev-v2
  • source checkout for contributors and CI

Runtime binaries remain external dependencies. AIDD does not bundle Claude Code, Codex, OpenCode, or other runtimes.

For workflow or stage execution, Codex and OpenCode default to native provider CLI execution. Advanced operators can still configure an AIDD-compatible wrapper command with mode = "adapter-flags" when they need a custom execution surface.

For manual live E2E, the canonical operator-audit path is:

  • build a local wheel from the current checkout;
  • install it with uv tool;
  • enter the pinned target repository;
  • run installed aidd there with .aidd/ rooted inside that repository.

Container image tagging rules for release tags:

  • publish vX.Y.Z, vX.Y, and vX;
  • publish sha-<git-sha> for traceability;
  • publish latest only for stable tags without prerelease suffixes.
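
The tagging rules above could be expressed roughly as follows. This is a sketch under assumptions: `container_tags` is a hypothetical helper, the full release tag is published as-is, and the floating vX.Y and vX tags are derived for every release tag because the rules do not say whether prereleases should skip them.

```python
import re

# v<major>.<minor>.<patch> followed by an optional prerelease or other suffix.
RELEASE_TAG = re.compile(
    r"^v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(?P<suffix>.*)$"
)


def container_tags(release_tag: str, git_sha: str) -> list[str]:
    """Derive image tags for one release tag, following the rules listed above."""
    match = RELEASE_TAG.match(release_tag)
    if match is None:
        raise ValueError(f"not a release tag: {release_tag!r}")
    major, minor = match["major"], match["minor"]
    tags = [release_tag, f"v{major}.{minor}", f"v{major}", f"sha-{git_sha}"]
    if not match["suffix"]:
        tags.append("latest")  # stable tags only
    return tags


print(container_tags("v1.2.3", "abc1234"))
# ['v1.2.3', 'v1.2', 'v1', 'sha-abc1234', 'latest']
```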

PyPI publishing tag rules:

  • tag format must be v<major>.<minor>.<patch> with optional PEP 440 suffix (aN, bN, rcN, .postN, .devN);
  • release tag must exactly match v<project.version> from pyproject.toml;
  • tag-triggered publish jobs fail fast when tag format or tag/version alignment is invalid.

Quickstart

# Install the local development environment
uv sync --extra dev

# Inspect runtime availability from local config
uv run aidd doctor

# Create a work-item workspace
uv run aidd init --work-item WI-001

# Read the roadmap before implementing
sed -n '1,200p' docs/backlog/roadmap.md

# Run the smoke tests
uv run --extra dev pytest -q

Current CLI surface

The CLI exposes the current product surface:

aidd doctor
aidd init --work-item WI-001
aidd run --work-item WI-001 --runtime generic-cli
aidd stage run plan --work-item WI-001 --runtime generic-cli
aidd eval run harness/scenarios/live/typer-styled-help-alignment.yaml --runtime codex

Today:

  • doctor is functional,
  • init is functional,
  • run executes workflow progression for generic-cli, claude-code, codex, and opencode,
  • stage run executes single-stage orchestration for generic-cli, claude-code, codex, and opencode,
  • run and stage run fail fast for unknown runtime ids with unsupported-runtime classification,
  • eval run executes the harness lifecycle and writes result bundles (summary.md, verdict.md, runtime.log, validator artifacts, stage-timing.md, and self-repair-matrix.md),
  • live eval run scenarios under harness/scenarios/live/ install a local wheel via uv tool, run AIDD from the target repository root, and use maintained live providers only (codex, opencode, and the claude-code smoke lane in Wave 13).

Operator documentation

For installation, diagnostics, and issue reporting workflows, use:

  • docs/operator-handbook.md
  • docs/operator-troubleshooting.md
  • docs/operator-support-policy.md

Live E2E catalog

The repository includes a curated live E2E set built on public GitHub repositories.

In this repository, live E2E means a manual installed-operator audit, not a CI or release lane and not the same thing as smoke or adapter conformance.

Repository set:

  • fastapi/typer
  • encode/httpx
  • simonw/sqlite-utils
  • honojs/hono

See:

  • docs/e2e/live-e2e-catalog.md
  • docs/e2e/scenario-matrix.md
  • harness/scenarios/live/

How to develop this project

Read in this order:

  1. AGENTS.md
  2. docs/product/user-stories.md
  3. docs/backlog/roadmap.md
  4. docs/architecture/target-architecture.md
  5. the nearest nested AGENTS.md
  6. the relevant skill in .agents/skills/

Then use the standard loop:

uv sync --extra dev
uv run --extra dev ruff check .
uv run --extra dev python -m mypy src
uv run --extra dev pytest -q

Repository map

  • src/aidd/ — Python package with core orchestration, adapters, validators, CLI, harness, and evals
  • contracts/ — stage and document contracts
  • prompt-packs/ — file-based stage prompts
  • docs/product/ — product framing and user stories
  • docs/architecture/ — fixed technical decisions and protocols
  • docs/e2e/ — live E2E catalog
  • docs/backlog/ — roadmap and active backlog
  • harness/scenarios/ — smoke and live scenario manifests
  • .agents/skills/ — reusable team skills for Codex-style development
  • tests/ — deterministic unit, integration, docs, adapter, harness, and eval checks
  • MANIFEST.md — historical archive contents snapshot, not the current source-of-truth inventory

Roadmap

The canonical plan lives in docs/backlog/roadmap.md.

The short actionable queue lives in docs/backlog/backlog.md.

Compatibility policy

Compatibility guarantees for Python versions and operating platforms live in:

  • docs/compatibility-policy.md

Contributing

See CONTRIBUTING.md.

The short version:

  • pick a local task from the backlog,
  • keep the change aligned with the user stories,
  • update docs/contracts/prompts when behavior changes,
  • keep the core runtime-agnostic,
  • run the smallest relevant checks before opening a PR.

License

This project is licensed under the Apache License 2.0. See LICENSE.
