Deterministic validation for AI agent task completion.

DoneSpec

Done means deterministically verified.

DoneSpec is a tiny CLI for validating whether an AI coding agent actually completed a task.

It reads a machine-readable done.json, executes deterministic checks, and exits with 0 only when the task is verifiably complete.

pip install donespec
donespec validate done.json

DoneSpec validation: fix-auth-bug

✓ npm tests passed  (1242.7ms)
✓ auth.ts exists  (0.2ms)
✓ returnTo is implemented  (0.5ms)
✗ shared types untouched  (12.1ms)
  Forbidden path modified: src/types.ts

Validation failed.
1 check failed.
Exit code: 1

1. Problem

AI coding agents are increasingly good at producing code, but they still frequently claim work is complete when it is not.

Common failures:

  • tests were not run,
  • tests fail,
  • files were modified incorrectly,
  • requirements were partially implemented,
  • forbidden paths were touched,
  • expected outputs were hallucinated,
  • runtime invariants were broken.

Humans are left reading optimistic summaries instead of deterministic evidence.

DoneSpec gives every task a local, reproducible completion contract.

2. Why AI agents fail

AI agents optimize for plausible task completion. Software delivery requires verified task completion.

An agent can say:

I fixed the auth bug and all tests pass.

DoneSpec asks:

  • Did the configured command exit successfully?
  • Does the expected file exist?
  • Does the required implementation marker exist?
  • Was a forbidden file modified?
  • Does the HTTP endpoint return the expected status?

No vibes. No screenshots. No hidden judgement. Just deterministic checks.

3. What DoneSpec solves

DoneSpec introduces a small validation layer between AI coding agents and human trust.

It provides:

  • a machine-readable done.json,
  • deterministic local checks,
  • CI/CD integration,
  • structured JSON output,
  • a checker registry for future extension,
  • zero LLM dependency in the validation path.

DoneSpec is not an AI wrapper, chatbot, dashboard, or orchestration system.

It is developer infrastructure.
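The "checker registry" idea can be pictured as a lookup from a check's "type" string to a function. The sketch below is only an illustration of that pattern under stated assumptions: the names CHECKERS, register, and run_check are hypothetical and are not taken from DoneSpec's source.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical registry: maps a check "type" string to a checker function.
CHECKERS: Dict[str, Callable[[dict], bool]] = {}


def register(check_type: str):
    """Decorator that files a checker function under its "type" string."""

    def decorator(func: Callable[[dict], bool]) -> Callable[[dict], bool]:
        CHECKERS[check_type] = func
        return func

    return decorator


@register("file_exists")
def check_file_exists(check: dict) -> bool:
    # Passes when the path named in the check exists (file or directory).
    return Path(check["path"]).exists()


def run_check(check: dict) -> bool:
    """Dispatch one check dict to the checker registered for its type."""
    try:
        checker = CHECKERS[check["type"]]
    except KeyError:
        raise ValueError(f"unknown check type: {check['type']!r}") from None
    return checker(check)
```

With this shape, adding a new check type means registering one more function, and the dispatch code stays unchanged.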

4. Installation

pip

pip install donespec

pipx

pipx install donespec

uv

uv tool install donespec

from source

git clone https://github.com/xryv/DoneSpec.git
cd DoneSpec
python -m pip install -e ".[dev]"

Verify:

donespec --version

Initialize a DoneSpec-ready project

donespec init --yes
donespec validate done.json

This creates a starter done.json, agent instructions, VS Code tasks, and optional Git hook files.

Read the full guide:

docs/init.md

Inspect project readiness

donespec doctor

Use doctor to check whether a repository has the expected DoneSpec files, agent instructions, editor tasks, hooks, and CI integration.

Machine-readable output:

donespec doctor --json

Read the full guide:

docs/doctor.md

Use starter templates

donespec templates
donespec init --template python --yes

DoneSpec includes starter templates for generic, Python, Node.js, documentation, and API projects.

Read the full guide:

docs/templates.md

5. Quick example

Create done.json:

{
  "version": "1.0",
  "task_id": "fix-auth-bug",
  "must_pass": [
    {
      "type": "command",
      "name": "npm tests passed",
      "run": "npm test"
    },
    {
      "type": "file_exists",
      "name": "auth.ts exists",
      "path": "src/auth.ts"
    },
    {
      "type": "regex_in_file",
      "name": "returnTo is implemented",
      "path": "src/auth.ts",
      "pattern": "returnTo"
    }
  ],
  "must_not": [
    {
      "type": "file_not_modified",
      "name": "shared types untouched",
      "path": "src/types.ts"
    }
  ]
}
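If the spec is produced by a script rather than written by hand, the same structure can be emitted with the standard json module. This is a minimal sketch; build_spec and write_spec are hypothetical helper names, and the field values simply mirror the example above.

```python
import json
from pathlib import Path


def build_spec(task_id: str) -> dict:
    """Return a spec dict with the same shape as the done.json example."""
    return {
        "version": "1.0",
        "task_id": task_id,
        "must_pass": [
            {"type": "command", "name": "npm tests passed", "run": "npm test"},
            {"type": "file_exists", "name": "auth.ts exists", "path": "src/auth.ts"},
            {
                "type": "regex_in_file",
                "name": "returnTo is implemented",
                "path": "src/auth.ts",
                "pattern": "returnTo",
            },
        ],
        "must_not": [
            {
                "type": "file_not_modified",
                "name": "shared types untouched",
                "path": "src/types.ts",
            }
        ],
    }


def write_spec(spec: dict, path: str = "done.json") -> None:
    """Serialize the spec as indented JSON with a trailing newline."""
    Path(path).write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
```

Calling write_spec(build_spec("fix-auth-bug")) writes a done.json equivalent to the one shown above.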

Run:

donespec validate done.json

Machine-readable output:

donespec validate done.json --json

6. CLI usage

donespec validate done.json

Options:

--json       Emit machine-readable JSON output.
--root PATH  Project root. Defaults to the spec file directory.
--fail-fast  Stop after first failed check.
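To make the --fail-fast behavior concrete, here is a rough sketch of a check loop that stops at the first failure. It is an assumption about the general shape of such a loop, not DoneSpec's engine; run_checks and its (name, callable) input format are hypothetical.

```python
from typing import Callable, List, Tuple


def run_checks(
    checks: List[Tuple[str, Callable[[], bool]]],
    fail_fast: bool = False,
) -> List[Tuple[str, bool]]:
    """Run checks in order and collect (name, passed) pairs.

    With fail_fast=True the loop stops right after the first failed check,
    so later checks are never executed.
    """
    results = []
    for name, check in checks:
        passed = check()
        results.append((name, passed))
        if fail_fast and not passed:
            break
    return results
```

Without fail-fast every check runs and reports, which gives a fuller picture in one pass; with it, slow checks after a failure are skipped.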

Exit codes:

Code  Meaning
0     Validation passed
1     Validation failed
2     Invalid spec or runtime error
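A wrapper script can branch on these exit codes. The sketch below uses only subprocess from the standard library; describe_exit and run_validation are hypothetical helper names, and the command to run is passed in by the caller.

```python
import subprocess
import sys
from typing import List, Tuple

# Meanings copied from the exit code table above.
EXIT_MEANINGS = {
    0: "Validation passed",
    1: "Validation failed",
    2: "Invalid spec or runtime error",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"Unexpected exit code {code}")


def run_validation(argv: List[str]) -> Tuple[int, str]:
    """Run a command and return its exit code with a readable meaning."""
    completed = subprocess.run(argv, check=False)
    return completed.returncode, describe_exit(completed.returncode)


# Example call, assuming donespec is on PATH:
#   code, meaning = run_validation(["donespec", "validate", "done.json"])
#   print(meaning)
#   sys.exit(code)
```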

Supported checks

command

Runs a shell command and validates the exit code.

{
  "type": "command",
  "name": "tests pass",
  "run": "pytest",
  "expected_exit_code": 0,
  "timeout_seconds": 120
}
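Conceptually, a command check runs the command, waits up to the timeout, and compares the exit code with the expected one. The following is a sketch of that idea and not DoneSpec's implementation; run_command_check is a hypothetical name, and the defaults used when expected_exit_code or timeout_seconds are omitted are assumptions.

```python
import subprocess
from typing import Tuple


def run_command_check(check: dict, cwd: str = ".") -> Tuple[bool, str]:
    """Run check["run"] through the shell and compare its exit code."""
    expected = check.get("expected_exit_code", 0)  # assumed default
    timeout = check.get("timeout_seconds")  # assumed default: no timeout
    try:
        completed = subprocess.run(
            check["run"],
            shell=True,
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    if completed.returncode == expected:
        return True, "ok"
    return False, f"exit code {completed.returncode}, expected {expected}"
```

Because the command goes through a shell, specs should come from a trusted source, just like any other script in the repository.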

file_exists

Checks that a file or directory exists.

{
  "type": "file_exists",
  "path": "src/auth.ts"
}

regex_in_file

Checks that a regex pattern matches somewhere in a file.

{
  "type": "regex_in_file",
  "path": "src/auth.ts",
  "pattern": "returnTo"
}

Optional flags:

{
  "flags": ["IGNORECASE", "MULTILINE", "DOTALL"]
}
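The flag names correspond to constants in Python's re module. The sketch below shows one way such a check could be evaluated, assuming the flags map directly onto re.IGNORECASE, re.MULTILINE, and re.DOTALL; regex_in_file here is a hypothetical function, not DoneSpec's code.

```python
import re
from pathlib import Path
from typing import Iterable

# Assumed mapping from flag names in the spec to re module constants.
FLAG_MAP = {
    "IGNORECASE": re.IGNORECASE,
    "MULTILINE": re.MULTILINE,
    "DOTALL": re.DOTALL,
}


def regex_in_file(path: str, pattern: str, flags: Iterable[str] = ()) -> bool:
    """Return True when the pattern matches anywhere in the file."""
    combined = 0
    for name in flags:
        combined |= FLAG_MAP[name]  # KeyError on an unsupported flag name
    text = Path(path).read_text(encoding="utf-8")
    return re.search(pattern, text, combined) is not None
```

For instance, the pattern "returnTo" would not match a file containing only "RETURNTO" unless "IGNORECASE" is listed.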

regex_absent

Checks that a regex pattern matches nowhere in a file.

{
  "type": "regex_absent",
  "path": "src/auth.ts",
  "pattern": "console\\.log"
}
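The doubled backslash in the pattern is JSON escaping: after parsing, the regex engine sees console\.log, which matches a literal dot. A small sketch of that, where regex_absent is a hypothetical helper operating on a string rather than a file:

```python
import json
import re

# The JSON text as it appears in done.json (raw string keeps both backslashes).
raw = r'{"type": "regex_absent", "path": "src/auth.ts", "pattern": "console\\.log"}'
check = json.loads(raw)

# After JSON decoding, the pattern contains a single backslash: console\.log
pattern = check["pattern"]


def regex_absent(text: str, pattern: str) -> bool:
    """Return True when the pattern matches nowhere in the text."""
    return re.search(pattern, text) is None
```

Without the escape, the dot would match any character, so a string such as "consoleXlog" would count as a match too.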

file_not_modified

Checks Git status to ensure a file or path was not modified, staged, deleted, or newly added.

{
  "type": "file_not_modified",
  "path": "src/types.ts"
}
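One way to picture this check is via git status --porcelain, which prints one line per changed path and nothing for a clean path. The sketch below is an illustration under that assumption, not DoneSpec's actual implementation; changed_entries and is_untouched are hypothetical names, and quoted or renamed paths are not handled.

```python
import subprocess
from typing import List, Tuple


def changed_entries(porcelain: str) -> List[Tuple[str, str]]:
    """Parse `git status --porcelain` output into (status, path) pairs.

    Each line is a two-character status code, a space, then the path.
    """
    entries = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        entries.append((line[:2], line[3:]))
    return entries


def is_untouched(path: str, root: str = ".") -> bool:
    """Return True when Git reports no change under the given path."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--", path],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,
    )
    return not changed_entries(result.stdout)
```

In this format " M" marks an unstaged modification, "M " a staged one, and "??" an untracked file, which lines up with the modified, staged, and newly added cases listed above.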

http_check

Makes an HTTP request and validates the response status.

{
  "type": "http_check",
  "url": "http://127.0.0.1:8000/health",
  "method": "GET",
  "expected_status": 200,
  "timeout_seconds": 3
}
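The same kind of request can be reproduced with urllib from the standard library, which is handy when debugging a failing http_check by hand. This is a sketch only: the http_check function and its defaults are assumptions modeled on the fields above, not DoneSpec's code.

```python
import urllib.error
import urllib.request
from typing import Tuple


def http_check(
    url: str,
    method: str = "GET",
    expected_status: int = 200,
    timeout_seconds: float = 3.0,
) -> Tuple[bool, str]:
    """Send one request and compare the response status with the expected one."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx, but the status code is still available.
        status = exc.code
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, timeout, and similar errors.
        return False, f"request failed: {exc}"
    return status == expected_status, f"status {status}"
```

A check like this only passes if the service is already running when validation starts, so it is usually paired with a step that starts the server first.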

7. CI integration

GitHub Actions

name: DoneSpec

on:
  pull_request:

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: xryv/DoneSpec@v0.3.0
        with:
          spec: done.json
          root: .

The action installs DoneSpec, runs validation, and fails CI if any check fails.

Direct CI command

- run: python -m pip install donespec
- run: donespec validate done.json

Agent integrations

DoneSpec includes a practical multi-agent workflow for Codex, Claude Code, VS Code tasks, Git hooks, and CI enforcement.

Read the full guide:

docs/agent-integration.md

8. Philosophy

DoneSpec is intentionally boring.

It should feel closer to ESLint, Prettier, pytest, or package.json scripts than to an AI platform.

Principles:

  • local first,
  • deterministic only,
  • minimal dependencies,
  • no accounts,
  • no dashboard in the MVP,
  • no LLM calls in validation,
  • composable with any agent,
  • simple enough for solo developers,
  • strict enough for CI.

AI agents may generate work. DoneSpec verifies completion.

9. Roadmap

DoneSpec is designed for future extension, but the MVP stays small.

Planned directions:

  • more deterministic checkers,
  • generated done.json templates,
  • MCP server integration,
  • AI agent integrations,
  • VS Code extension,
  • cloud dashboard,
  • analytics,
  • multi-agent validation.

Not in the MVP:

  • authentication,
  • database,
  • SaaS dashboard,
  • orchestration system,
  • LLM dependency.

Repository layout

.
├── action.yml
├── done.json
├── done.schema.json
├── docs/
├── examples/
├── pyproject.toml
├── src/
│   └── donespec/
│       ├── cli.py
│       ├── engine.py
│       ├── loader.py
│       ├── models.py
│       ├── output.py
│       ├── schema.py
│       └── checkers/
└── tests/

Development

python -m pip install -e ".[dev]"
ruff check .
ruff format .
pytest
donespec validate done.json

Release

DoneSpec v0.3.0 is available on PyPI:

pip install donespec

GitHub release:

https://github.com/xryv/DoneSpec/releases/tag/v0.3.0

PyPI package:

https://pypi.org/project/donespec/

License

MIT
