Deterministic validation for AI agent task completion.

DoneSpec

Done means deterministically verified.

DoneSpec is a tiny CLI for validating whether an AI coding agent actually completed a task.

It reads a machine-readable done.json, executes deterministic checks, and exits with 0 only when the task is verifiably complete.

pip install donespec
donespec validate done.json

DoneSpec validation: fix-auth-bug

✓ npm tests passed  (1242.7ms)
✓ auth.ts exists  (0.2ms)
✓ returnTo is implemented  (0.5ms)
✗ shared types untouched  (12.1ms)
  Forbidden path modified: src/types.ts

Validation failed.
1 check failed.
Exit code: 1

1. Problem

AI coding agents are increasingly good at producing code, but they still frequently claim work is complete when it is not.

Common failures:

  • tests were not run,
  • tests fail,
  • files were modified incorrectly,
  • requirements were partially implemented,
  • forbidden paths were touched,
  • expected outputs were hallucinated,
  • runtime invariants were broken.

Humans are left reading optimistic summaries instead of deterministic evidence.

DoneSpec gives every task a local, reproducible completion contract.

2. Why AI agents fail

AI agents optimize for plausible task completion. Software delivery requires verified task completion.

An agent can say:

I fixed the auth bug and all tests pass.

DoneSpec asks:

  • Did the configured command exit successfully?
  • Does the expected file exist?
  • Does the required implementation marker exist?
  • Was a forbidden file modified?
  • Does the HTTP endpoint return the expected status?

No vibes. No screenshots. No hidden judgement. Just deterministic checks.

3. What DoneSpec solves

DoneSpec introduces a small validation layer between AI coding agents and human trust.

It provides:

  • a machine-readable done.json,
  • deterministic local checks,
  • CI/CD integration,
  • structured JSON output,
  • a checker registry for future extension,
  • zero LLM dependency in the validation path.

DoneSpec is not an AI wrapper, chatbot, dashboard, or orchestration system.

It is developer infrastructure.

4. Installation

pip

pip install donespec

pipx

pipx install donespec

uv

uv tool install donespec

from source

git clone https://github.com/xryv/DoneSpec.git
cd DoneSpec
python -m pip install -e ".[dev]"

Verify:

donespec --version

5. Quick example

Create done.json:

{
  "version": "1.0",
  "task_id": "fix-auth-bug",
  "must_pass": [
    {
      "type": "command",
      "name": "npm tests passed",
      "run": "npm test"
    },
    {
      "type": "file_exists",
      "name": "auth.ts exists",
      "path": "src/auth.ts"
    },
    {
      "type": "regex_in_file",
      "name": "returnTo is implemented",
      "path": "src/auth.ts",
      "pattern": "returnTo"
    }
  ],
  "must_not": [
    {
      "type": "file_not_modified",
      "name": "shared types untouched",
      "path": "src/types.ts"
    }
  ]
}

Run:

donespec validate done.json

Machine-readable output:

donespec validate done.json --json
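
The spec above can also be generated by a script, for example when a task runner creates one done.json per task. The following is a minimal sketch that uses only the field names shown in the example; build_spec and write_spec are hypothetical helpers, not part of DoneSpec.

```python
import json
from pathlib import Path


def build_spec(task_id: str, must_pass: list, must_not: list) -> dict:
    """Assemble a spec dict with the same top-level keys as the example above."""
    return {
        "version": "1.0",
        "task_id": task_id,
        "must_pass": must_pass,
        "must_not": must_not,
    }


def write_spec(spec: dict, path: str = "done.json") -> Path:
    """Write the spec as indented JSON and return the path written."""
    target = Path(path)
    target.write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
    return target


# Same checks as the hand-written example, minus the regex check for brevity.
spec = build_spec(
    task_id="fix-auth-bug",
    must_pass=[
        {"type": "command", "name": "npm tests passed", "run": "npm test"},
        {"type": "file_exists", "name": "auth.ts exists", "path": "src/auth.ts"},
    ],
    must_not=[
        {"type": "file_not_modified", "name": "shared types untouched", "path": "src/types.ts"},
    ],
)
```

Calling write_spec(spec) would then produce a file that donespec validate done.json can read.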

6. CLI usage

donespec validate done.json

Options:

--json       Emit machine-readable JSON output.
--root PATH  Project root. Defaults to the spec file directory.
--fail-fast  Stop after first failed check.

Exit codes:

Code  Meaning
0     Validation passed
1     Validation failed
2     Invalid spec or runtime error
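
To make the relationship between checks, --fail-fast, and the exit codes concrete, here is a minimal sketch of a registry-driven validation loop. It is an illustration under stated assumptions, not DoneSpec's actual engine: REGISTRY, register, validate, and the always_pass and always_fail check types are all hypothetical, and treating an unknown check type as exit code 2 is a guess at what "invalid spec" covers.

```python
from typing import Callable, Dict, List, Tuple

# A checker takes one check dict from the spec and returns (passed, detail).
Checker = Callable[[dict], Tuple[bool, str]]

REGISTRY: Dict[str, Checker] = {}


def register(check_type: str) -> Callable[[Checker], Checker]:
    """Decorator that files a checker under its "type" string."""
    def wrap(fn: Checker) -> Checker:
        REGISTRY[check_type] = fn
        return fn
    return wrap


@register("always_pass")  # hypothetical type, for illustration only
def always_pass(check: dict) -> Tuple[bool, str]:
    return True, ""


@register("always_fail")  # hypothetical type, for illustration only
def always_fail(check: dict) -> Tuple[bool, str]:
    return False, "forced failure"


def validate(checks: List[dict], fail_fast: bool = False):
    """Run checks in order and return (exit_code, results)."""
    results = []
    for check in checks:
        checker = REGISTRY.get(check.get("type", ""))
        if checker is None:
            return 2, results  # assumption: unknown type counts as an invalid spec
        passed, detail = checker(check)
        results.append((check.get("name", check["type"]), passed, detail))
        if not passed and fail_fast:
            break  # stop after the first failed check
    exit_code = 0 if all(passed for _, passed, _ in results) else 1
    return exit_code, results
```

With fail_fast=True the loop stops at the first failure, so later checks never run and do not appear in the results.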

Supported checks

command

Runs a shell command and validates the exit code.

{
  "type": "command",
  "name": "tests pass",
  "run": "pytest",
  "expected_exit_code": 0,
  "timeout_seconds": 120
}
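
A check of this kind can be approximated with the standard library. The sketch below is not DoneSpec's implementation; run_command_check is a hypothetical name, and its defaults simply mirror the values in the example above, since the real defaults are not stated here.

```python
import subprocess
from typing import Optional, Tuple


def run_command_check(
    run: str,
    expected_exit_code: int = 0,
    timeout_seconds: float = 120,
    cwd: Optional[str] = None,
) -> Tuple[bool, str]:
    """Run a shell command and compare its exit code with the expected one."""
    try:
        completed = subprocess.run(
            run,
            shell=True,  # the spec holds a shell command string
            cwd=cwd,
            timeout=timeout_seconds,
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_seconds}s"
    if completed.returncode == expected_exit_code:
        return True, ""
    return False, f"exit code {completed.returncode}, expected {expected_exit_code}"
```

Because the command runs through a shell, the spec file should be treated as trusted input, the same way a Makefile or package.json script would be.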

file_exists

Checks that a file or directory exists.

{
  "type": "file_exists",
  "path": "src/auth.ts"
}

regex_in_file

Checks that a regex matches somewhere in a file.

{
  "type": "regex_in_file",
  "path": "src/auth.ts",
  "pattern": "returnTo"
}

Optional flags:

{
  "flags": ["IGNORECASE", "MULTILINE", "DOTALL"]
}

regex_absent

Checks that a regex does not match anywhere in a file.

{
  "type": "regex_absent",
  "path": "src/auth.ts",
  "pattern": "console\\.log"
}
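
Both regex checks reduce to one search with opposite pass conditions. The following sketch assumes the flag names map directly onto Python's re module constants, which the names suggest but the text does not confirm; pattern_in_text and pattern_in_file are hypothetical helpers.

```python
import re
from pathlib import Path
from typing import Iterable

# Assumed mapping from the spec's flag names to Python's re flags.
FLAG_NAMES = {
    "IGNORECASE": re.IGNORECASE,
    "MULTILINE": re.MULTILINE,
    "DOTALL": re.DOTALL,
}


def pattern_in_text(text: str, pattern: str, flags: Iterable[str] = ()) -> bool:
    """Return True when the pattern matches anywhere in the text."""
    combined = 0
    for name in flags:
        combined |= FLAG_NAMES[name]  # KeyError on an unknown flag name
    return re.search(pattern, text, combined) is not None


def pattern_in_file(path: str, pattern: str, flags: Iterable[str] = ()) -> bool:
    """Read the file as UTF-8 and search its whole content."""
    return pattern_in_text(Path(path).read_text(encoding="utf-8"), pattern, flags)
```

Under this sketch, regex_in_file passes when pattern_in_file returns True, and regex_absent passes when it returns False.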

file_not_modified

Checks Git status to ensure a file or path was not modified, staged, deleted, or newly added.

{
  "type": "file_not_modified",
  "path": "src/types.ts"
}
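
One way to implement such a check is to ask Git for its porcelain status restricted to the path. This is a sketch of that approach, not a description of how DoneSpec does it; parse_porcelain and path_is_untouched are hypothetical names, and the parser ignores edge cases such as quoted paths.

```python
import subprocess
from typing import List, Tuple


def parse_porcelain(output: str) -> List[Tuple[str, str]]:
    """Split `git status --porcelain` (v1) output into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Format: two status characters, one space, then the path.
        entries.append((line[:2], line[3:]))
    return entries


def path_is_untouched(path: str, root: str = ".") -> bool:
    """True when Git reports no changes for the path (needs a Git work tree)."""
    completed = subprocess.run(
        ["git", "status", "--porcelain", "--", path],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return not parse_porcelain(completed.stdout)
```

Porcelain output lists modified, staged, deleted, and untracked entries, so any line for the path means it was touched.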

http_check

Makes an HTTP request and validates the response status.

{
  "type": "http_check",
  "url": "http://127.0.0.1:8000/health",
  "method": "GET",
  "expected_status": 200,
  "timeout_seconds": 3
}
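
The same kind of check can be written with urllib from the standard library. The sketch below is an assumption about reasonable behavior rather than DoneSpec's code: http_check here is a hypothetical function whose defaults mirror the example values, and it reports connection errors as a failed check.

```python
import urllib.error
import urllib.request
from typing import Tuple


def http_check(
    url: str,
    method: str = "GET",
    expected_status: int = 200,
    timeout_seconds: float = 3.0,
) -> Tuple[bool, str]:
    """Send one request and compare the response status with the expected one."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # urllib raises for 4xx and 5xx responses
    except OSError as error:
        # Covers refused connections, DNS failures, and timeouts.
        return False, f"request failed: {error}"
    if status == expected_status:
        return True, ""
    return False, f"status {status}, expected {expected_status}"
```

Note that urlopen follows redirects by default, so this sketch sees the final status rather than an intermediate 301 or 302.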

7. CI integration

GitHub Actions

name: DoneSpec

on:
  pull_request:

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: xryv/DoneSpec@v0.1.1
        with:
          spec: done.json
          root: .

The action installs DoneSpec, runs validation, and fails CI if any check fails.

Direct CI command

- run: python -m pip install donespec
- run: donespec validate done.json
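
On CI systems driven by a script rather than YAML steps, the same command can be invoked from Python. This sketch uses only the subcommand, options, and exit codes documented above; build_argv, run_donespec, and EXIT_MEANINGS are hypothetical names, and it assumes the donespec executable is on PATH.

```python
import subprocess
from typing import List, Optional, Tuple

# Exit codes as listed in the CLI usage section.
EXIT_MEANINGS = {
    0: "Validation passed",
    1: "Validation failed",
    2: "Invalid spec or runtime error",
}


def build_argv(
    spec: str = "done.json",
    root: Optional[str] = None,
    json_output: bool = False,
    fail_fast: bool = False,
) -> List[str]:
    """Build the donespec command line from the documented options."""
    argv = ["donespec", "validate", spec]
    if json_output:
        argv.append("--json")
    if root is not None:
        argv += ["--root", root]
    if fail_fast:
        argv.append("--fail-fast")
    return argv


def run_donespec(**options) -> Tuple[int, str]:
    """Run donespec and return (exit_code, stdout)."""
    completed = subprocess.run(build_argv(**options), capture_output=True, text=True)
    return completed.returncode, completed.stdout
```

A gate script could call run_donespec(json_output=True), log EXIT_MEANINGS for the returned code, and pass that code to sys.exit so the pipeline fails whenever validation does.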

8. Philosophy

DoneSpec is intentionally boring.

It should feel closer to ESLint, Prettier, pytest, or package.json scripts than to an AI platform.

Principles:

  • local first,
  • deterministic only,
  • minimal dependencies,
  • no accounts,
  • no dashboard in the MVP,
  • no LLM calls in validation,
  • composable with any agent,
  • simple enough for solo developers,
  • strict enough for CI.

AI agents may generate work. DoneSpec verifies completion.

9. Roadmap

DoneSpec is designed for future extension, but the MVP stays small.

Planned directions:

  • more deterministic checkers,
  • generated done.json templates,
  • MCP server integration,
  • AI agent integrations,
  • VSCode extension,
  • cloud dashboard,
  • analytics,
  • multi-agent validation.

Not in the MVP:

  • authentication,
  • database,
  • SaaS dashboard,
  • orchestration system,
  • LLM dependency.

Repository layout

.
├── action.yml
├── done.json
├── done.schema.json
├── docs/
├── examples/
├── pyproject.toml
├── src/
│   └── donespec/
│       ├── cli.py
│       ├── engine.py
│       ├── loader.py
│       ├── models.py
│       ├── output.py
│       ├── schema.py
│       └── checkers/
└── tests/

Development

python -m pip install -e ".[dev]"
ruff check .
ruff format .
pytest
donespec validate done.json

Release

DoneSpec v0.1.1 is available on PyPI:

pip install donespec

GitHub release:

https://github.com/xryv/DoneSpec/releases/tag/v0.1.1

PyPI package:

https://pypi.org/project/donespec/

License

MIT
