Deterministic validation for AI agent task completion.
DoneSpec
Done means deterministically verified.
DoneSpec is a tiny CLI for validating whether an AI coding agent actually completed a task.
It reads a machine-readable done.json, executes deterministic checks, and exits with 0 only when the task is verifiably complete.
pip install donespec
donespec validate done.json
DoneSpec validation: fix-auth-bug
✓ npm tests passed (1242.7ms)
✓ auth.ts exists (0.2ms)
✓ returnTo is implemented (0.5ms)
✗ shared types untouched (12.1ms)
Forbidden path modified: src/types.ts
Validation failed.
1 check failed.
Exit code: 1
1. Problem
AI coding agents are increasingly good at producing code, but they still frequently claim work is complete when it is not.
Common failures:
- tests were not run,
- tests fail,
- files were modified incorrectly,
- requirements were partially implemented,
- forbidden paths were touched,
- expected outputs were hallucinated,
- runtime invariants were broken.
Humans are left reading optimistic summaries instead of deterministic evidence.
DoneSpec gives every task a local, reproducible completion contract.
2. Why AI agents fail
AI agents optimize for plausible task completion. Software delivery requires verified task completion.
An agent can say:
I fixed the auth bug and all tests pass.
DoneSpec asks:
- Did the configured command exit successfully?
- Does the expected file exist?
- Does the required implementation marker exist?
- Was a forbidden file modified?
- Does the HTTP endpoint return the expected status?
No vibes. No screenshots. No hidden judgement. Just deterministic checks.
3. What DoneSpec solves
DoneSpec introduces a small validation layer between AI coding agents and human trust.
It provides:
- a machine-readable done.json,
- deterministic local checks,
- CI/CD integration,
- structured JSON output,
- a checker registry for future extension,
- zero LLM dependency in the validation path.
DoneSpec is not an AI wrapper, chatbot, dashboard, or orchestration system.
It is developer infrastructure.
4. Installation
pip
pip install donespec
pipx
pipx install donespec
uv
uv tool install donespec
from source
git clone https://github.com/xryv/DoneSpec.git
cd DoneSpec
python -m pip install -e ".[dev]"
Verify:
donespec --version
initialize a DoneSpec-ready project
donespec init --yes
donespec validate done.json
This creates a starter done.json, agent instructions, VS Code tasks, and optional Git hook files.
Read the full guide:
[docs/init.md](docs/init.md)
Inspect project readiness
donespec doctor
Use doctor to check whether a repository has the expected DoneSpec files, agent instructions, editor tasks, hooks, and CI integration.
Machine-readable output:
donespec doctor --json
Read the full guide:
docs/doctor.md
5. Quick example
Create done.json:
{
  "version": "1.0",
  "task_id": "fix-auth-bug",
  "must_pass": [
    {
      "type": "command",
      "name": "npm tests passed",
      "run": "npm test"
    },
    {
      "type": "file_exists",
      "name": "auth.ts exists",
      "path": "src/auth.ts"
    },
    {
      "type": "regex_in_file",
      "name": "returnTo is implemented",
      "path": "src/auth.ts",
      "pattern": "returnTo"
    }
  ],
  "must_not": [
    {
      "type": "file_not_modified",
      "name": "shared types untouched",
      "path": "src/types.ts"
    }
  ]
}
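When a spec is produced by a script rather than written by hand, the same structure can be built as a plain dictionary and serialized. The following is a minimal sketch that uses only the fields shown in the example above; `build_spec` and `write_spec` are hypothetical helper names, not part of DoneSpec.

```python
import json
from pathlib import Path


def build_spec(task_id: str) -> dict:
    """Return a spec dict that mirrors the done.json example above."""
    return {
        "version": "1.0",
        "task_id": task_id,
        "must_pass": [
            {"type": "command", "name": "npm tests passed", "run": "npm test"},
            {"type": "file_exists", "name": "auth.ts exists", "path": "src/auth.ts"},
            {
                "type": "regex_in_file",
                "name": "returnTo is implemented",
                "path": "src/auth.ts",
                "pattern": "returnTo",
            },
        ],
        "must_not": [
            {
                "type": "file_not_modified",
                "name": "shared types untouched",
                "path": "src/types.ts",
            }
        ],
    }


def write_spec(spec: dict, path: Path) -> None:
    """Serialize the spec as indented JSON with a trailing newline."""
    path.write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
```

Calling `write_spec(build_spec("fix-auth-bug"), Path("done.json"))` should produce a file equivalent to the one above.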
Run:
donespec validate done.json
Machine-readable output:
donespec validate done.json --json
6. CLI usage
donespec validate done.json
Options:
--json        Emit machine-readable JSON output.
--root PATH   Project root. Defaults to the spec file directory.
--fail-fast   Stop after first failed check.
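To illustrate what stopping after the first failed check means in practice, here is a small sketch of a check loop. It is not DoneSpec's engine; `CheckResult` and `run_checks` are hypothetical names, and each check is modelled as a plain callable that returns a boolean.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    name: str
    passed: bool


def run_checks(
    checks: List[Tuple[str, Callable[[], bool]]],
    fail_fast: bool = False,
) -> List[CheckResult]:
    """Run checks in order; with fail_fast, stop after the first failure."""
    results: List[CheckResult] = []
    for name, check in checks:
        passed = check()
        results.append(CheckResult(name, passed))
        if fail_fast and not passed:
            break  # In this sketch, later checks are skipped, not marked failed.
    return results
```

Without `fail_fast`, every check runs and the full list of results is available for reporting; with it, a slow suite ends as soon as the outcome is known.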
Exit codes:
| Code | Meaning |
|---|---|
| 0 | Validation passed |
| 1 | Validation failed |
| 2 | Invalid spec or runtime error |
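Because the exit code is the contract, a wrapper script can gate on it directly. A minimal sketch, assuming `donespec` is installed and on PATH; `describe_exit` and `validate` are hypothetical helper names, and the meanings are copied from the table above.

```python
import subprocess

# Meanings taken from the exit code table above.
EXIT_MEANINGS = {
    0: "validation passed",
    1: "validation failed",
    2: "invalid spec or runtime error",
}


def describe_exit(code: int) -> str:
    """Translate a donespec exit code into a short human-readable meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def validate(spec_path: str = "done.json") -> int:
    """Run `donespec validate` and return its exit code."""
    completed = subprocess.run(["donespec", "validate", spec_path], check=False)
    return completed.returncode
```

A caller would typically run `code = validate()`, print `describe_exit(code)`, and then exit with the same code so that outer tooling sees the original result.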
Supported checks
command
Runs a shell command and validates the exit code.
{
  "type": "command",
  "name": "tests pass",
  "run": "pytest",
  "expected_exit_code": 0,
  "timeout_seconds": 120
}
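The idea behind a command check can be sketched with the standard library. This is an illustration, not DoneSpec's implementation: `run_command_check` is a hypothetical name, the default values simply echo the example above, and treating a timeout as a failed check is an assumption.

```python
import subprocess
from typing import Optional


def run_command_check(
    run: str,
    expected_exit_code: int = 0,
    timeout_seconds: float = 120,
    cwd: Optional[str] = None,
) -> bool:
    """Run a shell command and compare its exit code to the expected one."""
    try:
        completed = subprocess.run(
            run,
            shell=True,  # The spec gives a shell string, not an argv list.
            cwd=cwd,
            capture_output=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False  # Assumption: a timeout counts as a failed check.
    return completed.returncode == expected_exit_code
```

Note that `expected_exit_code` does not have to be 0: a check can just as well assert that a command fails with a specific code.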
file_exists
Checks that a file or directory exists.
{
  "type": "file_exists",
  "path": "src/auth.ts"
}
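The sketch below shows one way path resolution for this check could work. It assumes that a check's `path` is interpreted relative to the project root, which (per the `--root` option) defaults to the directory containing the spec file; `resolve_root` and `file_exists` are hypothetical names, not DoneSpec functions.

```python
from pathlib import Path
from typing import Optional


def resolve_root(spec_path: str, root: Optional[str] = None) -> Path:
    """Use the explicit root if given, else the directory holding the spec."""
    if root:
        return Path(root).resolve()
    return Path(spec_path).resolve().parent


def file_exists(root: Path, relative_path: str) -> bool:
    """True for files and directories alike."""
    return (root / relative_path).exists()
```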
regex_in_file
Checks that a regex matches somewhere in a file.
{
  "type": "regex_in_file",
  "path": "src/auth.ts",
  "pattern": "returnTo"
}
Optional flags:
{
  "flags": ["IGNORECASE", "MULTILINE", "DOTALL"]
}
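The flag names coincide with constants in Python's `re` module. Assuming they carry the same meaning (an assumption, not something confirmed here), their effect can be sketched as follows; `combine_flags` and `regex_in_file` are hypothetical helper names.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumed mapping from spec flag names to Python's re constants.
FLAG_NAMES = {
    "IGNORECASE": re.IGNORECASE,
    "MULTILINE": re.MULTILINE,
    "DOTALL": re.DOTALL,
}


def combine_flags(names: List[str]) -> int:
    """OR together the named flags; an unknown name raises KeyError."""
    combined = 0
    for name in names:
        combined |= FLAG_NAMES[name]
    return combined


def regex_in_file(path: Path, pattern: str, flags: Optional[List[str]] = None) -> bool:
    """True when the pattern matches anywhere in the file's text."""
    text = path.read_text(encoding="utf-8")
    return re.search(pattern, text, combine_flags(flags or [])) is not None
```

Under this reading, `IGNORECASE` makes `returnTo` match `RETURNTO`, `MULTILINE` lets `^` anchor at the start of any line, and `DOTALL` lets `.` match a newline.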
regex_absent
Checks that a regex matches nowhere in a file.
{
  "type": "regex_absent",
  "path": "src/auth.ts",
  "pattern": "console\\.log"
}
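The doubled backslash in the pattern is JSON escaping: once the spec is parsed, the regex engine sees a single backslash followed by a dot, that is, a literal dot. The short sketch below demonstrates this with Python's `json` and `re` modules; `regex_absent` here is a hypothetical stand-in, not DoneSpec's checker.

```python
import json
import re

# Raw string so the text contains the same two backslashes as the JSON file.
check = json.loads(
    r'{"type": "regex_absent", "path": "src/auth.ts", "pattern": "console\\.log"}'
)
pattern = check["pattern"]  # After JSON decoding: console\.log


def regex_absent(text: str, pattern: str) -> bool:
    """True when the pattern matches nowhere in the text."""
    return re.search(pattern, text) is None
```

With the escaped dot, `consoleXlog` does not count as a match; an unescaped `console.log` pattern would match it.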
file_not_modified
Checks Git status to ensure a file or path was not modified, staged, deleted, or newly added.
{
  "type": "file_not_modified",
  "path": "src/types.ts"
}
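One way to ask Git this question is `git status --porcelain -- <path>`, which prints one line per changed, staged, deleted, or untracked entry under the path and nothing when the path is clean. The sketch below takes that approach; whether DoneSpec uses this exact command is an assumption, and both function names are hypothetical.

```python
import subprocess


def porcelain_reports_changes(porcelain_output: str) -> bool:
    """Any non-empty line in `git status --porcelain` output means a change."""
    return any(line.strip() for line in porcelain_output.splitlines())


def path_is_unmodified(root: str, path: str) -> bool:
    """Ask Git whether anything under `path` is modified, staged, deleted, or untracked."""
    completed = subprocess.run(
        ["git", "status", "--porcelain", "--", path],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,  # Raises CalledProcessError outside a Git repository.
    )
    return not porcelain_reports_changes(completed.stdout)
```

Keeping the output parsing separate from the subprocess call makes the decision logic easy to test without a repository.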
http_check
Makes an HTTP request and validates the response status.
{
  "type": "http_check",
  "url": "http://127.0.0.1:8000/health",
  "method": "GET",
  "expected_status": 200,
  "timeout_seconds": 3
}
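A status check like this can be sketched with `urllib` from the standard library. This is an illustration under stated assumptions, not DoneSpec's implementation: `http_status` and `http_check` are hypothetical names, proxies are bypassed on the assumption that the target is a local service, redirects are followed (urllib's default), and connection errors or timeouts are treated as a failed check.

```python
import urllib.error
import urllib.request

# Assumption: checks target local services, so environment proxies are ignored.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_status(url: str, method: str = "GET", timeout_seconds: float = 3.0) -> int:
    """Return the HTTP status code, including 4xx/5xx responses."""
    request = urllib.request.Request(url, method=method)
    try:
        with _OPENER.open(request, timeout=timeout_seconds) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # urllib raises on 4xx/5xx; the code is still a result.


def http_check(
    url: str,
    method: str = "GET",
    expected_status: int = 200,
    timeout_seconds: float = 3.0,
) -> bool:
    """True when the endpoint answers with the expected status in time."""
    try:
        return http_status(url, method, timeout_seconds) == expected_status
    except OSError:
        return False  # Covers URLError, refused connections, and timeouts.
```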
7. CI integration
GitHub Actions
name: DoneSpec
on:
  pull_request:
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: xryv/DoneSpec@v0.2.0
        with:
          spec: done.json
          root: .
The action installs DoneSpec, runs validation, and fails CI if any check fails.
Direct CI command
- run: python -m pip install donespec
- run: donespec validate done.json
Agent integrations
DoneSpec includes a practical multi-agent workflow for Codex, Claude Code, VS Code tasks, Git hooks, and CI enforcement.
Read the full guide:
8. Philosophy
DoneSpec is intentionally boring.
It should feel closer to ESLint, Prettier, pytest, or package.json scripts than to an AI platform.
Principles:
- local first,
- deterministic only,
- minimal dependencies,
- no accounts,
- no dashboard in the MVP,
- no LLM calls in validation,
- composable with any agent,
- simple enough for solo developers,
- strict enough for CI.
AI agents may generate work. DoneSpec verifies completion.
9. Roadmap
DoneSpec is designed for future extension, but the MVP stays small.
Planned directions:
- more deterministic checkers,
- generated done.json templates,
- MCP server integration,
- AI agent integrations,
- VS Code extension,
- cloud dashboard,
- analytics,
- multi-agent validation.
Not in the MVP:
- authentication,
- database,
- SaaS dashboard,
- orchestration system,
- LLM dependency.
Repository layout
.
├── action.yml
├── done.json
├── done.schema.json
├── docs/
├── examples/
├── pyproject.toml
├── src/
│   └── donespec/
│       ├── cli.py
│       ├── engine.py
│       ├── loader.py
│       ├── models.py
│       ├── output.py
│       ├── schema.py
│       └── checkers/
└── tests/
Development
python -m pip install -e ".[dev]"
ruff check .
ruff format .
pytest
donespec validate done.json
Release
DoneSpec v0.2.0 is available on PyPI:
pip install donespec
GitHub release:
https://github.com/xryv/DoneSpec/releases/tag/v0.2.0
PyPI package:
https://pypi.org/project/donespec/
License
MIT