A lightweight scaffold for evaluating agentic tasks with structured checks.
Agent Harness CLI
agent-harness-cli is a thin, dependency-free CLI for agentic task checks. It does
not own domain logic. It runs user-defined check scripts, stores report JSON,
and lets agents page through reports without dumping everything at once.
Install:
uv tool install agent-harness-cli
The core shape is:
Task spec + external check commands + report store + paginated viewer
Quick Start
Run checks for a task file from any workspace:
agent-harness run-checks --task task.json --report-id sample-report
The command prints a compact summary:
PASSED 2/2 checks
report_id: sample-report
report_path: reports/sample-report.json
Next:
agent-harness view sample-report
agent-harness view sample-report --failed-only
View a report one page at a time:
agent-harness view sample-report --page 1 --page-size 5
agent-harness view sample-report --failed-only
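The paging behaviour behind --page, --page-size, and --failed-only can be pictured with a small sketch. This is not the harness's actual implementation: the function name paginate_checks, the returned keys, and the assumption that a report holds a list of check results with a passed field are all illustrative.

```python
from math import ceil


def paginate_checks(checks, page=1, page_size=5, failed_only=False):
    """Return one page of check results plus paging metadata (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Optionally keep only the checks that did not pass.
    selected = [c for c in checks if not (failed_only and c.get("passed"))]
    total_pages = max(1, ceil(len(selected) / page_size))
    start = (page - 1) * page_size
    return {
        "page": page,
        "total_pages": total_pages,
        "total_checks": len(selected),
        "checks": selected[start:start + page_size],
    }
```

Returning the page count alongside the slice lets an agent decide whether to request another page instead of loading the whole report.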
Run tests:
uv run python -m unittest discover -s tests -p "test_*.py"
Build package distributions:
uv build
Project Layout
src/agent_harness_cli/
runners/ Thin CLI implementations for run-checks and view.
skills/ Skill that teaches agents how to design check scripts.
schemas/ JSON schemas for tasks, check results, and reports.
tests/ Self-contained CLI tests.
Check Command Contract
Each task check declares a command:
{
  "name": "todo_markers",
  "command": ["{python}", "checks/todo_markers.py"],
  "severity": "warning",
  "config": {
    "patterns": ["TODO", "TBD"]
  }
}
The harness writes an input JSON file and appends --input <path> unless the
command already contains {input}. It also replaces {python} with the current
Python interpreter.
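The placeholder rules above can be sketched as follows. The function name expand_command and its parameters are hypothetical, and the real harness may handle edge cases differently; the sketch only mirrors the two rules stated in this section.

```python
import sys


def expand_command(command, input_path, python=sys.executable):
    """Substitute {python} and {input} in a check command (illustrative only)."""
    # Detect whether the author placed {input} explicitly anywhere in the command.
    has_input = any("{input}" in part for part in command)
    expanded = [
        part.replace("{python}", python).replace("{input}", str(input_path))
        for part in command
    ]
    # Without an explicit {input}, fall back to appending --input <path>.
    if not has_input:
        expanded += ["--input", str(input_path)]
    return expanded
```

For the todo_markers declaration shown earlier, this would yield the interpreter path, the script path, and a trailing --input argument.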
The input contains:
{
  "root": "project root provided by harness",
  "task_path": "task.json",
  "task": {},
  "check": {}
}
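On the check-script side, reading this input might look like the sketch below. It assumes that the check object mirrors the check declaration from the task file, including its config block; that mapping is an assumption, and load_check_input is a hypothetical helper name.

```python
import argparse
import json
from pathlib import Path


def load_check_input(argv=None):
    """Parse --input and return (root, config) for a check script (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Example check script")
    parser.add_argument("--input", required=True, help="Path to the harness input JSON")
    args = parser.parse_args(argv)
    payload = json.loads(Path(args.input).read_text(encoding="utf-8"))
    root = Path(payload["root"])
    # Assumption: "check" carries the declaration from the task file, including "config".
    config = payload.get("check", {}).get("config", {})
    return root, config
```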
Check Result Contract
Every check returns this shape:
{
  "check": "required_artifacts",
  "passed": true,
  "score": 1.0,
  "severity": "error",
  "summary": "All required artifacts exist.",
  "reasons": []
}
Failed checks should include specific reasons with evidence and a suggested fix.
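A check script could assemble its result along these lines. Several details are assumptions rather than documented behaviour: that the result is printed to stdout as JSON, that each reason is an object with message, evidence, and suggested_fix keys, and that the score is binary. The helper name build_marker_result and the shape of findings are hypothetical.

```python
import json


def build_marker_result(name, severity, findings):
    """Build a result in the documented shape from a list of marker findings."""
    reasons = []
    for finding in findings:
        location = f"{finding['path']}:{finding['line']}"
        # Assumed reason layout: what failed, where, and how to fix it.
        reasons.append({
            "message": f"Marker {finding['pattern']} found",
            "evidence": location,
            "suggested_fix": "Resolve the marker or remove it.",
        })
    passed = not findings
    return {
        "check": name,
        "passed": passed,
        "score": 1.0 if passed else 0.0,  # assumption: binary scoring
        "severity": severity,
        "summary": "No markers found." if passed else f"Found {len(findings)} marker(s).",
        "reasons": reasons,
    }


if __name__ == "__main__":
    example = [{"pattern": "TODO", "path": "src/app.py", "line": 12}]
    # Assumption: the harness reads the result JSON from stdout.
    print(json.dumps(build_marker_result("todo_markers", "warning", example), indent=2))
```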
Design Notes
- The PyPI distribution is agent-harness-cli; the installed command is agent-harness.
- Deterministic checks should be preferred over LLM judges.
- LLM judge checks can import agent_harness_cli.llm.codex_judge, which calls local codex exec and supports checklist-based judging.
- Warnings guide an agent without blocking the run.
- Error-level failures block the run.
- Domain logic belongs in user-owned check scripts.
- Use skills/harness-check-designer/SKILL.md when asking an agent to design a new check.
- JSON is used for task and report files to avoid parser dependencies.
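The difference between warnings and error-level failures can be illustrated with a short sketch over check results in the documented shape. The function name split_failures is hypothetical, and the harness's own logic may differ.

```python
def split_failures(results):
    """Separate failed checks into blocking (error) and advisory (warning) groups."""
    failed = [r for r in results if not r["passed"]]
    blocking = [r for r in failed if r["severity"] == "error"]
    advisory = [r for r in failed if r["severity"] == "warning"]
    return blocking, advisory


results = [
    {"check": "required_artifacts", "passed": False, "severity": "error"},
    {"check": "todo_markers", "passed": False, "severity": "warning"},
    {"check": "lint", "passed": True, "severity": "error"},
]
blocking, advisory = split_failures(results)
# A run is blocked only when at least one error-level check failed.
run_blocked = bool(blocking)
```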
Publishing
The GitHub workflow at .github/workflows/publish.yml publishes on tags that
match v*.*.*. The tag version must match [project].version without the
leading v.
git tag v0.1.0
git push origin v0.1.0
Publishing uses PyPI Trusted Publishing with the pypi GitHub environment.
Configure the PyPI project agent-harness-cli to trust this repository and the
workflow file .github/workflows/publish.yml before pushing a release tag.
File details
Details for the file agent_harness_cli-0.1.0.tar.gz.
File metadata
- Download URL: agent_harness_cli-0.1.0.tar.gz
- Upload date:
- Size: 13.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1c16bd5a9de9e736f0891c26f3d0baca801cbf4e4d68876b1500fa6ec5aa54d2 |
| MD5 | f9373366d366d2582cf12b4495a56131 |
| BLAKE2b-256 | f15b826b321ff0afdbcc425c4ecab113512368ecf8495ca217b111371e3961f9 |
Provenance
The following attestation bundles were made for agent_harness_cli-0.1.0.tar.gz:
Publisher: publish.yml on Biaoo/agent-harness-cli
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_harness_cli-0.1.0.tar.gz
- Subject digest: 1c16bd5a9de9e736f0891c26f3d0baca801cbf4e4d68876b1500fa6ec5aa54d2
- Sigstore transparency entry: 1380489875
- Sigstore integration time:
- Permalink: Biaoo/agent-harness-cli@4502ab83f1bbcbb720984a3650f11eaf9cc4805c
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Biaoo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4502ab83f1bbcbb720984a3650f11eaf9cc4805c
- Trigger Event: push
File details
Details for the file agent_harness_cli-0.1.0-py3-none-any.whl.
File metadata
- Download URL: agent_harness_cli-0.1.0-py3-none-any.whl
- Upload date:
- Size: 14.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6451c6171a2be9f5ab8fcca146086fe7e9aaf6d559c69f179d681157722c382f |
| MD5 | e2fb8f45f954361d7630cf1b6916dfc1 |
| BLAKE2b-256 | 349e317e2b6de93794d26c3fc86bd24092d7d4a5c7e3e99a4b198dfb9015b038 |
Provenance
The following attestation bundles were made for agent_harness_cli-0.1.0-py3-none-any.whl:
Publisher: publish.yml on Biaoo/agent-harness-cli
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_harness_cli-0.1.0-py3-none-any.whl
- Subject digest: 6451c6171a2be9f5ab8fcca146086fe7e9aaf6d559c69f179d681157722c382f
- Sigstore transparency entry: 1380489971
- Sigstore integration time:
- Permalink: Biaoo/agent-harness-cli@4502ab83f1bbcbb720984a3650f11eaf9cc4805c
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Biaoo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4502ab83f1bbcbb720984a3650f11eaf9cc4805c
- Trigger Event: push