Deterministic quality gates for AI-assisted development
🦎 Agent Harness
Enforce. Enforce. Enforce.
Stop the slop. AI agents aren't sloppy; the human toolchains they inherited are.
44 rules · 5 stacks · <500ms · Zero config
Quick Start · The Problem · Stacks · For AI Agents · Contributing
PyPI: Install as agentic-harness. The agent-harness name is reserved by an unrelated abandoned package (transfer pending). The CLI command is agent-harness.
Agent: writes Dockerfile, commits
↓
agent-harness lint → FAIL: no USER directive, no HEALTHCHECK, COPY . . before pip install
↓
Agent: reads errors, fixes all three, re-lints
↓
agent-harness lint → 10 passed, 0 failed (476ms)
↓
Agent: commits clean code. No human involved.
Dockerfiles without USER directives. Compose files without healthchecks. Secrets hardcoded in ENV. Dependency caches busted on every build. Coverage gates that don't exist. Formatters that never run.
This isn't the AI's fault. These are human-built toolchains with decades of accumulated slop: implicit defaults, silent failures, missing guardrails. Humans learned to work around them through tribal knowledge. AI agents don't have tribal knowledge. They just hit the wall.
Agent Harness is the wall that talks back. One CLI, deterministic feedback, actionable error messages. Every rule exists because an AI agent made that exact mistake, and will keep making it until something stops it.
$ agent-harness lint
PASS conftest-gitignore (39ms)
PASS conftest-json (0ms)
PASS yamllint (117ms)
PASS file-length (0ms)
PASS ruff:format (50ms)
PASS ruff:check (118ms)
PASS ty (109ms)
PASS conftest-python (43ms)
8 passed, 0 failed (476ms)
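Because the output is line-oriented, a script or agent can consume it without guessing. The sketch below parses the sample output above; the `summarize` helper is hypothetical, and it assumes FAIL lines use the same `STATUS name (Nms)` shape as the PASS lines shown.

```python
import re

# Matches lines such as "PASS ruff:check (118ms)".
# Assumption: FAIL lines follow the same shape as the PASS lines in the sample.
LINE_RE = re.compile(r"^(PASS|FAIL)\s+(\S+)\s+\((\d+)ms\)$")


def summarize(output: str) -> dict:
    """Count passed/failed checks and sum their reported durations."""
    passed = failed = total_ms = 0
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips the summary line and anything unrecognized
        status, _name, ms = match.groups()
        passed += status == "PASS"
        failed += status == "FAIL"
        total_ms += int(ms)
    return {"passed": passed, "failed": failed, "total_ms": total_ms}


SAMPLE = """\
PASS conftest-gitignore (39ms)
PASS conftest-json (0ms)
PASS yamllint (117ms)
PASS file-length (0ms)
PASS ruff:format (50ms)
PASS ruff:check (118ms)
PASS ty (109ms)
PASS conftest-python (43ms)
8 passed, 0 failed (476ms)
"""

print(summarize(SAMPLE))
```

The per-check timings in the sample add up to the 476ms reported on the summary line.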
The Problem
AI agents are only as good as the feedback they get. Human toolchains give terrible feedback, or none at all.
| The slop | What the agent does | What the harness does |
|---|---|---|
| .gitignore is "optional" | Commits .env with real secrets | Policy catches it before commit |
| Dockerfile layer order is tribal knowledge | COPY . . before pip install → 5-minute rebuilds | Layer ordering policy enforces correct order |
| pytest.mark.untit silently selects nothing | Thinks tests pass (zero ran) | Strict markers policy catches the typo |
| Compose healthcheck is "recommended" | Deploy "succeeds," service is dead | Healthcheck policy fails the lint |
| Formatters exist but nobody runs them | Reformats differently each iteration | Formatter runs on every commit, enforcing consistency |
An agent can't act on "consider using healthchecks." It can act on "FAIL: services.api missing healthcheck - add healthcheck: block."
That's the difference between documentation and a harness. Documentation hopes. A harness enforces.
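To make the contrast concrete, here is a minimal sketch of a check that produces that kind of message. It is not the project's implementation (the real policies are written in Rego and run through conftest); `check_healthchecks` is a hypothetical Python stand-in that works on an already-parsed compose file.

```python
def check_healthchecks(compose: dict) -> list[str]:
    """Return one actionable failure message per service lacking a healthcheck."""
    failures = []
    for name, service in compose.get("services", {}).items():
        if "healthcheck" not in service:
            failures.append(
                f"FAIL: services.{name} missing healthcheck - add healthcheck: block"
            )
    return failures


# Hypothetical compose content, already parsed into a dict.
compose = {
    "services": {
        "api": {"image": "example/api:1.0"},
        "db": {
            "image": "postgres:16",
            "healthcheck": {"test": ["CMD", "pg_isready"]},
        },
    }
}

for message in check_healthchecks(compose):
    print(message)
```

Each message names the offending key and the fix, so the agent has something to act on.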
We'll build AI-first frameworks eventually. Until then, agents have to work with what humans built. Agent Harness makes that survivable.
Quick Start
# Install
uv tool install agentic-harness # or: pip install agentic-harness
# Detect stacks + subprojects
agent-harness detect
# Set up configs and Makefile
agent-harness init
# Run all checks
agent-harness lint
# Auto-fix what's fixable, then lint
agent-harness fix
Stacks
Agent Harness auto-detects your project and activates the right checks. Zero config required.
Python
Detected by pyproject.toml, setup.py, requirements.txt
| Tool | What it checks |
|---|---|
| ruff | Linting + formatting (fastest Python linter) |
| ty | Type checking |
| conftest | pytest strict-markers, coverage >=90%, verbose output, ruff config |
| file-length | No file exceeds 500 lines |
JavaScript / TypeScript
Detected by package.json, tsconfig.json
| Tool | What it checks |
|---|---|
| Biome | Linting + formatting (single Rust-based tool, ~20x faster than ESLint) |
| Framework type checker | astro check, next lint, or tsc --noEmit (auto-detected) |
| conftest | engines field, type: "module", no wildcard * versions |
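The three package.json checks in the last row are simple enough to sketch. This is an illustration only, not the Rego policy itself; `check_package_json` and its message wording are hypothetical.

```python
import json


def check_package_json(text: str) -> list[str]:
    """Check for an engines field, ESM module type, and wildcard versions."""
    pkg = json.loads(text)
    failures = []
    if "engines" not in pkg:
        failures.append("FAIL: package.json missing engines field")
    if pkg.get("type") != "module":
        failures.append('FAIL: package.json should set "type": "module"')
    for section in ("dependencies", "devDependencies"):
        for name, version in pkg.get(section, {}).items():
            if version.strip() == "*":
                failures.append(f"FAIL: {section}.{name} uses wildcard version *")
    return failures


# Hypothetical manifest with two violations.
SAMPLE = json.dumps(
    {
        "name": "demo",
        "type": "module",
        "dependencies": {"left-pad": "*", "astro": "^4.0.0"},
    }
)

for message in check_package_json(SAMPLE):
    print(message)
```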
Docker
Detected by Dockerfile, docker-compose*.yml
| Tool | What it checks |
|---|---|
| hadolint | Dockerfile best practices (DL/SC rules) |
| conftest | Layer ordering, cache mounts, USER directive, HEALTHCHECK, secrets in ENV/ARG, base image pinning (discovers all Dockerfiles in tree) |
| conftest (compose) | Healthchecks, restart policies, image pinning, port binding, $$ escaping, no bind mounts, no inline configs |
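The three failures from the opening example (no USER directive, no HEALTHCHECK, COPY . . before pip install) can be approximated with plain string checks. This sketch is a simplification written for illustration: the project uses Rego policies, and `check_dockerfile` is a hypothetical helper that ignores line continuations and multi-stage builds.

```python
def check_dockerfile(text: str) -> list[str]:
    """Apply three of the Dockerfile checks named above to raw Dockerfile text."""
    stripped = [line.strip() for line in text.splitlines()]
    instructions = [line for line in stripped if line and not line.startswith("#")]
    upper = [line.upper() for line in instructions]

    failures = []
    if not any(line.startswith("USER ") for line in upper):
        failures.append("FAIL: no USER directive")
    if not any(line.startswith("HEALTHCHECK ") for line in upper):
        failures.append("FAIL: no HEALTHCHECK")

    # Index of the first "COPY . ." and of the first pip install, if any.
    copy_all = next(
        (i for i, line in enumerate(upper) if line.split() == ["COPY", ".", "."]), None
    )
    install = next(
        (i for i, line in enumerate(instructions) if "pip install" in line), None
    )
    if copy_all is not None and install is not None and copy_all < install:
        failures.append("FAIL: COPY . . before pip install")
    return failures


BAD = """\
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
"""

for message in check_dockerfile(BAD):
    print(message)
```

Copying only requirements.txt before the install step keeps the dependency layer cached when application code changes, which is what the layer-ordering rule protects.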
Dokploy
Detected by dokploy-network reference in compose files
| Tool | What it checks |
|---|---|
| conftest | traefik.enable=true on labeled services, dokploy-network for routed services |
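The "Detected by" lines above amount to marker-file matching. A minimal sketch, assuming detection only looks at the project root (the real `agent-harness detect` also reports subprojects); `MARKERS` and `detect_stacks` are hypothetical names.

```python
import tempfile
from pathlib import Path

# Marker files per stack, copied from the "Detected by" lines above.
MARKERS = {
    "python": ["pyproject.toml", "setup.py", "requirements.txt"],
    "javascript": ["package.json", "tsconfig.json"],
    "docker": ["Dockerfile", "docker-compose*.yml"],
}


def detect_stacks(root: Path) -> list[str]:
    """Return the stacks whose marker files exist directly under root."""
    stacks = [
        name
        for name, patterns in MARKERS.items()
        if any(any(root.glob(pattern)) for pattern in patterns)
    ]
    # Dokploy is detected by a dokploy-network reference in a compose file.
    compose_files = root.glob("docker-compose*.yml")
    if any("dokploy-network" in f.read_text() for f in compose_files):
        stacks.append("dokploy")
    stacks.append("universal")  # always active
    return stacks


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "pyproject.toml").write_text("")
    (project / "docker-compose.prod.yml").write_text(
        "networks:\n  dokploy-network:\n    external: true\n"
    )
    print(detect_stacks(project))
```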
Universal
Always active on every project.
| Tool | What it checks |
|---|---|
| yamllint | YAML syntax, duplicate keys, truthy values |
| conftest | .gitignore completeness (stack-aware), JSON validity |
| file-length | Extension-aware: .py/.ts 500 lines, .astro/.vue 800 lines |
Configuration
Zero config by default: stacks are auto-detected. Override with .agent-harness.yml:
stacks:
  - python
  - docker
  - javascript
exclude:
  - _archive/
  - vendor/
python:
  coverage_threshold: 95
  line_length: 140
  max_file_lines: 500
javascript:
  coverage_threshold: 80
docker:
  own_image_prefix: "ghcr.io/myorg/"
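One way to think about overrides is as a recursive merge over built-in defaults. How Agent Harness actually combines them is not documented here, so treat this as a sketch under that assumption; `DEFAULTS` holds only the two values stated elsewhere in this README (90% coverage, 500 lines), and `merge_config` is a hypothetical helper.

```python
# Assumed defaults: only values this README states (coverage >=90%, 500 lines).
DEFAULTS = {"python": {"coverage_threshold": 90, "max_file_lines": 500}}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return defaults with overrides applied, merging nested sections."""
    merged = {k: (dict(v) if isinstance(v, dict) else v) for k, v in defaults.items()}
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# The python section of the example above, already parsed into a dict.
overrides = {"python": {"coverage_threshold": 95, "line_length": 140}}
print(merge_config(DEFAULTS, overrides))
```

Keys that are not overridden keep their default, so a config file only needs to list what differs.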
Conftest Exceptions
Skip individual policies per file when legitimate:
docker:
  conftest_skip:
    scripts/autonomy/Dockerfile:
      - dockerfile.user  # runs as root intentionally
      - dockerfile.healthcheck  # not a service
See SKILL.md for the full list of exception IDs.
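Conceptually, an exception is a lookup keyed by file path and policy ID. A small sketch, assuming the YAML above has been parsed into a dict and that paths are matched exactly; `is_skipped` is a hypothetical helper, not part of the CLI.

```python
def is_skipped(config: dict, file_path: str, policy_id: str) -> bool:
    """True if policy_id is listed under conftest_skip for this exact file path."""
    skips = config.get("docker", {}).get("conftest_skip", {})
    return policy_id in skips.get(file_path, [])


# The example exception config above, as a parsed dict.
config = {
    "docker": {
        "conftest_skip": {
            "scripts/autonomy/Dockerfile": [
                "dockerfile.user",
                "dockerfile.healthcheck",
            ]
        }
    }
}

print(is_skipped(config, "scripts/autonomy/Dockerfile", "dockerfile.user"))
print(is_skipped(config, "Dockerfile", "dockerfile.user"))
```

Scoping the skip to one file keeps the policy active everywhere else in the tree.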
Commands
| Command | Description |
|---|---|
| agent-harness detect | Show detected stacks and subprojects |
| agent-harness init | Scaffold configs, Makefile, show tool availability |
| agent-harness init --apply | Apply auto-fixes and create missing config files |
| agent-harness lint | Run all checks; exits non-zero on failure |
| agent-harness fix | Auto-fix (ruff, biome), then lint |
| agent-harness security-audit | Scan working dir for vulnerable deps + leaked secrets |
| agent-harness security-audit-history | Deep scan full git history for leaked secrets |
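Because lint exits non-zero on failure, a wrapper can gate on the exit code alone. The sketch below is one possible driver, not something shipped with the tool: `lint_until_clean` and the `fix` callback are hypothetical, and the demo uses stand-in commands so it runs without agent-harness installed.

```python
import subprocess
import sys
from typing import Callable


def lint_until_clean(
    lint_cmd: list[str], fix: Callable[[str], None], max_attempts: int = 3
) -> bool:
    """Run lint_cmd; on a non-zero exit, hand the output to fix() and retry."""
    for _ in range(max_attempts):
        result = subprocess.run(lint_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True  # all checks passed: safe to commit
        fix(result.stdout + result.stderr)
    return False


# Real usage would look like: lint_until_clean(["agent-harness", "lint"], fix=...)
# Stand-in command that always fails, so the demo is self-contained.
failing = [sys.executable, "-c", "import sys; print('FAIL: demo'); sys.exit(1)"]
reports: list[str] = []
print(lint_until_clean(failing, fix=reports.append, max_attempts=2))
print(len(reports))
```

Capping the attempts keeps a rule the agent cannot satisfy from looping forever.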
For AI Agents
The feedback loop
Agent writes code
↓
agent-harness lint
↓
├─ PASS → commit
└─ FAIL → agent reads error → agent fixes → re-lint
Every error message is actionable. Every Rego policy has a structured comment:
# WHAT: What this rule checks
# WHY: Why it matters for AI agents
# WITHOUT IT: What breaks in practice
# FIX: How to resolve the violation
When a user challenges a rule
- Read the WHY block from the .rego file
- Explain the risk to the user
- If they still want to suppress, that's their call
The WHY exists because agents make these specific mistakes. It's the agent's argument.
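Reading the WHY block can be done mechanically, since the header fields have fixed names. A minimal sketch, assuming each field fits on one comment line; `parse_policy_header` is a hypothetical helper and the sample policy text is invented for illustration.

```python
FIELDS = ("WHAT", "WHY", "WITHOUT IT", "FIX")


def parse_policy_header(rego_source: str) -> dict[str, str]:
    """Extract the WHAT/WHY/WITHOUT IT/FIX comment fields from Rego source."""
    header: dict[str, str] = {}
    for line in rego_source.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#"):
            continue  # only comment lines carry header fields
        body = stripped.lstrip("#").strip()
        for field in FIELDS:
            prefix = f"{field}:"
            if body.startswith(prefix):
                header[field] = body[len(prefix):].strip()
    return header


# Invented policy header, following the pattern described above.
SAMPLE_POLICY = """\
# WHAT: Every compose service defines a healthcheck
# WHY: An agent cannot tell a dead service from a healthy one
# WITHOUT IT: Deploy "succeeds," service is dead
# FIX: Add a healthcheck: block to the service
package compose.healthcheck
"""

print(parse_policy_header(SAMPLE_POLICY)["WHY"])
```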
Claude Code plugin
Agent Harness ships as a Claude Code plugin with guidance docs:
# Load as plugin
claude --plugin-dir /path/to/agent-harness
# Or add to your shell alias
alias c="claude --plugin-dir ~/path/to/agent-harness"
The plugin includes:
- Skill: when to use, workflow, stack reference
- Docker guidance: healthcheck recipes, migration patterns, config strategies
- Python guidance: why each pyproject.toml knob matters
Architecture
┌─────────────────────────────────────┐
│ Framework (Django, FastAPI, Next.js)│ ← future
├─────────────────────────────────────┤
│ Stack (Python, JS/TS, Go)           │
├─────────────────────────────────────┤
│ Infrastructure (Docker, Dokploy)    │
├─────────────────────────────────────┤
│ Universal                           │ ← always active
└─────────────────────────────────────┘
Each layer composes on top of the previous. Adding a new stack = creating a new directory. Each check is a self-contained file with its own docstring, test, and single responsibility.
Tool Stack
Agent Harness orchestrates external tools; it doesn't embed them:
| Tool | Purpose | Fallback |
|---|---|---|
| conftest | Rego policy engine | Required |
| hadolint | Dockerfile linting | Required for Docker |
| ruff | Python linting + formatting | uv run fallback |
| ty | Python type checking | uv run fallback |
| Biome | JS/TS linting + formatting | npx fallback |
| yamllint | YAML validation | uv run fallback |
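The fallback column suggests a simple resolution order: use the tool if it is on PATH, otherwise go through a runner. The sketch below illustrates that idea only. The exact fallback invocations are assumptions (in particular `npx @biomejs/biome` for Biome), and `resolve_command` is a hypothetical helper.

```python
import shutil
from typing import Callable, Optional

# Assumed fallback invocations; the table above only names the runner.
FALLBACKS = {
    "ruff": ["uv", "run", "ruff"],
    "ty": ["uv", "run", "ty"],
    "yamllint": ["uv", "run", "yamllint"],
    "biome": ["npx", "@biomejs/biome"],
}


def resolve_command(
    tool: str, which: Callable[[str], Optional[str]] = shutil.which
) -> list[str]:
    """Prefer the tool on PATH, then its fallback runner, else fail loudly."""
    if which(tool):
        return [tool]
    if tool in FALLBACKS:
        return list(FALLBACKS[tool])
    # conftest and hadolint have no fallback in the table above.
    raise FileNotFoundError(f"{tool} is required but was not found on PATH")


# Simulate a machine where nothing is installed.
print(resolve_command("ruff", which=lambda _tool: None))
```

Passing `which` as a parameter keeps the lookup testable without touching the real PATH.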
Requirements
- Python 3.12+
- conftest (required)
- hadolint (for Docker projects)
- Other tools auto-fallback via uv run or npx
Status
Actively developed. See PLANS.md for roadmap.
Current: 44 Rego deny rules, 5 stacks (Python, JavaScript, Docker, Dokploy, Universal), 201 Python tests, 109 Rego tests.
Contributing
See CONTRIBUTING.md.
Every Rego policy follows the WHAT/WHY/WITHOUT IT/FIX pattern. Every Python check has a self-documenting docstring. Adding a rule? Write the WHY first: if you can't articulate why an AI agent needs this specific check, it doesn't belong here.
License
Apache 2.0; see LICENSE.
🦎 Cold-blooded enforcement since mid 2025.
Built by Denis Tomilin at Yose Labs