Agent Harness

Deterministic quality gates for AI-assisted development

44 rules · 5 stacks · <500ms · Zero config

Quick Start · Stacks · For AI Agents · Contributing

Note: The PyPI package is agentic-harness (the agent-harness name is taken by an unrelated abandoned project — transfer pending). The CLI command is still agent-harness.


AI agents generate code in loops. Each iteration can introduce misconfigurations, missing healthchecks, bad Dockerfile layering, or secrets in git — and the agent won't know unless something tells it.

Agent Harness is that something: a single CLI that detects your project stack, runs every applicable quality check, and gives agents error messages they can act on without human intervention.

$ agent-harness lint

  PASS  conftest-gitignore (39ms)
  PASS  conftest-json (0ms)
  PASS  yamllint (117ms)
  PASS  file-length (0ms)
  PASS  ruff:format (50ms)
  PASS  ruff:check (118ms)
  PASS  ty (109ms)
  PASS  conftest-python (43ms)

8 passed, 0 failed (476ms)

Why

Agents need tight feedback loops. The tighter and more deterministic the feedback, the more effective the agent.

| Without harness | With harness |
| --- | --- |
| Agent commits .env with real secrets | .gitignore policy catches it before commit |
| Dockerfile rebuilds all deps every push | Layer ordering policy enforces correct COPY order |
| pytest.mark.untit silently selects nothing | Strict markers policy catches the typo |
| Compose healthcheck missing, deploy "succeeds" | Healthcheck policy fails the lint |
| Agent reformats code differently each iteration | Formatter runs on every check, enforcing consistency |

An agent can't act on "consider using healthchecks." It can act on "FAIL: services.api missing healthcheck — add healthcheck: block." That's the difference between documentation and a harness.
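As a rough illustration of what "actionable" means here, the sketch below builds a failure message that names the target, the problem, and the fix. The `Violation` class and the `missing_healthchecks` helper are hypothetical names invented for this example, not part of Agent Harness, whose real policies are written in Rego.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """One failed rule, carrying everything an agent needs to act on it."""

    target: str   # what is wrong, e.g. "services.api"
    problem: str  # the observed defect
    fix: str      # the concrete change that resolves it

    def render(self) -> str:
        return f"FAIL: {self.target} {self.problem} - {self.fix}"


def missing_healthchecks(services: dict[str, dict]) -> list[Violation]:
    """Flag every compose service that has no healthcheck block (hypothetical check)."""
    return [
        Violation(f"services.{name}", "missing healthcheck", "add healthcheck: block")
        for name, spec in sorted(services.items())
        if "healthcheck" not in spec
    ]


if __name__ == "__main__":
    compose_services = {"api": {"image": "ghcr.io/myorg/api:1.0"}}
    for violation in missing_healthchecks(compose_services):
        print(violation.render())
```

Keeping the fix inside the message is the design point: the agent does not need to look anything up before its next edit.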

Quick Start

# Install
uv tool install agentic-harness   # or: pip install agentic-harness

# Detect stacks + subprojects
agent-harness detect

# Set up configs and Makefile
agent-harness init

# Run all checks
agent-harness lint

# Auto-fix what's fixable, then lint
agent-harness fix

Stacks

Agent Harness auto-detects your project and activates the right checks. Zero config required.
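Marker-file detection of this kind can be sketched in a few lines. The version below is an assumption about the approach, not the project's actual implementation: it uses the "Detected by" markers listed for each stack in this section, looks only at the top-level directory, and ignores subprojects. `STACK_MARKERS` and `detect_stacks` are hypothetical names.

```python
from pathlib import Path

# Marker files copied from the "Detected by" lines in this section.
STACK_MARKERS = {
    "python": ("pyproject.toml", "setup.py", "requirements.txt"),
    "javascript": ("package.json", "tsconfig.json"),
    "docker": ("Dockerfile", "docker-compose*.yml"),
}


def detect_stacks(root: Path) -> list[str]:
    """Return the stacks whose marker files exist directly under root."""
    found = ["universal"]  # the universal checks are always active
    for stack, patterns in STACK_MARKERS.items():
        if any(any(root.glob(pattern)) for pattern in patterns):
            found.append(stack)
    # Dokploy is detected by file content rather than by file name.
    for compose in sorted(root.glob("docker-compose*.yml")):
        text = compose.read_text(encoding="utf-8", errors="ignore")
        if "dokploy-network" in text:
            found.append("dokploy")
            break
    return found


if __name__ == "__main__":
    print(detect_stacks(Path(".")))
```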

Python

Detected by pyproject.toml, setup.py, requirements.txt

| Tool | What it checks |
| --- | --- |
| ruff | Linting + formatting (fastest Python linter) |
| ty | Type checking |
| conftest | pytest strict-markers, coverage >=90%, verbose output, ruff config |
| file-length | No file exceeds 500 lines |

JavaScript / TypeScript

Detected by package.json, tsconfig.json

| Tool | What it checks |
| --- | --- |
| Biome | Linting + formatting (single Rust-based tool, ~20x faster than ESLint) |
| Framework type checker | astro check, next lint, or tsc --noEmit (auto-detected) |
| conftest | engines field, type: "module", no wildcard * versions |

Docker

Detected by Dockerfile, docker-compose*.yml

| Tool | What it checks |
| --- | --- |
| hadolint | Dockerfile best practices (DL/SC rules) |
| conftest | Layer ordering, cache mounts, USER directive, HEALTHCHECK, secrets in ENV/ARG, base image pinning (discovers all Dockerfiles in tree) |
| conftest (compose) | Healthchecks, restart policies, image pinning, port binding, $$ escaping, no bind mounts, no inline configs |

Dokploy

Detected by dokploy-network reference in compose files

| Tool | What it checks |
| --- | --- |
| conftest | traefik.enable=true on labeled services, dokploy-network for routed services |

Universal

Always active on every project.

| Tool | What it checks |
| --- | --- |
| yamllint | YAML syntax, duplicate keys, truthy values |
| conftest | .gitignore completeness (stack-aware), JSON validity |
| file-length | Extension-aware: .py/.ts 500 lines, .astro/.vue 800 lines |
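The extension-aware file-length rule is simple enough to sketch directly. This is a minimal illustration using the limits listed above, not the project's own check: the `MAX_LINES` table, the `too_long` function, and the message wording are assumptions, and extensions without a listed limit are skipped.

```python
from pathlib import Path

# Limits from the file-length row above; other extensions are skipped in this sketch.
MAX_LINES = {".py": 500, ".ts": 500, ".astro": 800, ".vue": 800}


def too_long(root: Path) -> list[str]:
    """Return one actionable message per file that exceeds its line limit."""
    messages = []
    for path in sorted(root.rglob("*")):
        limit = MAX_LINES.get(path.suffix)
        if limit is None or not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="ignore") as handle:
            count = sum(1 for _ in handle)
        if count > limit:
            relative = path.relative_to(root).as_posix()
            messages.append(
                f"FAIL: {relative} has {count} lines (limit {limit}) - split the file"
            )
    return messages


if __name__ == "__main__":
    for message in too_long(Path(".")):
        print(message)
```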

Configuration

Zero config by default — stacks are auto-detected. Override with .agent-harness.yml:

stacks:
  - python
  - docker
  - javascript

exclude:
  - _archive/
  - vendor/

python:
  coverage_threshold: 95
  line_length: 140
  max_file_lines: 500

javascript:
  coverage_threshold: 80

docker:
  own_image_prefix: "ghcr.io/myorg/"
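One plausible way to combine built-in defaults with such an override file is a recursive merge, sketched below. It assumes the YAML has already been parsed into a dictionary. `DEFAULTS` and `merge_config` are hypothetical names, and the default values are taken from the Python stack table (coverage >=90%, 500-line files) rather than from the project's source.

```python
from copy import deepcopy

# Assumed defaults, based on the thresholds quoted in the Python stack table.
DEFAULTS = {"python": {"coverage_threshold": 90, "max_file_lines": 500}}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay overrides on defaults without mutating either input."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


if __name__ == "__main__":
    # Equivalent to the python: and exclude: keys of the example file above.
    overrides = {
        "python": {"coverage_threshold": 95, "line_length": 140},
        "exclude": ["_archive/", "vendor/"],
    }
    print(merge_config(DEFAULTS, overrides))
```

Merging per key means an override file only needs to list the values it changes.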

Conftest Exceptions

Skip individual policies per file when legitimate:

docker:
  conftest_skip:
    scripts/autonomy/Dockerfile:
      - dockerfile.user        # runs as root intentionally
      - dockerfile.healthcheck # not a service

See SKILL.md for the full list of exception IDs.
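Conceptually, a per-file skip list filters the set of policies before they run. The sketch below shows that idea under stated assumptions: `active_policies` is a hypothetical helper, and `dockerfile.layer_order` is an invented ID used only for illustration (the other two IDs come from the example above).

```python
def active_policies(
    path: str,
    policies: list[str],
    conftest_skip: dict[str, list[str]],
) -> list[str]:
    """Return the policies that still apply to path after per-file skips."""
    skipped = set(conftest_skip.get(path, []))
    return [policy for policy in policies if policy not in skipped]


if __name__ == "__main__":
    skip = {
        "scripts/autonomy/Dockerfile": ["dockerfile.user", "dockerfile.healthcheck"],
    }
    # "dockerfile.layer_order" is a made-up ID for this example.
    all_policies = ["dockerfile.user", "dockerfile.healthcheck", "dockerfile.layer_order"]
    print(active_policies("scripts/autonomy/Dockerfile", all_policies, skip))
    print(active_policies("Dockerfile", all_policies, skip))
```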

Commands

| Command | Description |
| --- | --- |
| agent-harness detect | Show detected stacks and subprojects |
| agent-harness init | Scaffold configs, Makefile, show tool availability |
| agent-harness init --apply | Apply auto-fixes and create missing config files |
| agent-harness lint | Run all checks; exits non-zero on failure |
| agent-harness fix | Auto-fix (ruff, biome), then lint |
| agent-harness security-audit | Scan working dir for vulnerable deps + leaked secrets |
| agent-harness security-audit-history | Deep scan full git history for leaked secrets |

For AI Agents

The feedback loop

Agent writes code
       ↓
agent-harness lint
       ↓
  ┌─ PASS → commit
  └─ FAIL → agent reads error → agent fixes → re-lint
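An agent runtime could drive the loop above with a small wrapper around the CLI. The sketch relies only on the documented behavior that `agent-harness lint` exits non-zero on failure. The names `run_lint` and `lint_until_green`, the `fix` callback, and the retry limit are assumptions made for this example.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_lint(cmd: Sequence[str] = ("agent-harness", "lint")) -> tuple[bool, str]:
    """Run the linter once; return (passed, combined output)."""
    proc = subprocess.run(list(cmd), capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def lint_until_green(
    fix: Callable[[str], None],
    max_rounds: int = 3,
    cmd: Sequence[str] = ("agent-harness", "lint"),
) -> bool:
    """Lint, hand failures to fix(), and re-lint, up to max_rounds lint runs."""
    for attempt in range(max_rounds):
        passed, report = run_lint(cmd)
        if passed:
            return True  # safe to commit
        if attempt < max_rounds - 1:
            fix(report)  # the agent reads the report and edits the code
    return False


if __name__ == "__main__":
    # A stand-in "fix" step that only prints the report it was given.
    ok = lint_until_green(lambda report: print(report, file=sys.stderr), max_rounds=1)
    sys.exit(0 if ok else 1)
```

Capping the number of rounds keeps a stuck agent from looping forever on a failure it cannot resolve.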

Every error message is actionable. Every Rego policy has a structured comment:

# WHAT: What this rule checks
# WHY: Why it matters for AI agents
# WITHOUT IT: What breaks in practice
# FIX: How to resolve the violation

When a user challenges a rule

  1. Read the WHY block from the .rego file
  2. Explain the risk to the user
  3. If they still want to suppress — that's their call

The WHY exists because agents make these specific mistakes. It's the agent's argument.
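Reading the WHY block programmatically is straightforward because the comment header is structured. The parser below is a minimal sketch that assumes each field starts on its own `#` line and may continue on following comment lines. `parse_rule_doc` and the sample policy text are hypothetical, not taken from the project's .rego files.

```python
import re

FIELDS = ("WHAT", "WHY", "WITHOUT IT", "FIX")
_FIELD_RE = re.compile(r"^#\s*(" + "|".join(FIELDS) + r"):\s*(.*)$")


def parse_rule_doc(rego_source: str) -> dict[str, str]:
    """Extract the WHAT/WHY/WITHOUT IT/FIX fields from a policy's comment header."""
    doc: dict[str, str] = {}
    current = None
    for raw in rego_source.splitlines():
        line = raw.strip()
        match = _FIELD_RE.match(line)
        if match:
            current = match.group(1)
            doc[current] = match.group(2).strip()
        elif current and line.startswith("#"):
            # Continuation comment line: append it to the field being read.
            extra = line.lstrip("#").strip()
            if extra:
                doc[current] = f"{doc[current]} {extra}".strip()
        else:
            current = None  # first non-comment line ends the current field
    return doc


if __name__ == "__main__":
    sample = """\
# WHAT: Every compose service defines a healthcheck
# WHY: An agent cannot see a silently broken deploy
# WITHOUT IT: The deploy "succeeds" while the app is down
# FIX: Add a healthcheck: block to the service
package compose
"""
    print(parse_rule_doc(sample)["WHY"])
```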

Claude Code plugin

Agent Harness ships as a Claude Code plugin with guidance docs:

# Load as plugin
claude --plugin-dir /path/to/agent-harness

# Or add to your shell alias
alias c="claude --plugin-dir ~/path/to/agent-harness"

The plugin includes:

  • Skill — when to use, workflow, stack reference
  • Docker guidance — healthcheck recipes, migration patterns, config strategies
  • Python guidance — why each pyproject.toml knob matters

Architecture

┌─────────────────────────────────────┐
│  Framework (Django, FastAPI, Next.js)│  ← future
├─────────────────────────────────────┤
│  Stack (Python, JS/TS, Go)          │
├─────────────────────────────────────┤
│  Infrastructure (Docker, Dokploy)   │
├─────────────────────────────────────┤
│  Universal                          │  ← always active
└─────────────────────────────────────┘

Each layer composes on top of the previous. Adding a new stack = creating a new directory. Each check is a self-contained file with its own docstring, test, and single responsibility.
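To make the "self-contained check" idea concrete, the sketch below treats each check as an independent callable and composes them in a tiny runner. It is an assumption about the shape of the design, not the project's code: the `Check` type, `run_checks`, and the layout of the FAIL line are invented here, while the PASS line and summary follow the sample output shown earlier.

```python
import time
from typing import Callable

# A check returns a list of failure messages; an empty list means it passed.
Check = Callable[[], list[str]]


def run_checks(checks: dict[str, Check]) -> tuple[int, int, list[str]]:
    """Run every check, timing each one; return (passed, failed, report lines)."""
    passed = failed = 0
    lines: list[str] = []
    started = time.perf_counter()
    for name, check in checks.items():
        t0 = time.perf_counter()
        errors = check()
        elapsed_ms = int((time.perf_counter() - t0) * 1000)
        if errors:
            failed += 1
            lines.append(f"  FAIL  {name} ({elapsed_ms}ms)")
            lines.extend(f"        {error}" for error in errors)
        else:
            passed += 1
            lines.append(f"  PASS  {name} ({elapsed_ms}ms)")
    total_ms = int((time.perf_counter() - started) * 1000)
    lines.append(f"{passed} passed, {failed} failed ({total_ms}ms)")
    return passed, failed, lines


if __name__ == "__main__":
    demo = {
        "always-ok": lambda: [],
        "always-bad": lambda: ["FAIL: example.txt is empty - add content"],
    }
    print("\n".join(run_checks(demo)[2]))
```

Because the runner knows nothing about any individual check, adding a stack only means registering more callables.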

Tool Stack

Agent Harness orchestrates external tools — it doesn't embed them:

| Tool | Purpose | Fallback |
| --- | --- | --- |
| conftest | Rego policy engine | Required |
| hadolint | Dockerfile linting | Required for Docker |
| ruff | Python linting + formatting | uv run fallback |
| ty | Python type checking | uv run fallback |
| Biome | JS/TS linting + formatting | npx fallback |
| yamllint | YAML validation | uv run fallback |

Requirements

  • Python 3.12+
  • conftest (required)
  • hadolint (for Docker projects)
  • Other tools auto-fallback via uv run or npx
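The fallback behavior in the requirements above can be pictured as a small command resolver: use the tool if it is on PATH, otherwise fall back to a launcher, otherwise fail with an actionable error. This is a sketch under assumptions: `resolve_tool` and `FALLBACKS` are hypothetical, and the exact `uv run` and `npx` argument forms the harness uses are not stated in this document.

```python
import shutil
from typing import Callable, Optional

# Fallback launchers per the Tool Stack table; the argument forms are assumptions.
FALLBACKS = {
    "ruff": ["uv", "run", "ruff"],
    "ty": ["uv", "run", "ty"],
    "yamllint": ["uv", "run", "yamllint"],
    "biome": ["npx", "@biomejs/biome"],
}


def resolve_tool(
    name: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the command prefix used to invoke a tool, preferring a local install."""
    path = which(name)
    if path:
        return [path]
    if name in FALLBACKS:
        return list(FALLBACKS[name])
    # Tools without a fallback (conftest, hadolint) must be installed.
    raise FileNotFoundError(
        f"{name} is required but not installed - install it and re-run"
    )


if __name__ == "__main__":
    for tool in ("ruff", "biome"):
        print(tool, "->", resolve_tool(tool))
```

Injecting `which` as a parameter keeps the resolver testable without touching the real PATH.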

Status

Actively developed. See PLANS.md for roadmap.

Current: 44 Rego deny rules, 5 stacks (Python, JavaScript, Docker, Dokploy, Universal), 201 Python tests, 109 Rego tests.

Contributing

See CONTRIBUTING.md.

Every Rego policy follows the WHAT/WHY/WITHOUT IT/FIX pattern. Every Python check has a self-documenting docstring. Adding a rule? Write the WHY first — if you can't articulate why an AI agent needs this specific check, it doesn't belong here.

License

MIT — see LICENSE.


Built by Denis Tomilin at Agentic Engineering
