A runtime governance layer that enforces hard behavioral bounds in autonomous agents.

Maintainers

Sarthaksahu777

Project description

🛡️ Agent-Harness

A Rigorous Engineering Kernel for Bounded Autonomous Agents

A deterministic runtime governance engine that enforces bounded, auditable execution for autonomous AI agents.

📑 Table of Contents

Overview & Motivation · Determinism vs. Industry · How It's Useful
Deployment & Quick Start · Architecture · System Behavior
Core Components · Governance Checklist · Failure Semantics
Integrations · Performance · Security & Threat Model
Testing & Validation

❓ What is Agent-Harness?

Agent-Harness is a deterministic governance runtime that acts as a non-bypassable Runtime Execution Firewall between an agent's reasoning loop and real-world execution surfaces. It translates environmental feedback (risk, effort, stagnation) into strict mathematical execution budgets. When an agent exceeds its bounds, the Harness forcibly terminates execution.

🛑 Motivation / Problem Statement

Autonomous agents typically rely on "soft" boundaries (prompt engineering, RLHF). Under sustained failure or adversarial conditions, agents can ignore these and enter unbounded loops. Agent-Harness replaces soft boundaries with a runtime execution firewall that dynamically tracks Effort, Exploration, and Risk. Once budgets deplete, permission to execute is always revoked.

💎 Why Agent-Harness?

Most AI safety solutions are probabilistic (LLM validation) or stateless (regex). Agent-Harness introduces Stateful Determinism—enforcing physical runtime bounds that an agent cannot reason its way out of.

⚖️ With vs. Without Agent-Harness

Feature / Risk	Without Agent-Harness	With Agent-Harness
Runaway Loops	Infinite token burn & cost	Deterministically halted via `STAGNATION`
Prompt Injection	Agent goal hijacking	Blocked via `GuardrailStack`
Malicious Code	Direct `os.system` execution	Intercepted & session terminated
Cost Control	Manual monitoring	Hard-capped `effort` & `risk` budgets
Auditability	Opaque/Manual logs	SHA256 hash-chained JSONL traces
Safety Logic	Hardcoded in prompts (soft)	Enforced by external kernel (hard)

🏢 How Agent-Harness Compares

Capability	Agent-Harness	NeMo Guardrails	Llama Guard	Guardrails AI	LangChain Limits
Stops a runaway agent mid-loop	✅	❌	❌	❌	⚠️ `max_iterations` only
Agent cannot bypass it	✅ External sidecar	⚠️ In-app	⚠️ Wrapper	❌ In-process	❌ In-framework
Detects stagnation (busy ≠ productive)	✅ Signal-based	❌	❌	❌	❌
Blocks dangerous tool calls before execution	✅ Guardrail stack	⚠️ Dialog-level	❌ Text only	✅ Validators	❌
Works without an LLM in the safety path	✅ Pure math	❌ Uses LLM	❌ Is an LLM	⚠️ Some regex	✅
Cryptographic audit trail	✅ SHA256 chain	❌	❌	❌	❌
Multi-agent budget coordination	✅ SharedBudgetPool	❌	❌	❌	❌
Same input → same decision, always	✅ Deterministic	❌ Probabilistic	❌ Probabilistic	⚠️ Heuristic	✅ Static
Works across any LLM vendor	✅	⚠️ NVIDIA stack	⚠️ Meta models	✅	⚠️ LangChain only

Key difference: Other systems ask "is this output safe?" — Agent-Harness asks "should this agent still be allowed to act?"

🔧 How Agent-Harness Is Useful

1. Prevent Runaway Loops — Agents frequently retry failing tools indefinitely. Agent-Harness detects stagnation (zero reward across steps) and halts execution deterministically via FailureType.STAGNATION.

2. Bound LLM API Costs — Effort budgets drain faster when the agent fails to produce real environment progress, preventing token burn from "reasoning theater" (verbose monologue with zero action).

3. Secure Tool Execution — Every tool call passes through GuardrailStack before execution, blocking dangerous code patterns (os.system(), exec(), PII leakage). The dangerous call never reaches the execution surface.

4. Multi-Agent Stability — SharedBudgetPool and CascadeDetector enforce global budget limits across swarms, preventing cascading agent spawn failures.

5. Compliance Audit Trails — SHA256 hash-chained JSONL logs provide tamper-evident, non-repudiable decision history for regulated environments (finance, healthcare, enterprise).

Agent-Harness uses deterministic budget math to guarantee bounded execution without LLM inference.
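
The stagnation rule from item 1 can be pictured as a sliding window over recent rewards. The following is a minimal sketch under stated assumptions: `StagnationWindow`, the window length, and the `eps` cutoff are hypothetical names and values chosen for illustration, not the package's API.

```python
from collections import deque


class StagnationWindow:
    """Toy stagnation detector (hypothetical, not the Agent-Harness implementation)."""

    def __init__(self, window: int = 3, eps: float = 1e-9) -> None:
        self.window = window            # assumed window length, in steps
        self.eps = eps                  # rewards at or below this count as "zero"
        self._recent = deque(maxlen=window)

    def observe(self, reward: float) -> bool:
        """Record one step's reward; return True once the whole window is zero-reward."""
        self._recent.append(reward)
        full = len(self._recent) == self.window
        return full and all(r <= self.eps for r in self._recent)


detector = StagnationWindow(window=3)
flags = [detector.observe(r) for r in [0.6, 0.0, 0.0, 0.0]]
print(flags)
```

The detector only fires when the window is full, so a single failed step early in a session does not count as stagnation.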

🚀 Deployment & Installation

Local Engine Setup

Install the lightweight engine via pip to integrate governance into your Python logic:

# Install via pip
pip install agentharnessengine

# Or install from source
git clone https://github.com/Sarthaksahu777/Agent-Harness
cd Agent-Harness
pip install -e .

Docker Proxy (Production Ready)

Deploy a non-root, multi-stage optimized governance proxy in seconds. This is the recommended method for production environments to ensure non-bypassable enforcement:

# Start the Governance Proxy + Metrics Stack
docker-compose -f deployment/docker-compose.yml up -d

# Check health
curl http://localhost:8080/health

The proxy exposes port 8080 (Production) and 8081 (Dev/Hot-reload).

Quick Start (Engine Only)

A minimal, framework-free example that runs a single governed step through the engine:

from governance.kernel import GovernanceKernel
from governance.profiles import BALANCED

# 1. Initialize the Kernel with a standard behavioral profile
kernel = GovernanceKernel(profile=BALANCED)

# 2. Simulate environmental signals on an agent step
result = kernel.step(reward=0.6, novelty=0.1, urgency=0.0)

if result.halted:
    print(f"TERMINATED: {result.failure}")
else:
    print(f"ALLOWED. Remaining Effort: {result.budget.effort:.2f}")

🏗️ Architecture

Agent-Harness intercepts execution across three distinct, modular layers. It strictly separates the "Mind" (Agent Reasoning) from the "Body" (Tool Execution).

LEVEL 1: High-Level System Model

Agent (Mind): The LLM controller generating tool inputs and plans. It never touches APIs directly.
Agent-Harness (Governance Runtime): The non-bypassable firewall. It translates abstract environmental signals into concrete behavioral budgets and blocks non-compliant behavior.
Execution Surface (Tools/APIs): The actual code functions, network requests, or database queries the agent wishes to execute.

LEVEL 2: Governance Pipeline

Every tool call goes through a deterministic processing flow before execution:

Agent outputs a tool request.
Proxy Enforcer intercepts the request and normalizes it.
Guardrail Stack scans the payload for malicious logic or PII.
Signal Evaluator continuously processes external feedback (reward, difficulty).
Governance Kernel translates signals into deterministic behavioral budgets.
Budget Decision fuses the state and issues a hard ALLOW or HALT.
Execute / Halt: The tool is executed, or a 403-Forbidden block terminates the session.
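
The ordering of the steps above can be sketched as a chain of stages, each of which may veto the request. This is an illustrative model only: the stage functions, the request dictionary keys, and the substring check are assumptions made for the example and do not reflect the real `ProxyEnforcer` or `GuardrailStack` interfaces.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A stage returns a halt reason, or None to let the request continue.
Stage = Callable[[Dict], Optional[str]]


def guardrail_stage(request: Dict) -> Optional[str]:
    # Assumed stand-in for a payload scan.
    return "guardrail violation" if "os.system(" in request["payload"] else None


def budget_stage(request: Dict) -> Optional[str]:
    # Assumed stand-in for the budget decision.
    return "effort exhausted" if request["effort"] < request["threshold"] else None


def run_pipeline(request: Dict, stages: List[Stage]) -> Tuple[int, str]:
    """Run stages in order; the first veto produces a 403 and stops the chain."""
    for stage in stages:
        reason = stage(request)
        if reason is not None:
            return 403, reason
    return 200, "ALLOW"


stages = [guardrail_stage, budget_stage]
ok = run_pipeline({"payload": 'search="Q3 revenue"', "effort": 0.98, "threshold": 0.2}, stages)
blocked = run_pipeline({"payload": "os.system('ls')", "effort": 0.98, "threshold": 0.2}, stages)
print(ok, blocked)
```

Putting the guardrail stage first means a dangerous payload is rejected even when plenty of budget remains.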

LEVEL 3: Budget Decision Loop

The internal kernel state update algorithm ensures that no action is taken without a fresh budget calculation.
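
One way to read "no action without a fresh budget calculation" is that the budget update and the allow/halt decision happen in the same call. The sketch below assumes a simple linear cost model; `ToyKernel` and all of its constants are hypothetical and are not the package's actual update rule.

```python
from dataclasses import dataclass


@dataclass
class ToyKernel:
    """Hypothetical budget loop; the constants are illustrative only."""
    effort: float = 1.0
    base_cost: float = 0.02      # assumed cost of any step
    failure_cost: float = 0.10   # assumed extra cost, scaled by (1 - reward)
    threshold: float = 0.2       # assumed exhaustion threshold
    halted: bool = False

    def step(self, reward: float) -> bool:
        """Recompute the budget, then decide. Returns True if the action may run."""
        if self.halted:                       # fail-closed: a halt is permanent
            return False
        reward = min(max(reward, 0.0), 1.0)   # clamp the signal to [0, 1]
        self.effort -= self.base_cost + self.failure_cost * (1.0 - reward)
        if self.effort < self.threshold:
            self.halted = True
        return not self.halted


kernel = ToyKernel()
allowed = [kernel.step(reward=0.0) for _ in range(10)]
print(allowed.count(True), round(kernel.effort, 2))
```

Because the decision is computed from the freshly updated `effort`, a caller cannot obtain an ALLOW based on a stale budget.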

Example System Behavior

Here is how Agent-Harness reacts to standard scenarios:

1. Successful Step (Allowed)

[KERNEL] Action: query_database (args: search="Q3 revenue")
[KERNEL] Signals: Reward=0.8, Novelty=0.1, Urgency=0.0
[KERNEL] Status: ALLOWED. 
[KERNEL] Remaining Effort: 0.98. Executing tool...

2. Blocked Tool Call (Guardrail Triggered)

[KERNEL] Action: execute_python (args: code="os.system('cat /etc/passwd')")
[GUARDRAIL] Triggered: code_execution 
[GUARDRAIL] Reason: Potentially dangerous code pattern detected (matched: \bos\.system\s*\()
[KERNEL] Status: HALTED. 
[KERNEL] Result: FailureType.SAFETY (Guardrail violation). Block 403.

3. Halted Agent (Budget Exhausted / Stagnation)

[KERNEL] Action: search_web (args: query="retry 104")
[KERNEL] Signals: Reward=0.0, Difficulty=1.0, Urgency=0.4
[KERNEL] ⚠️ Effort depleted to 0.15.
[KERNEL] Status: HALTED. 
[KERNEL] Result: FailureType.EXHAUSTION (effort_exhausted). Session terminated.
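
The blocked call in scenario 2 comes down to a regex match on the payload. Below is a minimal sketch of that idea. Only the `os.system` pattern is quoted from the log above; the `exec` and SSN-shaped patterns, the `Violation` type, and this standalone `check_all` function are assumptions for illustration, not the package's detectors.

```python
import re
from typing import NamedTuple, Optional


class Violation(NamedTuple):
    detector: str
    matched: str


# The os.system pattern appears in the example log; the others are assumed.
DETECTORS = {
    "code_execution": [re.compile(r"\bos\.system\s*\("), re.compile(r"\bexec\s*\(")],
    "pii": [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")],  # US SSN-shaped strings
}


def check_all(payload: str) -> Optional[Violation]:
    """Return the first violation found, or None if the payload looks clean."""
    for name, patterns in DETECTORS.items():
        for pattern in patterns:
            if pattern.search(payload):
                return Violation(name, pattern.pattern)
    return None


print(check_all("os.system('cat /etc/passwd')"))
print(check_all('search="Q3 revenue"'))
```

As the Limitations section notes, pattern matching of this kind can miss obfuscated payloads, for example a call assembled from string fragments at runtime.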

📦 Core Components

The src/governance/ package is built around these modules:

kernel.py (GovernanceKernel): The central state machine. Deterministic orchestrator that receives signals, evaluates budgets, and enforces terminal halt conditions based on progress accumulation.
guardrails.py (GuardrailStack): Pluggable security detectors that intercept and block dangerous payloads.
audit.py (HashChainedAuditLogger): Records a tamper-evident, hash-chained JSONL log of every system decision.
evaluation.py (SignalEvaluator): Normalizes external semantic signals before integrating them into the core control state.
profiles.py & behavior.py: Define agent temperament patterns (e.g., BALANCED, CONSERVATIVE).
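
To make the hash-chained audit log concrete, here is a minimal sketch of the general technique: each JSONL entry stores the hash of the previous entry, so editing any earlier line breaks verification. The field names (`prev_hash`, `decision`, `hash`), the all-zero genesis value, and the helper functions are assumptions; the actual `HashChainedAuditLogger` format may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first entry's prev_hash


def _digest(prev_hash: str, decision: dict) -> str:
    # Canonical JSON keeps the hash stable regardless of key order.
    body = json.dumps(decision, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(lines: list, decision: dict) -> None:
    """Append one JSON line that links back to the previous entry's hash."""
    prev_hash = json.loads(lines[-1])["hash"] if lines else GENESIS
    entry = {"prev_hash": prev_hash, "decision": decision,
             "hash": _digest(prev_hash, decision)}
    lines.append(json.dumps(entry, sort_keys=True))


def verify_chain(lines: list) -> bool:
    """Recompute every link; any edited or reordered line fails the check."""
    prev_hash = GENESIS
    for line in lines:
        entry = json.loads(line)
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["decision"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"step": 1, "status": "ALLOWED"})
append_entry(log, {"step": 2, "status": "HALTED", "failure": "EXHAUSTION"})
print(verify_chain(log))
```

A chain like this detects tampering with existing entries. Detecting truncation of the tail or wholesale replacement of the file would need an additional anchor, such as a signed or externally stored final hash.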

✅ The 15-Point AI Governance Checklist

Agent-Harness natively implements the 15-Point AI Governance Checklist via deterministic runtime execution mechanisms:

Governance Rule	Enforcement Mechanism
1. Unbounded Behavior (Prevent infinite loops)	Finite `effort` and `persistence` budgets deplete with action. Zero budget = Terminal Halt.
2. Runtime Control (Intervene during execution)	Dynamic `step()` evaluates signals on every step at runtime and updates control state immediately.
3. Deterministic Behavior (Same inputs → Same decision)	State transitions use hardcoded, versioned matrices. No random seeds exist in the Kernel.
4. Explainable Halting (Explicit halt reasons)	Halts return explicit `FailureType` flags (e.g., `OVERRISK`, `STAGNATION`) and precise string reasons.
5. Fail-Closed Semantics (Default to block)	Once halted, state freezes permanently. Proxy middleware returns 403 on any error.
6. Physical Enforcement (Physically block actions)	`ProxyEnforcer` intercepts all tool calls at the network level, forcing halts.
7. Auditability (Log every decision)	Hash-chained SHA256-linked entries in append-only JSONL files. CLI verification utility provided.
8. Accountability (Who authorized this?)	Multi-agent coordination mechanisms track `agent_id` and parentage for every logged action.
9. Risk Containment (Bound risky actions)	Dedicated `risk` budget accumulator. Hard caps trigger an immediate terminal halt.
10. Progress Discrimination (Busy ≠ Productive)	Stagnation detection windows identify and halt cycles of low-reward, busy-work activity.
11. Bad Telemetry Resilience (If sensors lie, slow down)	`trust` signal dampens positive feedback (reward) but correctly passes negative feedback (difficulty).
12. Model-Agnosticism (Work across vendors)	The Kernel consumes dimensional `Signals` (floats), completely independent of the LLM or embeddings used.
13. Human Override (Humans remain authority)	The `reset()` method requires explicit calls. There is no autonomous self-healing from terminal failures.
14. Compliance Readiness (Support reporting)	Hash-chained JSONL trace exports. Prometheus metrics natively broadcast at `/metrics`.
15. Scalability (Scale across multiple agents)	`SystemGovernor` and `SharedBudgetPool` manage swarms globally to prevent cascading swarm failures.
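
Rule 15 relies on a budget that is shared across agents. A rough sketch of that idea follows, assuming a lock-protected counter. `ToyBudgetPool` and `try_spend` are hypothetical names and are not the package's `SharedBudgetPool` API.

```python
import threading


class ToyBudgetPool:
    """Hypothetical shared pool; not the package's SharedBudgetPool."""

    def __init__(self, total: float) -> None:
        self._remaining = total
        self._lock = threading.Lock()
        self.spent_by_agent = {}

    def try_spend(self, agent_id: str, amount: float) -> bool:
        """Atomically reserve budget; refuse (fail closed) if the pool cannot cover it."""
        with self._lock:
            if amount < 0 or amount > self._remaining:
                return False
            self._remaining -= amount
            self.spent_by_agent[agent_id] = self.spent_by_agent.get(agent_id, 0.0) + amount
            return True

    @property
    def remaining(self) -> float:
        with self._lock:
            return self._remaining


pool = ToyBudgetPool(total=1.0)
print(pool.try_spend("agent-a", 0.5), pool.try_spend("agent-b", 0.5), pool.try_spend("agent-c", 0.25))
```

Tracking spend per `agent_id` also gives the accountability trail described in rule 8, since every reservation is attributed to a caller.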

📉 Budget Dynamics

Behavioral budgets (effort, persistence) are strictly bounded and monotonically depleting. Under sustained failure, the budget inevitably crosses the exhaustion threshold, forcing a terminal halt.
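
The "inevitably crosses" claim can be made quantitative under one assumption that the README does not state explicitly: every step under sustained failure costs at least some fixed amount $\delta > 0$. With initial budget $E_0$ and exhaustion threshold $\theta$ (symbols introduced here for illustration), the worst-case number of steps before a halt is:

```latex
E_{n} \le E_{0} - n\,\delta
\quad\Longrightarrow\quad
E_{n} < \theta \ \text{ whenever } \ n > \frac{E_{0}-\theta}{\delta},
\qquad
n^{*} = \left\lfloor \frac{E_{0}-\theta}{\delta} \right\rfloor + 1 .
```

For example, with the illustrative values $E_0 = 1$, $\theta = 0.2$, and $\delta = 0.1$, the halt occurs no later than step $n^{*} = 9$.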

🛑 Failure Semantics

Failure	Trigger	Consequence
EXHAUSTION	`Effort` drops below threshold	Terminal halt
STAGNATION	Zero reward beyond stagnation window	Terminal halt
OVERRISK	`Risk` exceeds maximum limit	Immediate 403
SAFETY	`Exploration` capacity exceeded	Session severed
EXTERNAL	Hard step fuse limit reached	Instant halt
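
The table above can be read as a set of threshold checks evaluated in a fixed order. The sketch below is one possible encoding. The member names follow the table, but `ToyFailureType`, the `classify` function, every threshold value, and the check order are assumptions for illustration only.

```python
from enum import Enum
from typing import Optional


class ToyFailureType(Enum):
    # Names follow the table above; this enum is a stand-in, not the package's FailureType.
    EXHAUSTION = "effort below threshold"
    STAGNATION = "zero reward beyond stagnation window"
    OVERRISK = "risk above maximum"
    SAFETY = "exploration capacity exceeded"
    EXTERNAL = "hard step fuse reached"


def classify(effort: float, risk: float, exploration: float,
             zero_reward_steps: int, steps: int, *,
             effort_min: float = 0.2, risk_max: float = 0.8,
             exploration_max: float = 1.0, stagnation_window: int = 5,
             step_fuse: int = 100) -> Optional[ToyFailureType]:
    """Return the first failure that applies, or None if the session is healthy."""
    if steps >= step_fuse:
        return ToyFailureType.EXTERNAL
    if risk > risk_max:
        return ToyFailureType.OVERRISK
    if exploration > exploration_max:
        return ToyFailureType.SAFETY
    if effort < effort_min:
        return ToyFailureType.EXHAUSTION
    if zero_reward_steps > stagnation_window:
        return ToyFailureType.STAGNATION
    return None


print(classify(effort=0.15, risk=0.1, exploration=0.0, zero_reward_steps=0, steps=3))
```

A fixed evaluation order keeps the reported halt reason deterministic when several limits are crossed in the same step.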

🛑 Failure Mode Progression

State transitions: Healthy → Warning (low ROI) → Critical (near boundary) → Terminal Halt (EXHAUSTION, STAGNATION, OVERRISK). A 403 permanently ends the session.

🔌 Integrations

The Harness detects native abstractions across common LLM agent frameworks and enforces boundaries transparently. Working wrapper examples are located in integrations/:

LangChain: Intercepts AgentExecutor loops (langchain_ollama.py, langchain_openai.py).
CrewAI: Limits task iterations and monitors role boundaries (crewai_ollama.py, crewai_openai.py).
AutoGen: Hooks into UserProxyAgent conversations (autogen_ollama.py, autogen_openai.py).
OpenAI SDK: Provides direct function-calling guardrails (openai_sdk.py, openai_sdk_ollama.py).

📂 Project Structure

Agent-Harness/
├── src/
│   └── governance/         # Core kernel, budgets, evaluators, mechanisms
├── problems/               # S-class problem sets and benchmarks
├── integrations/           # Connectors for LangChain, AutoGen, CrewAI, OpenAI
├── examples/               # Demonstrations of layered governance and edge cases
├── tests/                  # PyTest suites verifying mathematical limits
├── docs/                   # Deep dives and visual assets
├── deployment/             # Docker, Compose, and Dashboards
├── scripts/                # Utility and maintenance scripts
└── config/                 # YAML configuration bindings

⚡ Performance & Efficiency

Agent-Harness is designed for high-frequency runtime interception with near-zero overhead. Values below are from a real benchmark of 10,000 kernel.step() calls on commodity hardware.

Metric	Value	Notes
Step Latency (Median)	~0.06ms	Full kernel evaluation (signal processing + budget update).
Step Latency (P99)	~0.23ms	Worst-case tail latency under sustained load.
Guardrail Check (Median)	~0.07ms	Full `GuardrailStack.check_all()` with 3 detectors.
Throughput	~13k step()/sec	Single-threaded on standard hardware.
Memory (Kernel Delta)	~2 KB	Kernel + internal state after 100 steps (excludes Python runtime).
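
For readers who want to collect comparable numbers on their own hardware, the following is a small timing harness. It is a sketch, not the benchmark script used for the table above. The `benchmark` helper is hypothetical, and the lambda is a placeholder workload that you would replace with a call such as `kernel.step(...)`.

```python
import statistics
import time


def benchmark(fn, iterations: int = 10_000) -> dict:
    """Time `fn` repeatedly and report median, P99, and throughput."""
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()
    p99_index = min(len(samples_ms) - 1, int(0.99 * len(samples_ms)))
    total_s = sum(samples_ms) / 1000.0
    return {
        "median_ms": statistics.median(samples_ms),
        "p99_ms": samples_ms[p99_index],
        "calls_per_sec": iterations / total_s if total_s > 0 else float("inf"),
    }


# Placeholder workload; swap in the call you actually want to measure.
stats = benchmark(lambda: sum(range(100)), iterations=1_000)
print(stats)
```

Results from a harness like this vary with CPU, Python version, and background load, so they are best compared across runs on the same machine.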

🛡️ Anti-Gaming & Robustness

Native anti-gaming logic lives in extractor.py:

Yapping Detection: Decays trust when agent produces high internal activity with zero environment change.
Fake Success Anchoring: Rejects agent-claimed success if env_state_delta hasn't moved.
Novelty Debt: Prevents budget resets via "new but useless" actions.
State Cycling Detection: Environment hashing penalizes oscillating behaviors.
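
Two of the checks above lend themselves to a short illustration: anchoring claimed success to an observed environment change, and hashing environment states to spot oscillation. The function names, the `eps` tolerance, and the JSON-based hashing are assumptions for this sketch and are not taken from extractor.py.

```python
import hashlib
import json
from collections import Counter


def state_hash(env_state: dict) -> str:
    """Stable fingerprint of an environment snapshot."""
    return hashlib.sha256(json.dumps(env_state, sort_keys=True).encode("utf-8")).hexdigest()


def anchored_reward(claimed_success: bool, env_state_delta: float, eps: float = 1e-6) -> float:
    """Accept a claimed success only if the environment actually moved."""
    return 1.0 if claimed_success and abs(env_state_delta) > eps else 0.0


def revisit_counts(states: list) -> Counter:
    """Count how often each environment state recurs; repeats suggest cycling."""
    return Counter(state_hash(s) for s in states)


trajectory = [{"page": 1}, {"page": 2}, {"page": 1}, {"page": 2}, {"page": 1}]
most_seen = max(revisit_counts(trajectory).values())
print(anchored_reward(True, 0.0), most_seen)
```

A governor could then scale down reward, or raise a penalty, once the revisit count for any state passes a chosen limit.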

📊 Observability and Logging

# Live terminal dashboard
python -m governance.visualizer

# Prometheus metrics export
curl http://localhost:8080/metrics

# Verify audit chain integrity
python scripts/replay_audit.py verify audit_chain.jsonl

🔐 Security Model

Fail-Closed: Exceeding any budget translates into FailureType flags and physical session termination. No auto-recovery—humans must explicitly reset().
Pattern Isolation: GuardrailStack catches malicious patterns via regex without LLM inference.
Read-Only Access: External systems cannot mutate ControlState except through deterministic evaluator signals.

🛡️ Threat Model

Agent-Harness enforces bounded execution. It is designed with a specific threat landscape in mind.

What it PROTECTS against:

Runaway Loops: Agents getting stuck hallucinating the same failing tool calls indefinitely.
Prompt Isolation Breakdown: Adversarial inputs (Prompt Injections) overriding the system prompt and hijacking the agent's goals.
Unauthorized OS Probing: LLMs attempting to execute arbitrary code (e.g., os.system(), exec()) when they shouldn't.
Unbounded Costs: Infinite loops draining LLM API token budgets.
Data Exfiltration (Basic): Pattern-based PII leakage in tool outputs or inputs.

What it DOES NOT protect against:

Semantic Hallucinations: If the underlying evaluator feeds "fake" rewards to the Harness, the Harness cannot know the progress is hallucinated.
Sophisticated Obfuscation: The current guardrails use regex/heuristics and may miss highly encoded or obfuscated attacks.
Host System Compromise: Agent-Harness bounds the agent's logic, but sandbox isolation (like Docker) is still required to secure the host machine.

🧪 Testing & Validation Rigor

Agent-Harness is validated against extreme operational conditions to ensure zero-bypass governance:

Event Horizon Benchmarks: 15 internal problem sets mapping complex semantic failures (from the Black-Hole problem set) to deterministic kernel signal patterns (RL Generalization, Distributed Consensus, etc.).
Monte Carlo Stress Tests: A high-intensity suite (monte_carlo_stress.py) that executes 12,000 randomized trajectories per run to verify that governance invariants hold under chaotic conditions.
Adversarial Pattern Suite: Unit tests verifying the GuardrailStack against known prompt injection, PII leakage, and dangerous code patterns (regex-based detection).
Multi-Agent Swarm Simulations: Validated coordination of swarm-level constraints via SharedBudgetPool and CascadeDetector tests.
Policy Enforcement Scaling: Continuous stability validation across 1,000+ sequential policy evaluations to ensure zero memory leaks or state drift.
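
The idea behind randomized invariant testing can be shown on a toy budget model. The sketch below does not reproduce monte_carlo_stress.py. The cost constants, the invariants checked, and the function names are all assumptions chosen so the example is self-contained.

```python
import random


def run_trajectory(rng: random.Random, max_steps: int = 500) -> dict:
    """Drive a toy effort budget with random rewards until it halts."""
    effort, threshold = 1.0, 0.2
    halted_at = None
    for step in range(max_steps):
        reward = rng.random()
        effort -= 0.02 + 0.10 * (1.0 - reward)   # assumed cost model
        if effort < threshold:
            halted_at = step
            break                                 # fail-closed: nothing runs after a halt
    return {"halted_at": halted_at, "final_effort": effort}


def check_invariants(trials: int = 1000, seed: int = 0) -> bool:
    """Return True if every random trajectory respects the toy invariants."""
    rng = random.Random(seed)
    for _ in range(trials):
        result = run_trajectory(rng)
        # Invariant 1: every trajectory halts, because each step costs at least 0.02.
        if result["halted_at"] is None:
            return False
        # Invariant 2: the halt arrives within 41 steps (0-based index <= 40).
        if result["halted_at"] > 40:
            return False
        # Invariant 3: the budget never overshoots by more than one worst-case step (0.12).
        if result["final_effort"] < 0.08 - 1e-9:
            return False
    return True


print(check_invariants())
```

Seeding the generator keeps each run reproducible, which matters when a failing trajectory needs to be replayed.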

⚠️ Limitations

Heuristic Reliance: The built-in PromptInjectionDetector, PIIDetector, and CodeExecutionGuard use hardcoded regex strategies. They may raise false positives on highly contextual payloads or miss deeply obfuscated attacks.
Signal Definition: The engine trusts the semantic signals (reward, novelty) provided by the broader orchestrator. If the orchestrator feeds "fake" rewards, the Engine cannot know it is hallucinating.
Coarse Reset Handling: Because halting is permanent for safety, workflows hitting false-positive stagnation must manage their own session checkpoints (kernel.reset()) conservatively.

🗺️ Future Projections & Research

The Agent-Harness roadmap focuses on broadening the scope of autonomous safety. Key research directions include:

OS-Level Isolation: Exploring eBPF and Kernel-level enforcement to move security boundaries outside the application runtime.
Mechanical Signal Extraction: Researching reasoning-free telemetry to eliminate evaluation bias.
Policy & Swarms: Enhancing multi-agent coordination and decentralized budget pooling.
LLM-assisted Semantic Guardrails: Moving beyond regex into fast, quantized models evaluating signal context natively.
Dynamic Threshold Selection: Automatically pivoting between CONSERVATIVE and AGGRESSIVE profiles based on historical session hashes.

For a full list of documentation and research papers, see the Master Index.

Project details

Release history

1.2.0 (this version): Mar 14, 2026
1.0.0: Feb 6, 2026
0.7.0: Jan 30, 2026

Download files

Source Distribution

agentharnessengine-1.2.0.tar.gz (657.3 kB)

Uploaded Mar 14, 2026 Source

Built Distribution

agentharnessengine-1.2.0-py3-none-any.whl (75.8 kB)

Uploaded Mar 14, 2026 Python 3

File details

Details for the file agentharnessengine-1.2.0.tar.gz.

File metadata

Download URL: agentharnessengine-1.2.0.tar.gz
Upload date: Mar 14, 2026
Size: 657.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agentharnessengine-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`408f9390859a185d54c98ed50ec8a6941a5623dbabad66e61ee24031bc8c1c6b`
MD5	`50c326cdeb8694c063faf7e81b4cd006`
BLAKE2b-256	`f6a3c817a024c3ff3c85acad3f9ae51372a9a9604d432fe1facc9cb97d4f97aa`

Provenance

The following attestation bundles were made for agentharnessengine-1.2.0.tar.gz:

Publisher: publish.yml on Sarthaksahu777/Agent-Harness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentharnessengine-1.2.0.tar.gz
- Subject digest: 408f9390859a185d54c98ed50ec8a6941a5623dbabad66e61ee24031bc8c1c6b
- Sigstore transparency entry: 1101586438
- Sigstore integration time: Mar 14, 2026
Source repository:
- Permalink: Sarthaksahu777/Agent-Harness@c865a748d4e89303a96a9861fe058046f3ee5a3b
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/Sarthaksahu777
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c865a748d4e89303a96a9861fe058046f3ee5a3b
- Trigger Event: release

File details

Details for the file agentharnessengine-1.2.0-py3-none-any.whl.

File metadata

Download URL: agentharnessengine-1.2.0-py3-none-any.whl
Upload date: Mar 14, 2026
Size: 75.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agentharnessengine-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`79d8c50323a1cdbfaf2046446cbe57c9523ad213341ba6b923368f0e2881e829`
MD5	`4f755a17ffe689414fcbe7c801d99065`
BLAKE2b-256	`3544db86fa9b900a7b82b3bd00134117ff5a98a0d67e9944fa5fdb6165faf43e`

Provenance

The following attestation bundles were made for agentharnessengine-1.2.0-py3-none-any.whl:

Publisher: publish.yml on Sarthaksahu777/Agent-Harness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentharnessengine-1.2.0-py3-none-any.whl
- Subject digest: 79d8c50323a1cdbfaf2046446cbe57c9523ad213341ba6b923368f0e2881e829
- Sigstore transparency entry: 1101586443
- Sigstore integration time: Mar 14, 2026
Source repository:
- Permalink: Sarthaksahu777/Agent-Harness@c865a748d4e89303a96a9861fe058046f3ee5a3b
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/Sarthaksahu777
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c865a748d4e89303a96a9861fe058046f3ee5a3b
- Trigger Event: release
