Production-grade reusable AI agent infrastructure base

llm-harness

Production-grade reusable agent infrastructure base — ~10,000 lines, 290 tests.

Build an AI agent by defining your tools, writing your skills, and choosing a provider. Everything else — ReAct loop, tool pipeline, permissions, hooks, session persistence, memory consolidation, observability — is handled by the harness.

from agent_harness import AgentLoop, LoopCallbacks, ToolRegistry, AnthropicProvider

tools = ToolRegistry()
tools.register(MyBusinessTool())

callbacks = LoopCallbacks(
    build_messages=...,               # your system prompt
    execute_tool=...,                 # your tool execution
    get_tool_definitions=lambda: tools.to_api_schema("anthropic"),
)

agent = AgentLoop(AnthropicProvider(api_key="..."), callbacks)
result = await agent.process_direct("Do the thing")  # call from inside an async function
print(result.final_content)

Why This Exists

Option	Problem
LangChain/LangGraph	300K+ lines, 50+ dependencies, constant API churn
From scratch	Rebuild loop, retry, registry, session, permissions... every time
llm-harness	~10K lines. Read in an afternoon. Fork without fear. 290 tests

Architecture

Each tool call goes through:
  LLM → Permission.check → Hook.execute(PRE_TOOL_USE) → Tool.execute → Hook.execute(POST_TOOL_USE) → LLM
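
The ordering of that tool-call path can be sketched in plain asyncio. This is an illustration only, not the harness's own code: `run_tool_call`, `PermissionDenied`, `is_allowed`, and the hook signature are hypothetical names chosen for the example.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical types for illustration; the real harness classes may differ.
Hook = Callable[[str, str, dict], Awaitable[None]]  # (stage, tool_name, payload)
Tool = Callable[[dict], Awaitable[str]]


class PermissionDenied(Exception):
    """Raised when the permission check rejects a tool call."""


async def run_tool_call(
    name: str,
    args: dict,
    tools: dict[str, Tool],
    is_allowed: Callable[[str, dict], bool],
    hooks: list[Hook],
) -> str:
    # 1. The permission check runs before anything else.
    if not is_allowed(name, args):
        raise PermissionDenied(f"tool {name!r} blocked")
    # 2. Pre-tool hooks observe the call before execution.
    for hook in hooks:
        await hook("PRE_TOOL_USE", name, {"args": args})
    # 3. The tool itself executes.
    output = await tools[name](args)
    # 4. Post-tool hooks observe the result before it goes back to the LLM.
    for hook in hooks:
        await hook("POST_TOOL_USE", name, {"args": args, "output": output})
    return output


async def demo() -> tuple[str, list[str]]:
    seen: list[str] = []

    async def audit(stage: str, tool_name: str, payload: dict) -> None:
        seen.append(f"{stage}:{tool_name}")

    async def greet(args: dict) -> str:
        return f"Hello, {args['name']}!"

    out = await run_tool_call(
        "greet", {"name": "Alice"}, {"greet": greet},
        is_allowed=lambda n, a: n != "shell", hooks=[audit],
    )
    return out, seen


result, stages = asyncio.run(demo())
```

Because the permission check comes first, a rejected call never reaches a hook or the tool.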

Each conversation turn goes through:
  Message → AgentLoop → Provider.chat_with_retry → (tool calls? → execute → loop) → Text response
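
A rough sketch of that turn loop, assuming a provider that returns either tool calls or final text. The function names, message shape, and backoff parameters here are hypothetical and only mirror the flow above; they are not the library's actual signatures.

```python
import asyncio
import random
from typing import Awaitable, Callable


async def chat_with_retry(
    call: Callable[[], Awaitable[dict]],
    max_attempts: int = 3,
    base_delay: float = 0.01,
) -> dict:
    """Retry a provider call with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # The delay doubles per attempt; jitter spreads out concurrent retries.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1) * (1 + random.random()))
    raise RuntimeError("unreachable")


async def run_turn(provider, execute_tool, messages: list[dict], max_steps: int = 8) -> str:
    """Loop until the model answers with text instead of tool calls."""
    for _ in range(max_steps):
        reply = await chat_with_retry(lambda: provider(messages))
        if not reply.get("tool_calls"):
            return reply["content"]
        for call in reply["tool_calls"]:
            output = await execute_tool(call["name"], call["args"])
            messages.append({"role": "tool", "name": call["name"], "content": output})
    raise RuntimeError("step limit reached without a text response")


async def demo() -> str:
    state = {"calls": 0}

    async def fake_provider(messages: list[dict]) -> dict:
        state["calls"] += 1
        if state["calls"] == 1:
            raise ConnectionError("transient failure")  # exercises the retry path
        if not any(m["role"] == "tool" for m in messages):
            return {"tool_calls": [{"name": "greet", "args": {"name": "Alice"}}]}
        return {"content": f"Tool said: {messages[-1]['content']}"}

    async def execute_tool(name: str, args: dict) -> str:
        return f"Hello, {args['name']}!"

    return await run_turn(
        fake_provider, execute_tool, [{"role": "user", "content": "Greet Alice!"}]
    )


final = asyncio.run(demo())
```

The step limit is a design choice in this sketch: it keeps a model that never stops calling tools from looping forever.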

Every event flows through:
  Any module → EventBus → Tracker (JSONL file) / Prometheus / Dashboard
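
The fan-out above amounts to publish/subscribe. The sketch below uses a hypothetical `EventBus` class and an in-memory list in place of the JSONL file. It illustrates the pattern only and should not be read as the harness's implementation.

```python
import asyncio
import json
from dataclasses import asdict, dataclass
from typing import Awaitable, Callable


@dataclass
class ToolExecutionCompleted:
    # Field names follow the sample events in the Observability section.
    tool_name: str
    duration_ms: float


Subscriber = Callable[[object], Awaitable[None]]


class EventBus:
    """Hypothetical minimal bus: every subscriber receives every event."""

    def __init__(self) -> None:
        self._subscribers: list[Subscriber] = []

    def subscribe(self, fn: Subscriber) -> None:
        self._subscribers.append(fn)

    async def publish(self, event: object) -> None:
        # return_exceptions=True keeps one failing sink from cancelling the others.
        await asyncio.gather(
            *(fn(event) for fn in self._subscribers), return_exceptions=True
        )


async def demo() -> tuple[list[str], dict[str, float]]:
    bus = EventBus()
    lines: list[str] = []          # stands in for the JSONL track file
    latency: dict[str, float] = {}  # stands in for a metrics backend

    async def jsonl_sink(event: object) -> None:
        lines.append(json.dumps({"type": type(event).__name__, "data": asdict(event)}))

    async def metrics_sink(event: object) -> None:
        if isinstance(event, ToolExecutionCompleted):
            latency[event.tool_name] = event.duration_ms

    bus.subscribe(jsonl_sink)
    bus.subscribe(metrics_sink)
    await bus.publish(ToolExecutionCompleted("web_search", 123.4))
    return lines, latency


lines, latency = asyncio.run(demo())
```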

llm-harness/
  loop/             ReAct skeleton + concurrency (per-session Lock + Semaphore)
  tools/            24 built-in tools + config-driven builder
  providers/        Anthropic + OpenAI-compatible (25 backends), retry + backoff
  permissions/      Sensitive path protection, 3 modes, path/cmd rules
  hooks/            PreToolUse/PostToolUse, 4 hook types (cmd/http/prompt/agent)
  security/         SSRF protection (DNS + private IP blocking)
  sandbox/          OS-level isolation (srt CLI wrapper)
  session/          JSONL persistence + legal boundary alignment
  memory/           Two-tier (MEMORY.md + HISTORY.md) + LLM consolidation
  skills/           .md loading + dependency checking
  cron/             Scheduler (at/every/cron) + persistence
  mcp/              MCP stdio/SSE/HTTP, tools as BaseTool subclasses
  channels/         BaseChannel ABC + ChannelManager (WebSocket, Telegram...)
  commands/         4-tier slash command router
  plugins/          Discovery + manifest loading
  auth/             Credential storage (file + keyring + encryption)
  prompts/          AGENTS.md discovery + environment + SectionProviders
  tasks/            Background subprocess manager + stdout capture
  coordinator/      Subagent spawning with restricted tools
  state/            Observable state store (get/set/subscribe)
  config/           Multi-layer (CLI > env > file > defaults)
  observability/    Structured events + EventBus + JSONL tracker (auto-start)
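
The loop/ entry mentions a per-session Lock combined with a Semaphore. One way such a scheme can work is sketched below. `SessionGate` is a hypothetical name, and the actual module may be structured differently.

```python
import asyncio
from collections import defaultdict


class SessionGate:
    """Serialize work within one session; cap concurrent work across all sessions."""

    def __init__(self, max_concurrent: int) -> None:
        self._locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)
        self._slots = asyncio.Semaphore(max_concurrent)

    async def run(self, session_key: str, coro_fn):
        # Take the session lock first so a queued message for a busy session
        # does not hold one of the global slots while it waits.
        async with self._locks[session_key]:
            async with self._slots:
                return await coro_fn()


async def demo() -> list[str]:
    gate = SessionGate(max_concurrent=2)
    log: list[str] = []

    async def handle(tag: str) -> None:
        log.append(f"start {tag}")
        await asyncio.sleep(0.01)
        log.append(f"end {tag}")

    await asyncio.gather(
        gate.run("cli:a", lambda: handle("a1")),
        gate.run("cli:a", lambda: handle("a2")),  # same session: waits for a1
        gate.run("cli:b", lambda: handle("b1")),  # other session: runs alongside a1
    )
    return log


log = asyncio.run(demo())
```

Messages for the same session are processed in order, while different sessions proceed in parallel up to the semaphore limit.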

Quick Start

pip install "llm-harness[all]"

import asyncio
from pathlib import Path
from agent_harness import (
    AgentLoop, LoopCallbacks, ToolRegistry, BaseTool,
    ToolResult, ToolExecutionContext, AnthropicProvider,
)
from pydantic import BaseModel, Field

class GreetInput(BaseModel):
    name: str = Field(description="Who to greet")

class GreetTool(BaseTool):
    name = "greet"
    description = "Greet someone"
    input_model = GreetInput

    async def execute(self, args, ctx):
        return ToolResult(output=f"Hello, {args.name}!")

tools = ToolRegistry()
tools.register(GreetTool())

async def _exec(tools, name, args):
    tool = tools.get(name)
    parsed = tool.input_model.model_validate(args)
    result = await tool.execute(parsed, ToolExecutionContext(cwd=Path.cwd()))
    return result.output

callbacks = LoopCallbacks(
    build_messages=lambda msg: [
        {"role": "system", "content": "You are a friendly assistant."},
        {"role": "user", "content": msg.content},
    ],
    execute_tool=lambda name, args: _exec(tools, name, args),
    get_tool_definitions=lambda: tools.to_api_schema("anthropic"),
    on_event=lambda e: print(f"[{type(e).__name__}]"),  # optional observability
)

agent = AgentLoop(AnthropicProvider(api_key="..."), callbacks)

async def main():
    result = await agent.process_direct("Greet Alice!")
    print(result.final_content)

asyncio.run(main())

Config-Driven Setup

{
  "agent": { "model": "claude-sonnet-4-6" },
  "tools": { "enabled": ["web_search", "message", "write_memory"] },
  "permission": { "mode": "default" },
  "observability": { "track_file": "~/.llm-harness/track.jsonl" }
}

import asyncio
from agent_harness import load_config, build_tools_from_config, start_tracker_from_config

async def main():
    config = load_config()
    tools = build_tools_from_config(config.tools)
    tracker = await start_tracker_from_config(config)  # auto-starts if configured

asyncio.run(main())
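
The Architecture section lists the config precedence as CLI > env > file > defaults. How `load_config` merges these layers is not shown here, so the following is only a sketch of that precedence rule using the standard library's `ChainMap`. The flat dotted keys and the override values are assumptions made for the example.

```python
from collections import ChainMap

# Lowest-priority layer: built-in defaults (values taken from the sample config above).
defaults = {"agent.model": "claude-sonnet-4-6", "permission.mode": "default"}
# Config file layer.
file_cfg = {"observability.track_file": "~/.llm-harness/track.jsonl"}
# Environment layer (illustrative override).
env_cfg = {"observability.track_file": "/tmp/track.jsonl"}
# CLI layer, highest priority ("example-model" is a placeholder, not a real model name).
cli_cfg = {"agent.model": "example-model"}

# ChainMap searches its mappings left to right, so the first layer that has a key wins.
config = ChainMap(cli_cfg, env_cfg, file_cfg, defaults)

model = config["agent.model"]
track_file = config["observability.track_file"]
mode = config["permission.mode"]
```

A nested config like the JSON above would need a recursive merge instead of a flat lookup. The precedence order stays the same.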

Observability

Zero-config by default. Set observability.track_file in config to auto-start JSONL tracking:

{"type":"SessionOpened","ts":"...","data":{"session_key":"cli:test"}}
{"type":"ToolExecutionStarted","ts":"...","data":{"tool_name":"web_search","tool_input":{...}}}
{"type":"ToolExecutionCompleted","ts":"...","data":{"tool_name":"web_search","output":"...","is_error":false,"duration_ms":123.4}}
{"type":"AssistantTurnComplete","ts":"...","data":{"content":"Done","usage":{"prompt_tokens":10,"completion_tokens":5}}}
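
Because the tracker writes one JSON object per line, the file can be analyzed with the standard library alone. The sketch below aggregates `ToolExecutionCompleted` events per tool. The sample lines reuse the field names shown above, but their values (timestamps, outputs, durations) are made up for the example.

```python
import json
from collections import defaultdict
from statistics import mean

# Illustrative events in the same shape as the tracker output; values are invented.
sample = """\
{"type":"SessionOpened","ts":"t0","data":{"session_key":"cli:test"}}
{"type":"ToolExecutionCompleted","ts":"t1","data":{"tool_name":"web_search","output":"ok","is_error":false,"duration_ms":100.0}}
{"type":"ToolExecutionCompleted","ts":"t2","data":{"tool_name":"web_search","output":"ok","is_error":false,"duration_ms":200.0}}
{"type":"ToolExecutionCompleted","ts":"t3","data":{"tool_name":"message","output":"","is_error":true,"duration_ms":50.0}}
"""


def summarize(lines):
    """Return call count, mean latency, and error count per tool."""
    durations = defaultdict(list)
    errors = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        event = json.loads(line)
        if event["type"] != "ToolExecutionCompleted":
            continue
        data = event["data"]
        durations[data["tool_name"]].append(data["duration_ms"])
        errors[data["tool_name"]] += int(data["is_error"])
    return {
        name: {"calls": len(ms), "mean_ms": mean(ms), "errors": errors[name]}
        for name, ms in durations.items()
    }


summary = summarize(sample.splitlines())
```

For a real track file, pass the open file object to `summarize` instead of `sample.splitlines()`.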

Or subscribe programmatically for real-time metrics:

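# Assumption: ToolExecutionCompleted is exported by the observability package.
from agent_harness.observability import get_event_bus, ToolExecutionCompleted

async def prometheus_collector(event):
    if isinstance(event, ToolExecutionCompleted):
        # histogram() is a placeholder for your metrics client, not part of llm-harness
        histogram(f"tool.{event.tool_name}.latency_ms", event.duration_ms)

get_event_bus().subscribe(prometheus_collector)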

Deployment

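# One Deployment per agent scenario
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cs-agent
spec:
  replicas: 3
  selector:                      # required by apps/v1; label value is illustrative
    matchLabels:
      app: cs-agent
  template:
    metadata:
      labels:
        app: cs-agent
    spec:
      containers:
      - name: agent              # container name is required; value is illustrative
        image: llm-harness:latest
        env:
        - name: AGENT_SCENARIO
          value: "customer-service"
        volumeMounts:
        - name: tools
          mountPath: /app/tools
        - name: skills
          mountPath: /app/skills
      # Define matching `tools` and `skills` entries under `volumes:` for your cluster.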

Kafka: topic:customer-service → cs-agent (3 pods)
       topic:code-review      → cr-agent (2 pods)
       topic:ops-automation   → ops-agent (1 pod)

Installation

pip install llm-harness                 # base
pip install "llm-harness[anthropic]"    # + Claude
pip install "llm-harness[openai]"       # + OpenAI
pip install "llm-harness[all]"          # everything
pip install "llm-harness[dev]"          # + pytest, ruff

Requirements

Core: Python >= 3.10, pydantic >= 2.0, httpx >= 0.27, pyyaml >= 6.0, mcp >= 1.0, croniter >= 2.0, json-repair >= 0.57
Optional: anthropic, openai, ddgs, readability-lxml

Tests

290 passed, 9 skipped, 0 failed

The 9 skipped tests cover optional dependencies (ddgs, readability-lxml). Install those packages to enable them.

Design Principles

Callback injection, not inheritance. LoopCallbacks dataclass holds all app-specific behavior. The loop knows nothing about your tools, channels, or prompts.
Config-driven. Switch agent behavior via JSON. Tools, permissions, provider, sandbox, observability — all configurable without code changes.
Transport-agnostic. BaseChannel defines the contract. WebSocket, HTTP, gRPC, Telegram — same interface.
You own the code. ~10,000 lines. Fork it. Modify it. No framework to learn.
Production observability. Structured events, EventBus, JSONL tracker, auto-start from config. Zero overhead when disabled.
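
The first principle, callback injection instead of inheritance, can be shown in miniature. `Callbacks` and `Loop` below are hypothetical stand-ins for `LoopCallbacks` and `AgentLoop`, reduced to two callables. They sketch the pattern and do not reproduce the library's fields.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class Callbacks:
    """App-specific behavior as plain callables; there is no base class to subclass."""
    build_messages: Callable[[str], list[dict]]
    respond: Callable[[list[dict]], Awaitable[str]]


class Loop:
    """Knows only the callback contract, nothing about prompts or models."""

    def __init__(self, callbacks: Callbacks) -> None:
        self._cb = callbacks

    async def process(self, text: str) -> str:
        return await self._cb.respond(self._cb.build_messages(text))


async def echo_model(messages: list[dict]) -> str:
    # Toy "model" that upper-cases the last message.
    return messages[-1]["content"].upper()


loop = Loop(Callbacks(
    build_messages=lambda text: [{"role": "user", "content": text}],
    respond=echo_model,
))
answer = asyncio.run(loop.process("hello"))
```

Swapping behavior means passing different callables. The loop class itself never changes, which also makes it straightforward to test with fakes.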

License

MIT — see LICENSE.

Credits

Extracted and refined from two mature open-source agent projects:

OpenHarness — tools, permissions, hooks, skills, sandbox, plugins, tasks
nanobot — agent loop, providers, message bus, session, memory, cron, channels

Release history

0.2.0 (this version): May 24, 2026
0.1.1: May 23, 2026
