An exploration of making an agent SDK as lean as possible while remaining effective.

minimal-harness

Documentation: /docs

A lightweight Python agent harness for building LLM-powered agents with tool-calling support.

Latest version: 0.6.0a1

What This Project Is For

Minimal-harness is a lean framework for building agents that can call tools. It provides:

  • OpenAI/Anthropic-compatible API - Works with OpenAI, Anthropic, or any OpenAI-compatible API provider
  • Multi-modal image input - Pass image URLs or base64 data to LLM providers supporting vision
  • Symmetric Registry + Factory architecture - Register tool/agent metadata with bindings (LocalToolBinding, RemoteToolBinding, ExternalScriptToolBinding); executable instances created lazily by ToolFactory
  • Middleware hooks - Observe and intercept the agent lifecycle (agent start/end, LLM calls, tool execution, tool policy enforcement)
  • AsyncIterator events - Real-time async iteration for chunks, tool start/end, execution events
  • Conversation memory sessions - Persistent sessions with identity (user_id, scenario_id), auto-persisted to disk
  • Remote agents & tools - Execute agents and tools remotely via SSE over HTTP; pluggable driver/executor protocols
  • Batch evaluation - Built-in eval module for running agent evaluation suites and generating reports
  • ESC stop support - Gracefully stop LLM streaming and tool execution

Architecture

The framework uses a three-layer architecture:

Layer 3: Application (TUI client)
Layer 2: Service Abstractions (AgentRuntime, Registry, SessionStore, Factory, Remote drivers)
Layer 1: Core Abstractions (Agent, Tool, Memory, LLMProvider, AgentEvent/ToolEvent)

All event types are defined in src/minimal_harness/types.py. No separate client event layer exists.

Event flow:

async for event in agent.run(
    user_input=[{"type": "text", "text": "..."}],
    memory=memory,
    tools=tools,
):
    if isinstance(event, LLMChunk):
        ...  # handle chunk
    elif isinstance(event, ToolEnd):
        ...  # handle tool result
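
The dispatch loop above can be tried without a model by feeding it stand-in events. This is a minimal sketch under stated assumptions: FakeChunk, FakeToolEnd, fake_run and consume are hypothetical names for illustration only; the real event classes live in minimal_harness.types and carry different fields.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in for LLMChunk; carries a piece of text."""
    text: str


@dataclass
class FakeToolEnd:
    """Hypothetical stand-in for ToolEnd; carries a tool result."""
    result: str


async def fake_run() -> AsyncIterator[object]:
    """Yield a fixed event sequence, the way an agent run yields events."""
    yield FakeChunk("Listing ")
    yield FakeToolEnd("a.txt b.txt")
    yield FakeChunk("done.")


async def consume() -> tuple[str, list[str]]:
    """Dispatch on event type, collecting streamed text and tool results."""
    text_parts: list[str] = []
    tool_results: list[str] = []
    async for event in fake_run():
        if isinstance(event, FakeChunk):
            text_parts.append(event.text)
        elif isinstance(event, FakeToolEnd):
            tool_results.append(event.result)
    return "".join(text_parts), tool_results
```

Calling asyncio.run(consume()) returns the joined text and the list of tool results, which is a convenient shape for testing UI code that sits on top of the event stream.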

How to Build an App

Project Structure

A typical app looks like this:

my-app/
├── cli.py          # Entry point
└── tools.py        # Your custom tools

1a. Layer 1 — Direct Control

import argparse
import asyncio
import os
from openai import AsyncOpenAI

from minimal_harness.agent.simple import SimpleAgent
from minimal_harness.llm.openai import OpenAILLMProvider
from minimal_harness.memory import ConversationMemory
from minimal_harness.tool.built_in.bash import get_tools as get_bash_tools
from minimal_harness.types import (
    AgentStart,
    AgentEnd,
    LLMChunk,
    ToolStart,
    ToolEnd,
)

def main():
    parser = argparse.ArgumentParser(description="My AI agent")
    # Fall back to the MH_* environment variables when a flag is omitted.
    parser.add_argument("--base-url", default=os.environ.get("MH_BASE_URL"))
    parser.add_argument("--api-key", default=os.environ.get("MH_API_KEY"))
    parser.add_argument("--model", default=os.environ.get("MH_MODEL", "qwen3.5-27b"))
    args = parser.parse_args()
    if not args.base_url or not args.api_key:
        parser.error("--base-url and --api-key are required (or set MH_BASE_URL and MH_API_KEY)")

    client = AsyncOpenAI(base_url=args.base_url, api_key=args.api_key)
    llm_provider = OpenAILLMProvider(client=client, model=args.model)
    agent = SimpleAgent(llm_provider=llm_provider, max_iterations=50)
    memory = ConversationMemory()
    tools = list(get_bash_tools().values())

    async def run():
        stop_event = asyncio.Event()
        context = {"user_id": "abc123"}  # passed to middleware hooks
        async for event in agent.run(
            user_input=[{"type": "text", "text": "What files are in the current directory?"}],
            stop_event=stop_event,
            memory=memory,
            tools=tools,
            context=context,
        ):
            if isinstance(event, AgentStart):
                print("Agent starting...")
            elif isinstance(event, LLMChunk):
                delta = event.chunk
                if delta and delta.content:
                    print(delta.content, end="", flush=True)
            elif isinstance(event, ToolStart):
                print(f"\n[Calling tool: {event.tool_call['function']['name']}]")
            elif isinstance(event, ToolEnd):
                print(f"\n[Tool result: {str(event.result)[:100]}...]")
            elif isinstance(event, AgentEnd):
                print(f"\n[Done in {event.time_taken:.2f}s]")
                break

    asyncio.run(run())

if __name__ == "__main__":
    main()

1b. Layer 2 — Managed Orchestration

from minimal_harness.agent.runtime import AgentRuntime
from minimal_harness.agent.registry import AgentRegistry
from minimal_harness.tool.registry import ToolRegistry, collect_builtin_tools
from minimal_harness.client.built_in.memory_store import DiskSessionStore
from minimal_harness.types import AgentMetadata

# Registration and session setup are async, so run this inside a coroutine,
# e.g. via asyncio.run(main(...)).
async def main(user_message: str) -> None:
    tool_registry = ToolRegistry()
    await collect_builtin_tools(tool_registry)

    agent_registry = AgentRegistry()
    await agent_registry.register(AgentMetadata(
        name="assistant", display_name="Assistant",
        description="General assistant",
        system_prompt="You are helpful.", agent_type="simple",
        tool_names=["bash", "local_file_operation"],
    ))

    store = DiskSessionStore()
    runtime = AgentRuntime(
        agent_registry=agent_registry,
        session_store=store,
        tool_registry=tool_registry,
        # create_llm_provider(...) is a placeholder for your own provider factory.
        llm_provider_factory=lambda: create_llm_provider(...),
    )
    await runtime.register_runtime_tools()

    session = await store.create_session()
    task, stop, queue = runtime.run(
        user_input=[{"type": "text", "text": user_message}],
        agent_metadata_id="assistant",
        memory_id=session.session_id,
    )
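
The snippet above stops at the (task, stop, queue) triple without showing how to consume it. The sketch below shows one generic way to drain a queue fed by a background task. It assumes, without confirmation from this README, that task behaves like an asyncio.Task, stop like an asyncio.Event, and queue like an asyncio.Queue; producer, drain and demo are hypothetical helpers, not part of minimal-harness.

```python
import asyncio


async def producer(queue: asyncio.Queue, stop: asyncio.Event) -> None:
    """Hypothetical stand-in for the runtime's background task."""
    for i in range(5):
        if stop.is_set():
            break
        await queue.put(i)
        await asyncio.sleep(0)


async def drain(task: asyncio.Task, queue: asyncio.Queue) -> list:
    """Collect queued items until the task has finished and the queue is empty."""
    items = []
    while not (task.done() and queue.empty()):
        try:
            # Short timeout so the loop re-checks task.done() regularly.
            items.append(await asyncio.wait_for(queue.get(), timeout=0.05))
        except asyncio.TimeoutError:
            continue
    await task  # re-raises any exception from the producer
    return items


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    stop = asyncio.Event()
    task = asyncio.create_task(producer(queue, stop))
    return await drain(task, queue)
```

Polling with a timeout avoids relying on a sentinel value, since the README does not say whether the runtime puts one on the queue.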

2. Add Custom Tools

Tools are defined as async generator functions and registered via ToolMetadata + Binding:

from collections.abc import AsyncIterator

from minimal_harness.tool.registry import ToolRegistry
from minimal_harness.types import ToolMetadata, LocalToolBinding

registry = ToolRegistry()

async def get_weather(location: str) -> AsyncIterator[dict]:
    yield {"success": True, "result": f"The weather in {location} is sunny."}

await registry.register(ToolMetadata(
    name="get_weather",
    display_name="Get Weather",
    description="Get weather for a location",
    parameters={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
    binding=LocalToolBinding(fn=get_weather),
))
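
Because a tool is a plain async generator function, it can be exercised on its own before it is registered. A small sketch: get_weather is copied from the example above, and collect is a hypothetical helper that is not part of minimal-harness.

```python
import asyncio
from collections.abc import AsyncIterator


async def get_weather(location: str) -> AsyncIterator[dict]:
    yield {"success": True, "result": f"The weather in {location} is sunny."}


async def collect(tool_output: AsyncIterator[dict]) -> list[dict]:
    """Gather every chunk an async-generator tool yields."""
    return [chunk async for chunk in tool_output]


if __name__ == "__main__":
    print(asyncio.run(collect(get_weather("Paris"))))
```

This makes it easy to unit test a tool's output without an agent, an LLM provider, or a registry.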

Or use the @register_tool decorator. The recommended pattern is to omit registry and call register_decorated_tools() during async setup:

from collections.abc import AsyncIterator

from minimal_harness.tool.registration import register_tool, register_decorated_tools

@register_tool(
    name="get_weather",
    description="Get weather for a location",
    parameters={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
    # registry=...  # optional — see below
)
async def get_weather(location: str) -> AsyncIterator[dict]:
    yield {"success": True, "result": f"The weather in {location} is sunny."}

# Later, during async setup:
await register_decorated_tools(registry)

For remote tools, use RemoteToolBinding:

from minimal_harness.types import RemoteToolBinding

await registry.register(ToolMetadata(
    name="weather",
    description="Get weather",
    parameters={...},
    binding=RemoteToolBinding(url="https://my-service.com/weather"),
))

For external script tools, use ExternalScriptToolBinding:

from minimal_harness.types import ExternalScriptToolBinding

await registry.register(ToolMetadata(
    name="my_tool",
    description="...",
    parameters={...},
    binding=ExternalScriptToolBinding(script_path="/path/to/tool.py"),
))

Localized tool output: Tools can detect the user's language at runtime via get_current_locale():

from collections.abc import AsyncIterator

from minimal_harness.agent.runtime import get_current_locale

async def my_tool() -> AsyncIterator[dict]:
    locale = get_current_locale()
    yield {"message": "你好" if locale == "zh" else "Hello"}
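
A per-run value such as a locale is commonly carried in a context variable so that concurrent runs do not see each other's setting. The sketch below shows that technique with the standard contextvars module. It is an assumption that get_current_locale() works along these lines; _locale, get_locale_sketch, greet and run_with_locale are hypothetical names.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical context variable; "en" is an assumed default.
_locale = ContextVar("locale", default="en")


def get_locale_sketch() -> str:
    """Return the locale visible to the current task."""
    return _locale.get()


async def greet() -> str:
    return "你好" if get_locale_sketch() == "zh" else "Hello"


async def run_with_locale(locale: str) -> str:
    """Set the locale for this task only, then call the tool-like function."""
    token = _locale.set(locale)
    try:
        return await greet()
    finally:
        _locale.reset(token)


async def run_both() -> list[str]:
    # Each gathered coroutine runs in its own task with its own context copy.
    return await asyncio.gather(run_with_locale("zh"), run_with_locale("en"))
```

Because asyncio gives every task a copy of the context, the two runs in run_both do not interfere with each other.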

3. Run

python cli.py --base-url https://api.openai.com/v1 --api-key sk-... --model gpt-4o

Or set environment variables:

export MH_BASE_URL=https://api.openai.com/v1
export MH_API_KEY=sk-...
export MH_MODEL=gpt-4o
python cli.py

Middleware Hooks

Subclass Middleware to observe or intercept the agent lifecycle:

from minimal_harness.agent.middleware import Middleware
from minimal_harness.types import LLMEnd, ToolCall

class PolicyEnforcer(Middleware):
    async def should_allow_tool(
        self, tool_call: ToolCall, **kwargs
    ) -> bool | str:
        if tool_call["function"]["name"] == "bash":
            return "bash is not permitted in this context"
        return True

    async def on_llm_end(self, event: LLMEnd) -> None:
        if event.usage:
            print(f"Tokens: {event.usage['total_tokens']}")

Pass middleware to SimpleAgent:

agent = SimpleAgent(
    llm_provider=llm_provider,
    middleware=[PolicyEnforcer()],
    max_iterations=50,
)
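
The bool | str return type of should_allow_tool reads naturally as "True to allow, a string to deny with a reason". The sketch below shows one way a caller could interpret that contract across several policies. How minimal-harness actually combines middleware verdicts is not documented here, so the first-denial-wins rule, DenyBash and check_policies are all assumptions for illustration.

```python
import asyncio
from typing import Optional


class DenyBash:
    """Hypothetical policy: True allows the call, a string denies it with a reason."""

    async def should_allow_tool(self, tool_call: dict, **kwargs) -> object:
        if tool_call["function"]["name"] == "bash":
            return "bash is not permitted in this context"
        return True


async def check_policies(policies: list, tool_call: dict) -> tuple[bool, Optional[str]]:
    """Return (allowed, reason); the first policy that does not return True wins."""
    for policy in policies:
        verdict = await policy.should_allow_tool(tool_call)
        if verdict is not True:
            reason = verdict if isinstance(verdict, str) else "denied"
            return False, reason
    return True, None
```

Returning the reason alongside the decision lets the caller feed it back to the model as the tool result, so the model can choose a different approach.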

Multi-modal Image Input

Pass image URLs or base64-encoded image data as input content parts:

user_input = [
    {"type": "text", "text": "What's in this image?"},
    {
        "type": "image",
        "image_url": {"url": "https://example.com/photo.jpg"},
    },
]

For local images, encode as base64:

import base64

with open("photo.jpg", "rb") as f:
    data = base64.b64encode(f.read()).decode()

user_input = [
    {"type": "text", "text": "Describe this image"},
    {
        "type": "image",
        "data": data,
        "media_type": "image/jpeg",
    },
]
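
The base64 content part above hard-codes the media type. A small helper can derive it from the file name instead. This is a sketch: image_part is a hypothetical helper, not part of minimal-harness, and it only reproduces the dictionary shape shown above.

```python
import base64
import mimetypes


def image_part(filename: str, raw: bytes) -> dict:
    """Build a base64 image content part; the media type is guessed from the name."""
    media_type, _ = mimetypes.guess_type(filename)
    if media_type is None or not media_type.startswith("image/"):
        raise ValueError(f"not a recognised image file name: {filename}")
    return {
        "type": "image",
        "data": base64.b64encode(raw).decode("ascii"),
        "media_type": media_type,
    }
```

Typical use is image_part("photo.jpg", open("photo.jpg", "rb").read()), appended to the user_input list after the text part.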

Built-in Tools

Register them in bulk via collect_builtin_tools():

from minimal_harness.tool.registry import collect_builtin_tools
await collect_builtin_tools(tool_registry)  # returns set[str] of names

Tool                  Description
bash                  Execute shell commands with timeout and workdir support
local_file_operation  Read, write, patch, or delete files (4 universal modes)

Event Types

All events are defined in minimal_harness.types and consumed as a single AgentEvent union:

Event           Fields                                         Description
AgentStart      user_input, timestamp                          Agent execution started
AgentEnd        response, time_taken, exceeded, interrupted    Agent execution completed
LLMStart        messages, tools                                LLM generation started
LLMChunk        chunk: LLMChunkDelta | None                    LLM output chunk received
LLMEnd          content, reasoning_content, tool_calls, usage  LLM generation completed
ExecutionStart  tool_calls                                     Tool execution started
ExecutionEnd    results                                        Tool execution completed
ToolStart       tool_call                                      Tool call started
ToolProgress    tool_call, chunk                               Tool intermediate progress
ToolEnd         tool_call, result                              Tool call completed with result
MemoryUpdate    usage                                          Memory token usage updated

LLMChunkDelta contains content, reasoning, and tool_calls fields for provider-agnostic partial deltas.

Batch Evaluation

The eval module runs agent evaluation suites and generates metrics reports:

python -m minimal_harness.eval.runner \
    --eval-suite my_suite.json \
    --results-dir ./eval_results

Or call the runner from Python:

from minimal_harness.eval.runner import EvalRunner
from minimal_harness.eval.types import EvalCase

runner = EvalRunner(registry, runtime)
report = await runner.run([
    EvalCase(input="Sort [3,1,2]", expected="[1,2,3]"),
])
print(report.summary())  # pass_rate, avg_score, etc.

See docs/eval-guide.md for details.
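
To make a metric such as pass_rate concrete, here is a minimal sketch of exact-match scoring. It is not the eval module's implementation: CaseResult and this pass_rate function are hypothetical, and the real module may score cases differently.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Hypothetical record pairing an expected answer with the agent's answer."""
    expected: str
    actual: str

    @property
    def passed(self) -> bool:
        # Exact match after trimming whitespace; real suites may score differently.
        return self.expected.strip() == self.actual.strip()


def pass_rate(results: list[CaseResult]) -> float:
    """Fraction of cases that passed; 0.0 for an empty suite."""
    if not results:
        return 0.0
    return sum(r.passed for r in results) / len(results)
```

For two cases where only one answer matches, pass_rate returns 0.5.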

Remote Agents

Register agents that execute on a remote service via SSE over HTTP:

from minimal_harness.types import AgentMetadata, RemoteAgentBinding

await agent_registry.register(AgentMetadata(
    name="remote_coder",
    binding=RemoteAgentBinding(
        url="https://my-agent-service.example.com/run",
        headers={"Authorization": "Bearer xxx"},
    ),
))

This creates a RemoteAgent backed by SSEAgentDriver. Implement RemoteAgentDriver for custom transports.
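
For anyone writing a custom driver, it helps to see what SSE framing looks like on the wire. The sketch below parses generic Server-Sent Events text lines into (event, data) pairs. It covers only the standard framing; the event names and payloads that minimal-harness sends are not documented here, and parse_sse is a hypothetical helper rather than part of the library.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip it if no data arrived.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

Multiple data lines within one event are joined with newlines, and events without an explicit event field default to the name "message".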

Environment Variables

Variable           Description
MH_BASE_URL        API base URL
MH_API_KEY         API key
MH_MODEL           Model name (default: qwen3.5-27b)
MH_MAX_ITERATIONS  Max agent loop iterations (default: 50)
MH_THEME           TUI theme name (default: tokyo-night)
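
If you write your own entry point, the same variables can be read with the defaults listed above. A minimal sketch: Settings and load_settings are hypothetical names, and this is not how the bundled client loads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    base_url: Optional[str]
    api_key: Optional[str]
    model: str
    max_iterations: int
    theme: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the MH_* variables, applying the documented defaults."""
    return Settings(
        base_url=env.get("MH_BASE_URL"),
        api_key=env.get("MH_API_KEY"),
        model=env.get("MH_MODEL", "qwen3.5-27b"),
        max_iterations=int(env.get("MH_MAX_ITERATIONS", "50")),
        theme=env.get("MH_THEME", "tokyo-night"),
    )
```

Passing the environment as a mapping keeps the function easy to test with a plain dictionary.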

Stop Mechanism

Press ESC during execution to gracefully stop LLM streaming and tool execution.
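
Outside the TUI, the same effect can be achieved with the stop_event argument shown in the Layer 1 example. The sketch below illustrates cooperative stopping with an asyncio.Event in isolation. stream_tokens and demo_stop are hypothetical, and the point at which minimal-harness checks the event is an assumption.

```python
import asyncio
from collections.abc import AsyncIterator


async def stream_tokens(tokens: list[str], stop_event: asyncio.Event) -> AsyncIterator[str]:
    """Yield tokens until the list is exhausted or stop_event is set."""
    for token in tokens:
        if stop_event.is_set():
            return  # stop cleanly between tokens
        yield token


async def demo_stop() -> list[str]:
    stop_event = asyncio.Event()
    emitted: list[str] = []
    async for token in stream_tokens(["a", "b", "c", "d"], stop_event):
        emitted.append(token)
        if token == "b":
            stop_event.set()  # stands in for the ESC key handler
    return emitted
```

The producer checks the event between tokens rather than being cancelled, so it always finishes the item it is working on before it stops.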
