An exploration of making an agent SDK as lean as possible while remaining effective.

minimal-harness

Documentation: /docs

A lightweight Python agent harness for building LLM-powered agents with tool-calling support.

Latest version: 0.6.0.post1

What This Project Is For

Minimal-harness is a lean framework for building agents that can call tools. It provides:

  • OpenAI/Anthropic-compatible API - Works with OpenAI, Anthropic, or any OpenAI-compatible API provider
  • Multi-modal image input - Pass image URLs or base64 data to LLM providers supporting vision
  • Symmetric Registry + Factory architecture - Register tool/agent metadata with bindings (LocalToolBinding, RemoteToolBinding, ExternalScriptToolBinding); executable instances created lazily by ToolFactory
  • Middleware hooks - Observe and intercept the agent lifecycle (agent start/end, LLM calls, tool execution, tool policy enforcement)
  • AsyncIterator events - Real-time async iteration for chunks, tool start/end, execution events
  • Conversation memory sessions - Persistent sessions with identity (user_id, scenario_id), auto-persisted to disk
  • Remote agents & tools - Execute agents and tools remotely via SSE over HTTP; pluggable driver/executor protocols
  • Batch evaluation - Built-in eval module for running agent evaluation suites and generating reports
  • ESC stop support - Gracefully stop LLM streaming and tool execution
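
The Registry + Factory item in the list above can be illustrated without the library. The sketch below is a standalone illustration, not minimal-harness code: `ToolMeta`, `Registry`, and `Factory` are hypothetical names, the `build` callable stands in for a binding, and caching the created instance is an assumption rather than documented ToolFactory behavior.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ToolMeta:
    """Hypothetical metadata record: a name plus a zero-argument builder."""
    name: str
    build: Callable[[], object]  # plays the role of a binding


class Registry:
    """Stores metadata only; registering never constructs anything."""

    def __init__(self) -> None:
        self._items: Dict[str, ToolMeta] = {}

    def register(self, meta: ToolMeta) -> None:
        self._items[meta.name] = meta

    def get(self, name: str) -> ToolMeta:
        return self._items[name]


class Factory:
    """Builds an instance the first time it is requested, then reuses it."""

    def __init__(self, registry: Registry) -> None:
        self._registry = registry
        self._instances: Dict[str, object] = {}

    def create(self, name: str) -> object:
        if name not in self._instances:
            # Assumption: instances are cached after the first build.
            self._instances[name] = self._registry.get(name).build()
        return self._instances[name]
```

Keeping registration separate from construction means metadata can be listed or validated without paying the cost of building every tool.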

Architecture

The framework uses a three-layer architecture:

Layer 3: Application (TUI client)
Layer 2: Service Abstractions (AgentRuntime, Registry, SessionStore, Factory, Remote drivers)
Layer 1: Core Abstractions (Agent, Tool, Memory, LLMProvider, AgentEvent/ToolEvent)

All event types are defined in src/minimal_harness/types.py. No separate client event layer exists.

Event flow:

async for event in agent.run(
    user_input=[{"type": "text", "text": "..."}],
    memory=memory,
    tools=tools,
):
    if isinstance(event, LLMChunk):
        ...  # handle chunk
    elif isinstance(event, ToolEnd):
        ...  # handle tool result
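
The dispatch pattern above can be tried without an API key. The following sketch is not the library's code: `LLMChunk` and `ToolEnd` are simplified local dataclasses rather than the classes in minimal_harness.types, and `fake_run` is a hypothetical generator standing in for `agent.run`.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass
from typing import Union


# Simplified stand-ins for illustration only.
@dataclass
class LLMChunk:
    content: str


@dataclass
class ToolEnd:
    result: str


async def fake_run() -> AsyncIterator[Union[LLMChunk, ToolEnd]]:
    """Hypothetical replacement for agent.run(): yields a fixed event sequence."""
    yield LLMChunk("Listing ")
    yield LLMChunk("files")
    yield ToolEnd("a.txt b.txt")


async def consume() -> tuple:
    """Dispatch on event type, collecting streamed text and tool results."""
    text, results = [], []
    async for event in fake_run():
        if isinstance(event, LLMChunk):
            text.append(event.content)
        elif isinstance(event, ToolEnd):
            results.append(event.result)
    return "".join(text), results
```

Calling `asyncio.run(consume())` returns the joined text and the list of tool results.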

How to Build an App

Project Structure

A typical app looks like this:

my-app/
├── cli.py          # Entry point
└── tools.py        # Your custom tools

1a. Layer 1 — Direct Control

import argparse
import asyncio
import os

from openai import AsyncOpenAI

from minimal_harness.agent.simple import SimpleAgent
from minimal_harness.llm.openai import OpenAILLMProvider
from minimal_harness.memory import ConversationMemory
from minimal_harness.tool.built_in.bash import get_tools as get_bash_tools
from minimal_harness.types import (
    AgentStart,
    AgentEnd,
    LLMChunk,
    ToolStart,
    ToolEnd,
)

def main():
    parser = argparse.ArgumentParser(description="My AI agent")
    # Flags fall back to the MH_* environment variables, so `python cli.py` also works.
    parser.add_argument("--base-url", default=os.environ.get("MH_BASE_URL"))
    parser.add_argument("--api-key", default=os.environ.get("MH_API_KEY"))
    parser.add_argument("--model", default=os.environ.get("MH_MODEL", "deepseek-v4-flash"))
    args = parser.parse_args()
    if not args.base_url or not args.api_key:
        parser.error("set --base-url/--api-key or MH_BASE_URL/MH_API_KEY")

    client = AsyncOpenAI(base_url=args.base_url, api_key=args.api_key)
    llm_provider = OpenAILLMProvider(client=client, model=args.model)
    agent = SimpleAgent(llm_provider=llm_provider, max_iterations=50)
    memory = ConversationMemory()
    tools = list(get_bash_tools().values())

    async def run():
        stop_event = asyncio.Event()
        context = {"user_id": "abc123"}  # passed to middleware hooks
        async for event in agent.run(
            user_input=[{"type": "text", "text": "What files are in the current directory?"}],
            stop_event=stop_event,
            memory=memory,
            tools=tools,
            context=context,
        ):
            if isinstance(event, AgentStart):
                print("Agent starting...")
            elif isinstance(event, LLMChunk):
                delta = event.chunk
                if delta and delta.content:
                    print(delta.content, end="", flush=True)
            elif isinstance(event, ToolStart):
                print(f"\n[Calling tool: {event.tool_call['function']['name']}]")
            elif isinstance(event, ToolEnd):
                print(f"\n[Tool result: {str(event.result)[:100]}...]")
            elif isinstance(event, AgentEnd):
                print(f"\n[Done in {event.time_taken:.2f}s]")
                break

    asyncio.run(run())

if __name__ == "__main__":
    main()

1b. Layer 2 — Managed Orchestration

from minimal_harness.agent.runtime import AgentRuntime
from minimal_harness.agent.registry import AgentRegistry
from minimal_harness.tool.registry import ToolRegistry, collect_builtin_tools
from minimal_harness.client.built_in.memory_store import DiskSessionStore
from minimal_harness.types import AgentMetadata

# The awaits below must run inside an async function (for example, one passed to asyncio.run).
tool_registry = ToolRegistry()
await collect_builtin_tools(tool_registry)

agent_registry = AgentRegistry()
await agent_registry.register(AgentMetadata(
    name="assistant", display_name="Assistant",
    description="General assistant",
    system_prompt="You are helpful.", agent_type="simple",
    tool_names=["bash", "local_file_operation"],
))

store = DiskSessionStore()
runtime = AgentRuntime(
    agent_registry=agent_registry,
    session_store=store,
    tool_registry=tool_registry,
    llm_provider_factory=lambda: create_llm_provider(...),
)
await runtime.register_runtime_tools()

session = await store.create_session()
task, stop, queue = runtime.run(
    user_input=[{"type": "text", "text": user_message}],
    agent_metadata_id="assistant",
    memory_id=session.session_id,
)

2. Add Custom Tools

Tools are defined as async generator functions and registered via ToolMetadata + Binding:

from collections.abc import AsyncIterator

from minimal_harness.tool.registry import ToolRegistry
from minimal_harness.types import ToolMetadata, LocalToolBinding

registry = ToolRegistry()

async def get_weather(location: str) -> AsyncIterator[dict]:
    yield {"success": True, "result": f"The weather in {location} is sunny."}

await registry.register(ToolMetadata(
    name="get_weather",
    display_name="Get Weather",
    description="Get weather for a location",
    parameters={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
    binding=LocalToolBinding(fn=get_weather),
))
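
Because a tool is an async generator, it can also be driven directly, for example in a unit test. The sketch below uses only the standard library; `call_tool` is a hypothetical helper that is not part of minimal-harness, and treating the last yielded dict as the final result is an assumption.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Any, Callable


async def get_weather(location: str) -> AsyncIterator[dict]:
    yield {"success": True, "result": f"The weather in {location} is sunny."}


async def call_tool(fn: Callable[..., AsyncIterator[dict]], **kwargs: Any) -> list:
    """Hypothetical test helper: run a tool function and collect everything it yields."""
    return [chunk async for chunk in fn(**kwargs)]


chunks = asyncio.run(call_tool(get_weather, location="Oslo"))
final = chunks[-1]  # assumption: the last yielded dict is the final result
```

A tool that reports progress would simply yield more than once, and the same helper would collect every chunk in order.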

Or use the @register_tool decorator. The recommended pattern is to omit the registry argument and call register_decorated_tools() during async setup:

from collections.abc import AsyncIterator

from minimal_harness.tool.registration import register_tool, register_decorated_tools

@register_tool(
    name="get_weather",
    description="Get weather for a location",
    parameters={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
    # registry=...  # optional; if omitted, call register_decorated_tools(registry) during async setup
)
async def get_weather(location: str) -> AsyncIterator[dict]:
    yield {"success": True, "result": f"The weather in {location} is sunny."}

# Later, during async setup:
await register_decorated_tools(registry)

For remote tools, use RemoteToolBinding:

from minimal_harness.types import RemoteToolBinding

await registry.register(ToolMetadata(
    name="weather",
    description="Get weather",
    parameters={...},
    binding=RemoteToolBinding(url="https://my-service.com/weather"),
))

For external script tools, use ExternalScriptToolBinding:

from minimal_harness.types import ExternalScriptToolBinding

await registry.register(ToolMetadata(
    name="my_tool",
    description="...",
    parameters={...},
    binding=ExternalScriptToolBinding(script_path="/path/to/tool.py"),
))

Localized tool output: Tools can detect the user's language at runtime via get_current_locale():

from collections.abc import AsyncIterator

from minimal_harness.agent.runtime import get_current_locale

async def my_tool() -> AsyncIterator[dict]:
    locale = get_current_locale()
    yield {"message": "你好" if locale == "zh" else "Hello"}

3. Run

python cli.py --base-url https://api.openai.com/v1 --api-key sk-... --model gpt-4o

Or set environment variables:

export MH_BASE_URL=https://api.openai.com/v1
export MH_API_KEY=sk-...
export MH_MODEL=gpt-4o
python cli.py

Middleware Hooks

Subclass Middleware to observe or intercept the agent lifecycle:

from minimal_harness.agent.middleware import Middleware
from minimal_harness.types import LLMEnd, ToolCall

class PolicyEnforcer(Middleware):
    async def should_allow_tool(
        self, tool_call: ToolCall, **kwargs
    ) -> bool | str:
        if tool_call["function"]["name"] == "bash":
            return "bash is not permitted in this context"
        return True

    async def on_llm_end(self, event: LLMEnd) -> None:
        if event.usage:
            print(f"Tokens: {event.usage['total_tokens']}")

Pass middleware to SimpleAgent:

agent = SimpleAgent(
    llm_provider=llm_provider,
    middleware=[PolicyEnforcer()],
    max_iterations=50,
)
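
The `bool | str` return type of should_allow_tool suggests that a string carries a denial reason. How the harness combines verdicts from several middleware is not stated here, so the following is only a sketch under assumptions: `DenyBash` is a stand-in that does not subclass the real Middleware, `check_policies` is a hypothetical helper, and the rule "True allows, a string denies with that reason, False denies without one" is assumed.

```python
import asyncio
from typing import Optional, Union


class DenyBash:
    """Stand-in with the same should_allow_tool shape as PolicyEnforcer above."""

    async def should_allow_tool(self, tool_call: dict, **kwargs) -> Union[bool, str]:
        if tool_call["function"]["name"] == "bash":
            return "bash is not permitted in this context"
        return True


async def check_policies(policies: list, tool_call: dict) -> Optional[str]:
    """Return None if every policy allows the call, else the first denial reason."""
    for policy in policies:
        verdict = await policy.should_allow_tool(tool_call)
        if verdict is True:
            continue
        # Assumed convention: a string is the reason, a bare False has none.
        return verdict if isinstance(verdict, str) else "denied by policy"
    return None
```

Stopping at the first denial keeps the reason shown to the model unambiguous; a real harness might instead collect every reason.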

Multi-modal Image Input

Pass image URLs or base64-encoded image data as input content parts:

user_input = [
    {"type": "text", "text": "What's in this image?"},
    {
        "type": "image",
        "image_url": {"url": "https://example.com/photo.jpg"},
    },
]

For local images, encode as base64:

import base64

with open("photo.jpg", "rb") as f:
    data = base64.b64encode(f.read()).decode()

user_input = [
    {"type": "text", "text": "Describe this image"},
    {
        "type": "image",
        "data": data,
        "media_type": "image/jpeg",
    },
]
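
The base64 steps above can be wrapped in a small helper. This is a sketch using only the standard library: `image_part` is a hypothetical name, the media type is guessed from the file extension, and the `image/jpeg` fallback is an arbitrary choice for this example.

```python
import base64
import mimetypes
from pathlib import Path


def image_part(path: str) -> dict:
    """Build a base64 image content part in the shape shown above."""
    media_type, _ = mimetypes.guess_type(path)
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "image",
        "data": data,
        # Fallback is an assumption of this sketch, not library behavior.
        "media_type": media_type or "image/jpeg",
    }
```

It can then be used as `user_input = [{"type": "text", "text": "Describe this image"}, image_part("photo.jpg")]`.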

Built-in Tools

Register them in bulk via collect_builtin_tools():

from minimal_harness.tool.registry import collect_builtin_tools
await collect_builtin_tools(tool_registry)  # returns set[str] of names

Tool                  Description
bash                  Execute shell commands with timeout and workdir support
local_file_operation  Read, write, patch, or delete files (4 universal modes)

Event Types

All events are defined in minimal_harness.types and consumed as a single AgentEvent union:

Event           Fields                                          Description
AgentStart      user_input, timestamp                           Agent execution started
AgentEnd        response, time_taken, exceeded, interrupted     Agent execution completed
LLMStart        messages, tools                                 LLM generation started
LLMChunk        chunk: LLMChunkDelta | None                     LLM output chunk received
LLMEnd          content, reasoning_content, tool_calls, usage   LLM generation completed
ExecutionStart  tool_calls                                      Tool execution started
ExecutionEnd    results                                         Tool execution completed
ToolStart       tool_call                                       Tool call started
ToolProgress    tool_call, chunk                                Tool intermediate progress
ToolEnd         tool_call, result                               Tool call completed with result
MemoryUpdate    usage                                           Memory token usage updated
MessageEvent    message                                         Conversation message added to memory

LLMChunkDelta contains content, reasoning, and tool_calls fields for provider-agnostic partial deltas.
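
Partial deltas are typically joined on the consumer side. The sketch below shows one way to do that; `Delta` is a local stand-in that models only the content and reasoning fields named above (tool_calls is left out), and `accumulate` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delta:
    """Stand-in for LLMChunkDelta with only the two text fields."""
    content: Optional[str] = None
    reasoning: Optional[str] = None


def accumulate(deltas: list) -> tuple:
    """Join partial deltas into (content, reasoning), skipping None chunks."""
    content, reasoning = [], []
    for delta in deltas:
        if delta is None:  # LLMChunk.chunk is typed as LLMChunkDelta | None
            continue
        if delta.content:
            content.append(delta.content)
        if delta.reasoning:
            reasoning.append(delta.reasoning)
    return "".join(content), "".join(reasoning)
```

Keeping reasoning text in a separate buffer lets a client display or hide it independently of the answer.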

Batch Evaluation

The eval module runs agent evaluation suites and generates metrics reports:

python -m minimal_harness.eval.runner \
    --eval-suite my_suite.json \
    --results-dir ./eval_results

Or run cases from Python:

from minimal_harness.eval.runner import EvalRunner
from minimal_harness.eval.types import EvalCase

runner = EvalRunner(registry, runtime)
report = await runner.run([
    EvalCase(input="Sort [3,1,2]", expected="[1,2,3]"),
])
print(report.summary())  # pass_rate, avg_score, etc.

See docs/eval-guide.md for details.

Remote Agents

Register agents that execute on a remote service via SSE over HTTP:

from minimal_harness.types import AgentMetadata, RemoteAgentBinding

await agent_registry.register(AgentMetadata(
    name="remote_coder",
    binding=RemoteAgentBinding(
        url="https://my-agent-service.example.com/run",
        headers={"Authorization": "Bearer xxx"},
    ),
))

This creates a RemoteAgent backed by SSEAgentDriver. Implement RemoteAgentDriver for custom transports.
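
For readers writing a custom transport, it may help to see what SSE framing looks like on the wire. The parser below follows the standard Server-Sent Events line format and uses only the standard library. It is not the SSEAgentDriver implementation: `parse_sse` is a hypothetical name, and both JSON payloads and the event names used in the usage example are assumptions, since the actual wire schema is not described here.

```python
import json
from collections.abc import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE message: {"event": name, "data": decoded JSON}."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current message.
            if data:
                yield {"event": event, "data": json.loads("\n".join(data))}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
```

For example, feeding it the lines `event: LLMChunk`, `data: {"content": "hi"}`, and an empty line yields a single message with event name `LLMChunk`.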

Environment Variables

Variable           Description
MH_BASE_URL        API base URL (default: https://aihubmix.com/v1)
MH_API_KEY         API key
MH_MODEL           Model name (default: deepseek-v4-flash)
MH_MAX_ITERATIONS  Max agent loop iterations (default: 100)
MH_THEME           TUI theme name (default: tokyo-night)
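
A custom entry point can read the same variables. The sketch below uses the defaults listed above; `Settings` and `load_settings` are hypothetical names for this example, not part of minimal-harness.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    base_url: str
    api_key: Optional[str]
    model: str
    max_iterations: int
    theme: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the MH_* variables, falling back to the documented defaults."""
    return Settings(
        base_url=env.get("MH_BASE_URL", "https://aihubmix.com/v1"),
        api_key=env.get("MH_API_KEY"),  # no default; None when unset
        model=env.get("MH_MODEL", "deepseek-v4-flash"),
        max_iterations=int(env.get("MH_MAX_ITERATIONS", "100")),
        theme=env.get("MH_THEME", "tokyo-night"),
    )
```

Accepting the environment as a parameter makes the function easy to test with a plain dict.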

Stop Mechanism

Press ESC during execution to gracefully stop LLM streaming and tool execution.
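
In your own code, the same effect comes from setting the asyncio.Event passed as stop_event to agent.run. The sketch below shows the cooperative pattern in isolation; `stream_tokens` is a hypothetical stand-in for a streaming LLM call, and checking the event between chunks is an assumption about how such a stop is usually implemented, not a description of the library internals.

```python
import asyncio
from collections.abc import AsyncIterator


async def stream_tokens(tokens: list, stop_event: asyncio.Event) -> AsyncIterator[str]:
    """Stand-in for a streaming call that checks stop_event between chunks."""
    for token in tokens:
        if stop_event.is_set():
            return  # stop cleanly instead of raising
        yield token
        await asyncio.sleep(0)  # give other tasks (e.g. a key listener) a turn


async def demo() -> list:
    stop_event = asyncio.Event()
    received = []
    async for token in stream_tokens(["a", "b", "c", "d"], stop_event):
        received.append(token)
        if token == "b":
            stop_event.set()  # simulates the user pressing ESC
    return received
```

Because the check happens between chunks, the consumer always sees complete tokens and the generator exits without an exception.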
