Skip to main content

Minimal filesystem agent harness with provider-backed Responses-like models.

Project description

ThinHarness


A minimal, opinionated agent harness —
focused scope, readable core, easy to fork.

CI License PyPI

Why this exists

Filesystem-based agent harnesses are simple but powerful: easily auditable, flexible, and they work just as well for non-coding business tasks like research over a corpus, workflow automation, or multi-step analysis. But the harnesses that provide filesystem primitives are either coding agents (Claude Agent SDK) or are massive and highly abstracted (deepagents, Agno). Even if you don't want filesystem tools, the general-purpose agent harness libraries are missing features (see table below) — or large enough that it's a pain when you (inevitably) need to customize.

So I built one. The core agent loop isn't that complicated. Provider call, parse tool calls, run them, feed results back, repeat. ThinHarness is 4,938 lines of Python across 15 files. The whole thing. Small enough to actually read. You can audit it. You can fork it without inheriting a fork-maintenance problem, because there isn't much there to drift.

Library LOC1 Tool
retries2
Subagents Structured
output
Skills FS
tools
OTel
tracing
ThinHarness 4,938
 Claude Agent SDK3 8,202 ⚠️
 smolagents 10,091
 deepagents4 15,369
 AWS Strands 25,494 ⚠️
 Microsoft
Agent Framework
34,751
 Pydantic AI 51,231
 Google ADK 57,392 ⚠️
 OpenAI Agents SDK 72,410
 Agno 106,852 ⚠️

* Table focuses on features that differentiate the harnesses. All listed also support MCP, lifecycle hooks, and multi-turn conversations.

1. LOC excludes anything that is not the core agent harness framework. See raw README source comments for exact commands.
2. Tool retries: a documented primitive (e.g. Pydantic AI's ModelRetry) that lets tools signal "model passed bad args — retry with this feedback," distinct from generic exception propagation.
3. Claude Agent SDK shells out to the Claude Code CLI binary, which is 200k+ LOC.
4. deepagents is a thin wrapper over LangChain/LangGraph; effective import surface is ≈105k LOC.


See docs/table.md for per-cell rationale and how the LOC numbers are measured.

Opinions

ThinHarness has opinions. They are the reason it stays small.

No bash. Business agents don't need a shell. Bash is a giant security surface, and agents mess up when writing shell commands more often than you'd initially expect. Cut it and most of those failures stop being possible.

Skills are tools, not auto-discovery. Skills live in directories you point at explicitly. The agent calls skill_read and skill_run like any other tool. No interactive scan of the workspace, no global skill marketplace, no magic. SDK use is deliberate; the auto-discovery design is for interactive coding agents and doesn't belong here.

Search is a top priority. The search tool is a Python port of pgr's ranking; pgr built benchmarks for agentic search and came up with a great way of exposing ripgrep to agents without raw bash. There's also a jsonl_search variant, because JSONL is the right shape when you're replacing RAG with agent-driven search over structured data (line-delimited, naturally chunked, jq + rg).

Parallel LLM calls, built in. Fan out from inside the harness when a workflow needs reliability beyond a single agent loop — majority vote, ensembled extraction. Set builtin_parallel_llm_model to enable the default parallel_llm tool for plain-text batches; for validated structured output per call, instantiate ParallelLlmTool yourself with output_type (a Pydantic model). Each call is stateless, and large batches can write JSON to output_file.

Three providers, no matrix. ThinHarness ships small provider classes for OpenAI, Anthropic, and OpenRouter. If your gateway speaks one of those protocols, you swap a base URL and move on. If not, the provider classes are small enough to fork or replace, and ignoring the bundled ones costs you nothing

No compaction. Compaction is a workaround for context windows filling up across long, accumulating runs — useful for interactive coding sessions that sprawl over hours. For SDK-based business agents, the right answer to "context is getting big" is almost always better task decomposition: shorter runs, separate harness instances, narrower subagents.

No deployment layer. Agents still need serving, auth, storage, retries, and observability in production. ThinHarness does not try to own that stack. A bundled deployment layer might work for some teams, but it will miss plenty of real production shapes; instead of adding more code and more options, ThinHarness stays an SDK and lets the host application own deployment.

Install

uv add thinharness     # or pip install thinharness

Requires Python 3.11+.

Use

import asyncio
from thinharness import Harness, HarnessConfig

async def main():
    async with Harness(HarnessConfig(root=".", model="openai:gpt-5.2")) as harness:
        result = await harness.run("Read README.md and summarize it.")
        print(result.text)

asyncio.run(main())

There's a synchronous wrapper (Harness(...).run_sync(...)), Pydantic-typed structured output, lifecycle hooks, subagents, and path-scoped FS tools. The whole library is 15 files; the loop you care about is in thinharness/core.py and the tools live in thinharness/tools/. Reading those files is faster than reading the docs would be.

Features

  • Filesystem tools: read, write, edit, search, list, glob, and jsonl_search with root-scoped path policies.
  • Structured output: Pydantic-validated results with native, tool, prompted, and text modes.
  • Hooks: lifecycle and tool-call interception for prompt submission, tool calls, subagents, limits, and run boundaries.
  • Subagents: opt-in delegation through a built-in subagent tool and explicit SubAgentConfig.
  • Parallel LLM: opt-in parallel_llm fan-out for batches of independent one-shot prompts, plus ParallelLlmTool(...).spec() for renameable tools with explicit model, path, prompt, and retry settings.
  • Resume: clean new-turn continuation through opaque provider session state.
  • MCP: optional MCP client support with lazy tool discovery and collision checks.
  • Parallel tool calls: same-turn tool batches run concurrently when every called tool is parallel-safe.
  • Tool retries and limit notices: retryable argument/model mistakes use ModelRetry; near-limit guidance can warn the model before configured request or tool-call budgets are exhausted. Notices are harness-owned model input, not hooks or configurable callbacks. Parent and child runs compute notices from their own local budgets.
  • Tracing: OpenTelemetry-compatible spans for runs, provider calls, tools, and subagents.

Status

Pre-1.0. APIs may shift, but I don't expect dramatic changes. Forking is a real option, not just a theoretical one: the codebase is small enough that pulling upstream changes into your fork by hand stays cheap. Each major feature (MCP, subagents, jsonl_search, parallel_llm, skills) lives in its own file with no hidden dependencies. If you don't use one, that's even less code to worry about. If you want to delete it entirely, that's a one-shot 10-word prompt to a coding agent.

License

MIT. Search ranking adapted from pgr; see docs/THIRD_PARTY_NOTICES.md.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

thinharness-0.1.0.tar.gz (125.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

thinharness-0.1.0-py3-none-any.whl (67.2 kB view details)

Uploaded Python 3

File details

Details for the file thinharness-0.1.0.tar.gz.

File metadata

  • Download URL: thinharness-0.1.0.tar.gz
  • Upload date:
  • Size: 125.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for thinharness-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6b2240f89f774a2123e0e3affdb12cd5d946b4a34f417ac12ab310644e4a25bd
MD5 36288e5db36dca1633f829ac529b13de
BLAKE2b-256 0e50f311e158a73c23afa2c5789a54413ccdd663f34c725605a66d34007cd966

See more details on using hashes here.

Provenance

The following attestation bundles were made for thinharness-0.1.0.tar.gz:

Publisher: release.yml on ryanbbrown/thinharness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file thinharness-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: thinharness-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 67.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for thinharness-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 14e584f3c19c75cc818f137bbfbe200f215b7abaf65cb9cb77b47256d999cabf
MD5 6d3f60bfd13e102e86081d09ada9b93e
BLAKE2b-256 a08628d733f58d0bd8af667cd039ada12302b703c2249734b02017eddb548a45

See more details on using hashes here.

Provenance

The following attestation bundles were made for thinharness-0.1.0-py3-none-any.whl:

Publisher: release.yml on ryanbbrown/thinharness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page