🔧 HarnessKit

Fuzzy edit tool for LLM coding agents — never fail a str_replace again.

License: MIT · Python 3.8+ · Zero Dependencies


The Problem

Every LLM coding agent has the same Achilles' heel: edit application.

When Claude, GPT, or any model tries to modify code, it generates an old_text/new_text pair. The tool then does an exact string match to find where to apply the change. And it fails. A lot.

  • Whitespace differences — the model adds a space, drops a tab, or normalizes indentation
  • Minor hallucinations — a variable name is slightly off, a comment is paraphrased
  • Format fragility — diffs, patches, and line-number schemes all break in different ways

The result? Up to 50% edit failure rates on non-native models. Every failed edit wastes a tool call, burns tokens on retries, and breaks agent flow.

The Solution

HarnessKit (hk) is a drop-in edit tool that fuzzy-matches the old text before replacing it. It uses a 4-stage matching cascade:

  1. Exact match — zero overhead when the model is precise
  2. Normalized whitespace — catches the most common failure mode
  3. Sequence matching: difflib.SequenceMatcher with a configurable threshold (default 0.8)
  4. Line-by-line fuzzy — finds the best contiguous block match for heavily drifted edits

Every edit returns a confidence score and match type, so your agent knows exactly how the edit was resolved.
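To make the cascade concrete, here is a minimal sketch of how four stages like these could be chained with the standard library. It is not HarnessKit's actual code: the function name find_match, the line-window approach, the "exact" and "line_fuzzy" labels, and the use of the SequenceMatcher ratio as the confidence score are all assumptions made for illustration.

```python
import difflib
import re


def _squash(text):
    """Collapse each run of whitespace to a single space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def _ratio(a, b):
    """Similarity in [0, 1] as reported by difflib.SequenceMatcher."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def find_match(source, old_text, threshold=0.8):
    """Locate old_text in source (hypothetical helper, not the hk API).

    Returns (start, end, match_type, confidence) with 0-based line indices
    and an exclusive end, or None when no stage reaches the threshold.
    """
    src = source.splitlines()
    old = old_text.splitlines()
    n = len(old)
    if n == 0 or n > len(src):
        return None
    # Every contiguous block of n lines is a candidate location.
    windows = [(i, src[i:i + n]) for i in range(len(src) - n + 1)]

    # Stage 1: exact match on whole lines.
    for i, win in windows:
        if win == old:
            return (i, i + n, "exact", 1.0)

    # Stage 2: equal once whitespace is normalized.
    for i, win in windows:
        if _squash("\n".join(win)) == _squash(old_text):
            return (i, i + n, "whitespace", _ratio("\n".join(win), old_text))

    # Stage 3 scores the whole block; stage 4 averages per-line scores
    # on stripped lines, which tolerates heavy indentation drift.
    def block_score(win):
        return _ratio("\n".join(win), "\n".join(old))

    def line_score(win):
        return sum(_ratio(a.strip(), b.strip()) for a, b in zip(win, old)) / n

    for label, score_fn in (("fuzzy", block_score), ("line_fuzzy", line_score)):
        score, i = max((score_fn(win), i) for i, win in windows)
        if score >= threshold:
            return (i, i + n, label, score)
    return None
```

This sketch only compares windows with the same number of lines as old_text and does not detect ambiguous matches, so treat it as a reading aid rather than a replacement for the tool.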

Quick Start

pip install harnesskit

Or just copy hk.py into your project — it's a single file, stdlib only.

CLI Usage

# Direct arguments
hk apply --file app.py --old "def hello():\n    print('hi')" --new "def hello():\n    print('hello world')"

# JSON from stdin (perfect for tool_use integration)
echo '{"file": "app.py", "old_text": "def hello():", "new_text": "def greet():"}' | hk apply --stdin

# From a JSON file
hk apply --edit changes.json

# Dry run — see what would change without writing
hk apply --file app.py --old "..." --new "..." --dry-run

JSON Edit Format

{
  "file": "path/to/file.py",
  "old_text": "def hello():\n    print('hi')",
  "new_text": "def hello():\n    print('hello world')"
}

Batch multiple edits:

{
  "edits": [
    {"file": "a.py", "old_text": "...", "new_text": "..."},
    {"file": "b.py", "old_text": "...", "new_text": "..."}
  ]
}
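Hand-escaping newlines and quotes in these payloads is error-prone, so it can help to generate them. The sketch below builds a batch document from Python tuples; build_batch is a hypothetical helper, not part of HarnessKit.

```python
import json


def build_batch(edits):
    """Serialize (file, old_text, new_text) triples into the batch format."""
    payload = {
        "edits": [
            {"file": path, "old_text": old, "new_text": new}
            for path, old, new in edits
        ]
    }
    # json.dumps escapes embedded newlines and quotes for us.
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    print(build_batch([
        ("a.py", "def hello():\n    print('hi')", "def hello():\n    print('hello world')"),
    ]))
```

The output could be saved as changes.json and passed with --edit, assuming that flag accepts the batch form as well as a single edit.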

Output

{
  "status": "applied",
  "file": "app.py",
  "match_type": "fuzzy",
  "confidence": 0.92,
  "matched_text": "def hello():\n    print( 'hi' )"
}
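An agent can use these fields to decide how much to trust an edit. The following is one possible policy, not something HarnessKit prescribes: the triage function, its labels, and the 0.9 cut-off are assumptions chosen for illustration.

```python
def triage(result, min_confidence=0.9):
    """Classify an hk result dict as 'accept', 'review', or 'retry'."""
    if result.get("status") != "applied":
        # Nothing was written, so the agent should try again.
        return "retry"
    if result.get("confidence", 0.0) >= min_confidence:
        return "accept"
    # Applied, but the match was loose: show matched_text to the model.
    return "review"


if __name__ == "__main__":
    sample = {
        "status": "applied",
        "file": "app.py",
        "match_type": "fuzzy",
        "confidence": 0.92,
        "matched_text": "def hello():\n    print( 'hi' )",
    }
    print(triage(sample))
```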

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Edit applied successfully |
| 1 | No match found |
| 2 | Ambiguous (multiple matches) |
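A wrapper script can branch on these codes without parsing the output. The sketch below maps each documented code to a suggested follow-up; the HkExit enum, the next_step function, and the suggested actions are illustrative assumptions, and only the three numeric codes come from the table.

```python
import enum


class HkExit(enum.IntEnum):
    """Exit codes documented for the hk CLI."""
    APPLIED = 0
    NO_MATCH = 1
    AMBIGUOUS = 2


def next_step(returncode):
    """Suggest a follow-up action for an hk exit code."""
    try:
        code = HkExit(returncode)
    except ValueError:
        # Anything outside the documented set is treated as unexpected.
        return "unexpected exit code; inspect stderr"
    if code is HkExit.APPLIED:
        return "continue"
    if code is HkExit.NO_MATCH:
        return "re-read the file and regenerate old_text"
    return "add surrounding lines to old_text so it matches one location"
```

With subprocess.run, the value to pass in is the returncode attribute of the completed process.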

MCP Server

HarnessKit ships an MCP (Model Context Protocol) server for plug-and-play integration with any MCP-compatible agent.

Quick Start

Add to your MCP client config (e.g. Claude Desktop or Cursor):

{
  "mcpServers": {
    "harnesskit": {
      "command": "python3",
      "args": ["/path/to/hk_mcp.py"]
    }
  }
}

Tools

| Tool | Description |
| --- | --- |
| harnesskit_apply | Apply a fuzzy edit to a file |
| harnesskit_apply_batch | Apply multiple edits in one call |
| harnesskit_match | Preview the match without modifying (dry run) |

Each tool returns the match type, confidence score, and matched text — giving the agent full visibility into how the edit was resolved.

Example

{
  "name": "harnesskit_apply",
  "arguments": {
    "file": "app.py",
    "old_text": "def hello():\n    print('hi')",
    "new_text": "def hello():\n    print('hello world')",
    "threshold": 0.8
  }
}

Response:

{
  "status": "applied",
  "match_type": "whitespace",
  "confidence": 0.95
}

Integration

HarnessKit is designed to slot into any agent framework as the edit backend:

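import json
import subprocess

def apply_edit(file, old_text, new_text):
    """Apply one edit through the hk CLI and return its JSON result as a dict."""
    payload = json.dumps({"file": file, "old_text": old_text, "new_text": new_text})
    # hk reads the edit from stdin and prints its JSON result on stdout.
    result = subprocess.run(
        ["hk", "apply", "--stdin"],
        input=payload,
        capture_output=True, text=True
    )
    # result.returncode holds the exit code (0 applied, 1 no match, 2 ambiguous).
    return json.loads(result.stdout)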

Or import directly:

from hk import apply_edit

result = apply_edit("app.py", old_text, new_text, threshold=0.8)

Benchmarks

We tested HarnessKit against 45 realistic edit failure scenarios — the kind that break str_replace and apply_patch in production agent workflows.

| Category | Exact Match | HarnessKit | Recovery Rate |
| --- | --- | --- | --- |
| Whitespace (tabs/spaces, trailing, indentation, CRLF, nesting) | 0/11 | 11/11 | 100% |
| Hallucinations (typos, quotes, types, multi-language) | 0/16 | 16/16 | 100% |
| Line Drift (shifted context, extra decorators, renames) | 2/5 | 5/5 | 100% |
| Partial Matches (subset of target) | 2/2 | 2/2 | n/a |
| Real-World (str_replace failures, docstring diffs) | 0/6 | 6/6 | 100% |
| Hard (multi-error combos, brace styles, compression) | 0/5 | 5/5 | 100% |
| Total | 4/45 (9%) | 45/45 (100%) | 100% |

On these 45 scenarios, exact match succeeds 9% of the time (4/45) and HarnessKit succeeds 100% of the time (45/45). All 41 edits that failed under exact match were recovered.
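The headline numbers follow directly from the per-category counts. The short script below redoes that arithmetic; the ROWS dictionary and the summarize function are illustrative names, and the recovered count assumes HarnessKit also passes every scenario that exact match passes.

```python
# (exact-match passes, HarnessKit passes, scenarios) per category, from the table.
ROWS = {
    "Whitespace": (0, 11, 11),
    "Hallucinations": (0, 16, 16),
    "Line Drift": (2, 5, 5),
    "Partial Matches": (2, 2, 2),
    "Real-World": (0, 6, 6),
    "Hard": (0, 5, 5),
}


def summarize(rows):
    """Total the per-category counts and derive the summary figures."""
    exact = sum(e for e, _, _ in rows.values())
    hk = sum(h for _, h, _ in rows.values())
    total = sum(t for _, _, t in rows.values())
    failed = total - exact      # edits that exact match could not apply
    recovered = hk - exact      # of those, how many HarnessKit applied
    return {
        "exact": exact,
        "hk": hk,
        "total": total,
        "exact_pct": round(100 * exact / total),
        "failed": failed,
        "recovered": recovered,
    }


if __name__ == "__main__":
    print(summarize(ROWS))
```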

Run the benchmarks yourself:

python3 benchmarks/benchmark.py

Design Principles

  • Single file, stdlib only — copy it, vendor it, pip install it. No dependency hell.
  • 419 lines of Python — small enough to audit in one sitting
  • Graceful degradation — exact match when possible, fuzzy only when needed
  • Transparent — every result tells you how it matched and how confident it is
  • Model-agnostic — works with any LLM that can produce old/new text pairs

Configuration

| Flag | Default | Description |
| --- | --- | --- |
| --threshold | 0.8 | Minimum similarity score for fuzzy matching |
| --dry-run | false | Preview changes without writing to disk |
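To get a feel for what a given threshold admits, you can score a few representative drifts with difflib directly. This is a rough sketch: similarity and passes are hypothetical helpers, and it assumes the tool compares a SequenceMatcher ratio against the threshold with >=, which the README does not spell out.

```python
import difflib


def similarity(a, b):
    """Return the SequenceMatcher ratio for two strings, in [0, 1]."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def passes(a, b, threshold=0.8):
    """True when the two strings are at least `threshold` similar."""
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    on_disk = "print( 'hi' )"
    from_model = "print('hi')"
    for t in (0.8, 0.95):
        print(t, passes(on_disk, from_model, threshold=t))
```

Raising the threshold rejects looser matches at the cost of more failed edits, so the default of 0.8 is a trade-off rather than a universal setting.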

Development

git clone https://github.com/alexmelges/harnesskit.git
cd harnesskit
python3 -m pytest test_hk.py test_mcp.py -v  # 53 tests

License

MIT — see LICENSE.


Built for the agents that build everything else.
