🔧 HarnessKit

Fuzzy edit tool for LLM coding agents — never fail a str_replace again.

License: MIT | Python 3.8+ | Zero Dependencies


The Problem

Every LLM coding agent has the same Achilles' heel: edit application.

When Claude, GPT, or any other model tries to modify code, it generates an old_text/new_text pair. The tool then does an exact string match to find where to apply the change. And it fails. A lot.

  • Whitespace differences — the model adds a space, drops a tab, or normalizes indentation
  • Minor hallucinations — a variable name is slightly off, a comment is paraphrased
  • Format fragility — diffs, patches, and line-number schemes all break in different ways

The result? Up to 50% edit failure rates on non-native models. Every failed edit wastes a tool call, burns tokens on retries, and breaks agent flow.

The Solution

HarnessKit (hk) is a drop-in edit tool that fuzzy-matches the old text before replacing it. It uses a 4-stage matching cascade:

  1. Exact match: zero overhead when the model is precise
  2. Normalized whitespace: catches the most common failure mode
  3. Sequence matching: difflib.SequenceMatcher with a configurable threshold (default 0.8)
  4. Line-by-line fuzzy: finds the best contiguous block match for heavily drifted edits

Every edit returns a confidence score and match type, so your agent knows exactly how the edit was resolved.
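The cascade above can be sketched roughly as follows. This is an illustrative approximation built on difflib and re from the standard library, not HarnessKit's actual implementation: the function name find_match, the returned tuple shape, and the match-type labels are assumptions made for this example, and stages 3 and 4 are folded into a single sliding-window comparison for brevity.

```python
import difflib
import re


def _normalize(s):
    """Collapse every run of whitespace into a single space."""
    return re.sub(r"\s+", " ", s).strip()


def find_match(source, old_text, threshold=0.8):
    """Return (start, end, match_type, confidence), or None if nothing matches.

    Hypothetical helper for illustration; not HarnessKit's real API.
    """
    # Stage 1: exact substring match.
    idx = source.find(old_text)
    if idx != -1:
        return idx, idx + len(old_text), "exact", 1.0

    # Later stages compare windows of whole lines against old_text.
    lines = source.splitlines(keepends=True)
    n = max(1, len(old_text.splitlines()))
    offsets = [0]
    for line in lines:
        offsets.append(offsets[-1] + len(line))

    target_norm = _normalize(old_text)
    best = None
    for i in range(len(lines) - n + 1):
        start, end = offsets[i], offsets[i + n]
        window = source[start:end]
        if not old_text.endswith("\n"):
            # Leave the window's trailing line break out of the match.
            stripped = window.rstrip("\r\n")
            end -= len(window) - len(stripped)
            window = stripped
        score = difflib.SequenceMatcher(None, window, old_text).ratio()
        # Stage 2: identical once whitespace is normalized.
        if _normalize(window) == target_norm:
            return start, end, "whitespace", score
        # Stages 3 and 4 (folded together): best-scoring contiguous block.
        if best is None or score > best[3]:
            best = (start, end, "fuzzy", score)

    if best is not None and best[3] >= threshold:
        return best
    return None
```

A caller would then splice new_text into source[start:end]. Returning the score alongside the match type mirrors the confidence and match_type fields described above.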

Quick Start

pip install harnesskit

Or just copy hk.py into your project — it's a single file, stdlib only.

CLI Usage

# Direct arguments
hk apply --file app.py --old "def hello():\n    print('hi')" --new "def hello():\n    print('hello world')"

# JSON from stdin (perfect for tool_use integration)
echo '{"file": "app.py", "old_text": "def hello():", "new_text": "def greet():"}' | hk apply --stdin

# From a JSON file
hk apply --edit changes.json

# Dry run — see what would change without writing
hk apply --file app.py --old "..." --new "..." --dry-run

JSON Edit Format

{
  "file": "path/to/file.py",
  "old_text": "def hello():\n    print('hi')",
  "new_text": "def hello():\n    print('hello world')"
}

Batch multiple edits:

{
  "edits": [
    {"file": "a.py", "old_text": "...", "new_text": "..."},
    {"file": "b.py", "old_text": "...", "new_text": "..."}
  ]
}
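Code that accepts both payload shapes shown above could normalize them into one list before processing. The helper name normalize_edits is hypothetical and not part of HarnessKit; it only illustrates the two documented shapes and their three required keys.

```python
import json


def normalize_edits(payload):
    """Return a list of edit dicts from a single-edit or batch payload.

    Hypothetical helper for illustration; not part of HarnessKit.
    """
    edits = payload["edits"] if "edits" in payload else [payload]
    required = {"file", "old_text", "new_text"}
    for edit in edits:
        missing = required - edit.keys()
        if missing:
            raise ValueError(f"edit is missing keys: {sorted(missing)}")
    return list(edits)


single = json.loads('{"file": "a.py", "old_text": "x", "new_text": "y"}')
batch = {"edits": [single, {"file": "b.py", "old_text": "p", "new_text": "q"}]}
print(len(normalize_edits(single)), len(normalize_edits(batch)))
```

Validating the keys up front means a malformed edit is rejected before any file is touched.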

Output

{
  "status": "applied",
  "file": "app.py",
  "match_type": "fuzzy",
  "confidence": 0.92,
  "matched_text": "def hello():\n    print( 'hi' )"
}
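An agent can use these fields to decide whether an applied edit deserves a second look. The sketch below is one possible policy, not something HarnessKit prescribes: the needs_review name and the 0.9 cutoff are assumptions chosen for illustration, while the status and confidence keys come from the output shown above.

```python
def needs_review(result, min_confidence=0.9):
    """Return True when an edit failed or matched with low confidence.

    Hypothetical policy helper; the 0.9 default is an arbitrary example value.
    """
    if result.get("status") != "applied":
        return True
    return result.get("confidence", 0.0) < min_confidence


result = {
    "status": "applied",
    "file": "app.py",
    "match_type": "fuzzy",
    "confidence": 0.92,
}
print(needs_review(result))
```

With the sample result and the default cutoff this reports that no review is needed; raising min_confidence makes the policy stricter.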

Exit Codes

Code | Meaning
0 | Edit applied successfully
1 | No match found
2 | Ambiguous (multiple matches)
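A wrapper script can branch on these codes. The mapping below is a minimal sketch: the label strings and the classify_exit name are made up for this example, and any code outside the table is treated as unknown.

```python
# Labels are illustrative; only the numeric codes come from the table above.
EXIT_MEANINGS = {
    0: "applied",
    1: "no_match",
    2: "ambiguous",
}


def classify_exit(returncode):
    """Map an hk exit code to a short label, or "unknown" if unlisted."""
    return EXIT_MEANINGS.get(returncode, "unknown")


print(classify_exit(2))
```

In practice the input would be the returncode attribute of the object returned by subprocess.run.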

MCP Server

HarnessKit ships an MCP (Model Context Protocol) server for plug-and-play integration with any MCP-compatible agent.

Quick Start

Add to your MCP client config (e.g. Claude Desktop, Cursor, etc.):

{
  "mcpServers": {
    "harnesskit": {
      "command": "python3",
      "args": ["/path/to/hk_mcp.py"]
    }
  }
}

Tools

Tool | Description
harnesskit_apply | Apply a fuzzy edit to a file
harnesskit_apply_batch | Apply multiple edits in one call
harnesskit_match | Preview the match without modifying (dry run)

Each tool returns the match type, confidence score, and matched text — giving the agent full visibility into how the edit was resolved.

Example

{
  "name": "harnesskit_apply",
  "arguments": {
    "file": "app.py",
    "old_text": "def hello():\n    print('hi')",
    "new_text": "def hello():\n    print('hello world')",
    "threshold": 0.8
  }
}

Response:

{
  "status": "applied",
  "match_type": "whitespace",
  "confidence": 0.95
}
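Under the Model Context Protocol, a client invokes a tool by sending a JSON-RPC 2.0 request whose method is tools/call. MCP clients normally build this envelope for you, so the sketch below is only meant to show roughly what the example above looks like on the wire. The helper name build_tool_call and the request id are illustrative.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "harnesskit_apply",
    {
        "file": "app.py",
        "old_text": "def hello():\n    print('hi')",
        "new_text": "def hello():\n    print('hello world')",
        "threshold": 0.8,
    },
)
print(json.dumps(request, indent=2))
```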

Integration

HarnessKit is designed to slot into any agent framework as the edit backend:

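import json
import subprocess

def apply_edit(file, old_text, new_text):
    """Apply one edit through the hk CLI and return its parsed JSON result."""
    payload = json.dumps({"file": file, "old_text": old_text, "new_text": new_text})
    result = subprocess.run(
        ["hk", "apply", "--stdin"],
        input=payload,
        capture_output=True, text=True
    )
    # result.returncode carries the exit code (0 applied, 1 no match, 2 ambiguous).
    return json.loads(result.stdout)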

Or import directly:

from hk import apply_edit

result = apply_edit("app.py", old_text, new_text, threshold=0.8)

Benchmarks

We tested HarnessKit against 45 realistic edit failure scenarios — the kind that break str_replace and apply_patch in production agent workflows.

Category | Exact Match | HarnessKit | Recovery Rate
Whitespace (tabs/spaces, trailing, indentation, CRLF, nesting) | 0/11 | 11/11 | 100%
Hallucinations (typos, quotes, types, multi-language) | 0/16 | 16/16 | 100%
Line Drift (shifted context, extra decorators, renames) | 2/5 | 5/5 | 100%
Partial Matches (subset of target) | 2/2 | 2/2 | n/a
Real-World (str_replace failures, docstring diffs) | 0/6 | 6/6 | 100%
Hard (multi-error combos, brace styles, compression) | 0/5 | 5/5 | 100%
Total | 4/45 (9%) | 45/45 (100%) | 100%

On this suite, exact match succeeds on 4 of 45 cases (9%), while HarnessKit succeeds on all 45, recovering all 41 edits that exact matching failed.
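The totals follow directly from the per-category rows. The short script below re-derives them from the numbers in the table; it is only a consistency check on the published figures, not part of the benchmark suite.

```python
# (exact-match successes, HarnessKit successes, cases) per category,
# copied from the benchmark table above.
rows = {
    "Whitespace": (0, 11, 11),
    "Hallucinations": (0, 16, 16),
    "Line Drift": (2, 5, 5),
    "Partial Matches": (2, 2, 2),
    "Real-World": (0, 6, 6),
    "Hard": (0, 5, 5),
}

exact = sum(r[0] for r in rows.values())
harnesskit = sum(r[1] for r in rows.values())
total = sum(r[2] for r in rows.values())
failed_exact = total - exact            # edits exact matching could not apply
recovered = harnesskit - exact          # of those, how many HarnessKit applied

print(f"exact: {exact}/{total} ({round(100 * exact / total)}%)")
print(f"recovered: {recovered}/{failed_exact}")
```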

Run the benchmarks yourself:

python3 benchmarks/benchmark.py

Design Principles

  • Single file, stdlib only — copy it, vendor it, pip install it. No dependency hell.
  • 419 lines of Python — small enough to audit in one sitting
  • Graceful degradation — exact match when possible, fuzzy only when needed
  • Transparent — every result tells you how it matched and how confident it is
  • Model-agnostic — works with any LLM that can produce old/new text pairs

Configuration

Flag | Default | Description
--threshold | 0.8 | Minimum similarity score for fuzzy matching
--dry-run | false | Preview changes without writing to disk
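To get a feel for what a 0.8 threshold means, the sketch below scores two pairs of strings with difflib.SequenceMatcher.ratio(), the standard-library measure named in the matching cascade. It assumes the threshold is compared directly against that ratio, which may differ from how HarnessKit computes its confidence score. The helper names similarity and passes are hypothetical.

```python
import difflib


def similarity(a, b):
    """Return difflib's similarity ratio for two strings (0.0 to 1.0)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def passes(a, b, threshold=0.8):
    """Return True when the two strings are at least `threshold` similar."""
    return similarity(a, b) >= threshold


# Same code with extra spaces inside the parentheses: clears the default bar.
print(passes("print('hi')", "print( 'hi' )"))

# Unrelated lines: falls well short of it.
print(passes("def hello():", "class Foo:"))
```

Raising the threshold toward 1.0 makes matching stricter and closer to exact; lowering it tolerates more drift at the cost of a higher risk of matching the wrong block.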

Development

git clone https://github.com/alexmelges/harnesskit.git
cd harnesskit
python3 -m pytest test_hk.py test_mcp.py -v  # 53 tests

License

MIT — see LICENSE.


Built for the agents that build everything else.
