An MCP server for HTTP traffic analysis with value provenance tracing — token-efficient HAR inspection for AI agents.

hartrace

An MCP server for analyzing HTTP traffic captures (HAR files) — built so an AI agent can answer questions about a capture without reading the raw JSON into its context window.

Its distinguishing feature is value provenance tracing: given any token, cookie, id, or payload field, hartrace reconstructs where the value was produced (which response set it) and where it was consumed (which later requests sent it), as a compact timeline. Every other tool — search, inspection, diffing — is built to return small, structured results with hard size caps, so analysis stays cheap regardless of how large the capture is.

load_har("session.har")
trace_value("session", "<csrf token>")
  → set_by:  response #4 body, JSON path data.csrf
  → used_in: request #9 header X-CSRF, request #9 body field token

Why this exists

HAR files are large, deeply nested, and repetitive. The two common ways an AI ends up analyzing them are both bad: writing throwaway extraction scripts every session, or pasting raw HAR JSON into the context window (slow, expensive, and it overflows on anything real). A 100-entry capture can be several megabytes; a single gzipped response can be hundreds of kilobytes.

hartrace moves the extraction and correlation to the server. Tools return only what was asked for, capped. The questions that normally require reading many entries by hand — where did this auth token come from? which request produced this cookie? where is this id reused? — are answered in one call.

Features

Provenance tracing — trace_value follows any value across the capture (responses → requests), reporting JSON paths for body fields. Works on tokens, cookies, ids, headers, and payload fields alike, not just cookies.
Search toolkit — full regex search across URLs, headers, and request/response bodies; header finder; URL/endpoint finder; query-parameter extraction.
Inspection — per-request and per-response retrieval with base64 + gzip/deflate decoding, nested-JSON unwrapping, binary detection, and size caps.
Lifecycle maps — cookie_map and token_map summarize how cookies and high-entropy secrets flow through a session.
Diffing — compare two captures by (method, url, ordinal) so repeated calls to the same endpoint align.
Loading — from a local path or an http(s) URL (with SSRF protection and a size cap).
Safe by construction — every list/search tool paginates with server-clamped limits; secrets are redacted in inspection output; no tool raises to the transport (errors are returned as structured values).
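
The server-clamped limits mentioned in the last item can be pictured with a short sketch. This is not hartrace's code: `MAX_LIMIT`, `paginate`, and the shape of the returned dict are hypothetical names chosen for illustration, and the real cap may differ.

```python
from typing import Any

MAX_LIMIT = 100  # hypothetical server-side cap; hartrace's real cap may differ


def paginate(rows: list[Any], limit: int = 50, offset: int = 0) -> dict[str, Any]:
    """Return one page of rows, clamping limit and offset to safe bounds."""
    limit = max(1, min(int(limit), MAX_LIMIT))  # an oversized request shrinks to the cap
    offset = max(0, int(offset))                # negative offsets are treated as 0
    page = rows[offset:offset + limit]
    return {
        "total": len(rows),
        "offset": offset,
        "limit": limit,
        "returned": len(page),
        "has_more": offset + len(page) < len(rows),
        "rows": page,
    }


# Asking for a million rows still returns at most MAX_LIMIT of them.
result = paginate(list(range(1000)), limit=1_000_000)
print(result["returned"], result["has_more"])  # 100 True
```

Reporting `total` and `has_more` alongside the page lets a caller decide whether another call is worth its tokens without fetching the rows first.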

Installation

Requires Python 3.10+.

# from PyPI
pipx install hartrace

# or zero-install with uv
uvx hartrace

# or from source
git clone https://github.com/rafsanbasunia/hartrace
cd hartrace
pip install -e .

The only runtime dependency is fastmcp.

Quick start with Claude Desktop

Add the server to claude_desktop_config.json:

{
  "mcpServers": {
    "hartrace": {
      "command": "uvx",
      "args": ["hartrace"]
    }
  }
}

Or, running from source:

{
  "mcpServers": {
    "hartrace": {
      "command": "python",
      "args": ["/absolute/path/to/hartrace/har_mcp.py"]
    }
  }
}

Restart Claude Desktop, then talk to it naturally:

"Load ~/captures/login.har and tell me where the CSRF token comes from."
"Which requests reuse the session cookie?"
"Diff before.har and after.har. What's new?"

Other MCP clients

hartrace is a standard stdio MCP server, so it works in any MCP-capable client — only the config file and wrapper key differ.

Cursor (.cursor/mcp.json) and Windsurf use the same "mcpServers" block shown above.

VS Code (.vscode/mcp.json) uses a "servers" key with an explicit type:

{
  "servers": {
    "hartrace": { "type": "stdio", "command": "uvx", "args": ["hartrace"] }
  }
}

In every case the command/args are identical to the Claude Desktop example.

Tools

hartrace exposes 19 tools. Refer to a loaded capture by the name returned from load_har.

Loading

Tool	Purpose
`list_har_files(dir)`	List `.har` files in a directory to choose from.
`load_har(path)`	Load a capture from a local path. Returns the assigned `name`.
`load_har_url(url)`	Load a capture from an `http(s)` URL (SSRF-guarded, size-capped).
`list_hars()`	List currently loaded captures.
`unload_har(name)`	Drop a capture to free memory.

Inspection

Tool	Purpose
`list_requests(name, filter, …)`	Overview rows: index, method, URL, status, response size.
`get_request(name, index)`	One request, decoded; secrets redacted.
`get_response(name, index, max_chars)`	One response body, decoded (base64/gzip), JSON unwrapped, capped.
`get_headers(name, index)`	Request and response headers for one entry.
`get_query_params(name, index)`	Parsed query string of one request.
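
As a rough illustration of the decoding that get_response describes, the sketch below decodes a HAR `response.content` object (its `text` and `encoding` fields come from the HAR format). The function name `decode_body`, the result keys, and the default cap are assumptions for this example, not hartrace's actual implementation.

```python
import base64
import gzip
import json
import zlib


def decode_body(content: dict, max_chars: int = 2000) -> dict:
    """Decode a HAR response.content object into capped, readable text."""
    text = content.get("text", "")
    if content.get("encoding") == "base64":
        raw = base64.b64decode(text)
    else:
        raw = text.encode("utf-8")

    # Some captures store still-compressed bytes; undo gzip or zlib/deflate if present.
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    else:
        try:
            raw = zlib.decompress(raw)
        except zlib.error:
            pass  # not deflate data; keep the bytes unchanged

    try:
        decoded = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Binary detection: report size only, never dump raw bytes into the result.
        return {"binary": True, "size": len(raw)}

    return {
        "binary": False,
        "size": len(decoded),
        "truncated": len(decoded) > max_chars,
        "text": decoded[:max_chars],
    }


# A gzipped JSON body stored base64-encoded, as some capture tools emit it.
payload = gzip.compress(json.dumps({"data": {"csrf": "abc123"}}).encode("utf-8"))
content = {"encoding": "base64", "text": base64.b64encode(payload).decode("ascii")}
result = decode_body(content)
print(result["text"])  # {"data": {"csrf": "abc123"}}
```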

Search

Tool	Purpose
`search_regex(name, pattern, scope)`	Regex over one scope: `url`, `req_headers`, `resp_headers`, `req_body`, `resp_body`, or `all`.
`find_header(name, header_name)`	Every entry carrying a header, with raw values.
`find_urls(name, pattern)`	Requests whose URL matches a pattern.
`list_endpoints(name, group_by)`	Unique endpoints with call counts.
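
To make the scoped search concrete, here is a minimal sketch of a regex search over HAR entries. The scope names match the table above and the entry fields (`request.url`, `headers`, `postData.text`, `content.text`) follow the HAR format; `search_entries`, its `limit` and `context` parameters, and the hit shape are hypothetical.

```python
import re

SCOPES = ("url", "req_headers", "resp_headers", "req_body", "resp_body")


def scope_text(entry: dict, scope: str) -> str:
    """Return the searchable text of one scope of one HAR entry."""
    req, resp = entry.get("request", {}), entry.get("response", {})
    if scope == "url":
        return req.get("url", "")
    if scope == "req_headers":
        return "\n".join(f'{h["name"]}: {h["value"]}' for h in req.get("headers", []))
    if scope == "resp_headers":
        return "\n".join(f'{h["name"]}: {h["value"]}' for h in resp.get("headers", []))
    if scope == "req_body":
        return req.get("postData", {}).get("text", "")
    return resp.get("content", {}).get("text", "")  # resp_body


def search_entries(entries, pattern, scope="all", limit=20, context=30):
    """Collect up to `limit` hits, each with a short snippet instead of the full text."""
    if scope != "all" and scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope}")  # sketch only; hartrace returns errors as values
    rx = re.compile(pattern)
    scopes = SCOPES if scope == "all" else (scope,)
    hits = []
    for index, entry in enumerate(entries):
        for name in scopes:
            text = scope_text(entry, name)
            for m in rx.finditer(text):
                start = max(0, m.start() - context)
                hits.append({"index": index, "scope": name,
                             "snippet": text[start:m.end() + context]})
                if len(hits) >= limit:
                    return hits
    return hits


entries = [
    {"request": {"url": "https://example.test/login", "headers": []},
     "response": {"headers": [], "content": {"text": '{"data": {"csrf": "tok-12345"}}'}}},
    {"request": {"url": "https://example.test/api",
                 "headers": [{"name": "X-CSRF", "value": "tok-12345"}],
                 "postData": {"text": "token=tok-12345"}},
     "response": {"headers": [], "content": {"text": ""}}},
]
hits = search_entries(entries, r"tok-\d+")
print([(h["index"], h["scope"]) for h in hits])
# [(0, 'resp_body'), (1, 'req_headers'), (1, 'req_body')]
```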

Provenance

Tool	Purpose
`trace_value(name, value)`	Where a value was set vs. used — the full timeline.
`trace_header(name, header_name)`	Resolve a header's value(s) and trace their origin.
`cookie_map(name)`	Every cookie's set/used lifecycle and attributes.
`token_map(name, all_tokens)`	High-entropy secrets and how they propagate.

Comparison

Tool	Purpose
`diff_hars(a, b)`	Requests unique to each capture; matched by `(method, url, ordinal)`.
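
The `(method, url, ordinal)` key can be sketched as follows. The idea is that the ordinal counts how many times the same method and URL have already appeared, so the second call to an endpoint in one capture lines up with the second call in the other. `keyed` and `diff_captures` are hypothetical names, and the real tool's output shape may differ.

```python
from collections import Counter


def keyed(entries: list[dict]) -> dict[tuple[str, str, int], int]:
    """Map (method, url, ordinal) to entry index; ordinal counts repeats of one call."""
    seen: Counter = Counter()
    keys = {}
    for index, entry in enumerate(entries):
        call = (entry["request"]["method"], entry["request"]["url"])
        keys[(*call, seen[call])] = index
        seen[call] += 1
    return keys


def diff_captures(a: list[dict], b: list[dict]) -> dict:
    """Report calls present in only one capture, plus the number of matched calls."""
    ka, kb = keyed(a), keyed(b)
    return {
        "only_in_a": sorted(ka.keys() - kb.keys()),
        "only_in_b": sorted(kb.keys() - ka.keys()),
        "matched": len(ka.keys() & kb.keys()),
    }


def req(method: str, url: str) -> dict:
    """Build a minimal HAR-like entry for the example."""
    return {"request": {"method": method, "url": url}}


before = [req("GET", "https://example.test/items")] * 2
after = [req("GET", "https://example.test/items")] * 3 + [req("POST", "https://example.test/cart")]
result = diff_captures(before, after)
print(result["matched"])    # 2
print(result["only_in_b"])
# [('GET', 'https://example.test/items', 2), ('POST', 'https://example.test/cart', 0)]
```

Without the ordinal, the third GET in the second capture would be indistinguishable from the first two and the extra call would go unreported.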

Every tool's full description — including argument semantics and a worked example — is available to the model through the MCP protocol.

How provenance tracing works

On first trace, hartrace builds a correlation index over the capture (cached for subsequent calls). For a queried value it separates hits into two sides:

set_by — responses that produced the value: a Set-Cookie header, or a response body (with the JSON path when the value sits inside parseable JSON).
used_in — later requests that sent the value: in a request header, a cookie, or a request body field (again with JSON path where applicable).

An empty set_by means the value was supplied by the client rather than produced by any captured response, for example a pre-existing OAuth token. The result also reports origin, the earliest producer, and timeline, the ordered list of entry indices the value touched.

Values shorter than four characters are refused, because short strings match everywhere and the result would be noise rather than signal.
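
The set_by / used_in split can be illustrated with a simplified sketch. It scans entries directly rather than building a cached index, does not enforce that uses come after the producer, and uses hypothetical helper names (`find_paths`, `body_hits`, `trace`); only the field names set_by, used_in, origin, and timeline and the four-character minimum are taken from the description above.

```python
import json


def find_paths(node, value, path=""):
    """Yield dotted JSON paths of every string leaf that contains `value`."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from find_paths(child, value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for i, child in enumerate(node):
            yield from find_paths(child, value, f"{path}[{i}]")
    elif isinstance(node, str) and value in node:
        yield path


def body_hits(text, value):
    """Locations of `value` in a body: JSON paths if parseable, else a plain 'body' hit."""
    if not text or value not in text:
        return []
    try:
        return list(find_paths(json.loads(text), value)) or ["body"]
    except ValueError:  # not JSON
        return ["body"]


def trace(entries, value):
    if len(value) < 4:
        return {"error": "value too short to trace (minimum 4 characters)"}
    set_by, used_in = [], []
    for index, entry in enumerate(entries):
        req, resp = entry["request"], entry["response"]
        for h in req.get("headers", []):  # includes the Cookie header
            if value in h["value"]:
                used_in.append({"index": index, "where": f'header {h["name"]}'})
        for path in body_hits(req.get("postData", {}).get("text", ""), value):
            used_in.append({"index": index, "where": f"body {path}"})
        for h in resp.get("headers", []):
            if h["name"].lower() == "set-cookie" and value in h["value"]:
                set_by.append({"index": index, "where": "Set-Cookie"})
        for path in body_hits(resp.get("content", {}).get("text", ""), value):
            set_by.append({"index": index, "where": f"body {path}"})
    return {
        "set_by": set_by,
        "used_in": used_in,
        "origin": set_by[0]["index"] if set_by else None,
        "timeline": sorted({hit["index"] for hit in set_by + used_in}),
    }


def blank():
    return {"request": {"headers": []}, "response": {"headers": [], "content": {}}}


# Mirror the example from the introduction: set in response #4, used in request #9.
entries = [blank() for _ in range(10)]
entries[4]["response"]["content"]["text"] = json.dumps({"data": {"csrf": "tok-9f3a7c"}})
entries[9]["request"]["headers"].append({"name": "X-CSRF", "value": "tok-9f3a7c"})
entries[9]["request"]["postData"] = {"text": json.dumps({"token": "tok-9f3a7c"})}

result = trace(entries, "tok-9f3a7c")
print(result["origin"], result["timeline"])  # 4 [4, 9]
```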

Design and safety

stdio transport only. The server communicates over stdin/stdout per the MCP spec. All logging is routed to stderr so it cannot corrupt the protocol stream. There is no web UI, port, or background process.
Bounded output. List and search tools paginate with limit/offset, and limits are clamped server-side (a request for a million rows returns the cap, not a million rows). Response bodies are capped by max_chars. These bounds are what make token usage predictable.
Bounded memory. Captures are rejected above 500 MB or 50,000 entries; nested-JSON decoding is depth- and size-limited.
Redaction. Inspection tools mask sensitive header values and high-entropy secrets as <REDACTED len=N>, using a configurable header list plus a generic Shannon-entropy heuristic — not a vendor-specific token shape. Provenance tools correlate on the real value but display only a redacted preview. (find_header intentionally returns raw values, since its purpose is to extract a value to trace.)
SSRF protection. load_har_url refuses non-http(s) schemes and any host resolving to a private, loopback, link-local, reserved, or multicast address (including cloud metadata endpoints), and enforces the size cap on download.
No exceptions across the boundary. Every tool returns a structured {error: "..."} on failure rather than raising, so the agent always receives an actionable message.
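
A sketch of the kind of check the SSRF item describes, using only the standard library. `is_public_address` and `check_url` are hypothetical names, and the error strings are illustrative; the checks themselves (scheme allow-list, then rejecting private, loopback, link-local, reserved, and multicast addresses) follow the list above. Errors are returned as structured values, in line with the last item.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(ip: str) -> bool:
    """True only for addresses that are safe to fetch from a server-side tool."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_reserved or addr.is_multicast or addr.is_unspecified
    )


def check_url(url: str) -> dict:
    """Validate a URL before download; returns {"ok": ...} or {"error": ...}."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return {"error": f"scheme {parts.scheme!r} is not allowed"}
    if not parts.hostname:
        return {"error": "URL has no host"}
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        return {"error": f"cannot resolve host: {exc}"}
    # Every resolved address must be public, not just the first one.
    addresses = sorted({info[4][0] for info in infos})
    blocked = [a for a in addresses if not is_public_address(a)]
    if blocked:
        return {"error": f"host resolves to a non-public address: {blocked[0]}"}
    return {"ok": True, "addresses": addresses}


print(check_url("file:///etc/passwd"))
print(check_url("http://127.0.0.1/capture.har"))
print(check_url("http://169.254.169.254/latest/meta-data/"))  # cloud metadata address
```

A check like this only helps if the download then connects to one of the validated addresses; resolving the hostname a second time at fetch time would reopen the hole through DNS rebinding.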

The server contains no vendor-specific logic. Helpers such as nested-JSON unwrapping are generic and apply to any deeply nested response.
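
Nested-JSON unwrapping of the generic kind described here might look like the sketch below: any string that itself parses as a JSON object or array is replaced by its parsed form, with depth and size limits. `unwrap`, `MAX_DEPTH`, and `MAX_STRING_LEN` are hypothetical names and values, not hartrace's.

```python
import json

MAX_DEPTH = 5             # hypothetical depth limit
MAX_STRING_LEN = 100_000  # hypothetical size limit for strings worth parsing


def unwrap(node, depth=0):
    """Recursively parse strings that themselves contain JSON objects or arrays."""
    if depth >= MAX_DEPTH:
        return node  # stop descending; leave deeper levels untouched
    if isinstance(node, dict):
        return {key: unwrap(value, depth + 1) for key, value in node.items()}
    if isinstance(node, list):
        return [unwrap(value, depth + 1) for value in node]
    if (isinstance(node, str) and len(node) <= MAX_STRING_LEN
            and node.lstrip()[:1] in ("{", "[")):
        try:
            return unwrap(json.loads(node), depth + 1)
        except ValueError:
            return node  # looked like JSON but was not; keep the original string
    return node


# A response whose payload is JSON encoded as a string inside JSON, twice over.
inner = json.dumps({"user": {"id": 42}})
outer = json.dumps({"payload": inner})
wrapped = {"result": outer}

print(unwrap(wrapped))  # {'result': {'payload': {'user': {'id': 42}}}}
```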

Development

pip install -e ".[dev]"
pytest

The suite covers parsing and decoding, redaction and entropy detection, provenance tracing (cookie and non-cookie values, JSON-path resolution), the search toolkit, URL loading and SSRF guards, and the tools driven through the actual FastMCP call path.

Layout:

har_mcp.py      MCP server: tool definitions and the stdio entry point
har_parser.py   Parsing, decoding, redaction, and the in-memory store
provenance.py   Correlation index and the trace / cookie / token tools
search.py       Regex, header, URL, and endpoint search
config.json     Redaction settings (sensitive headers, entropy thresholds)
tests/          pytest suite

Configuration

config.json adjusts redaction without code changes:

{
  "sensitive_headers": ["authorization", "cookie", "set-cookie", "x-csrf-token", "x-api-key"],
  "entropy_min_len": 24,
  "entropy_bits_min": 3.5
}

sensitive_headers are always redacted by name; any other value is redacted if it exceeds entropy_min_len characters and entropy_bits_min bits of Shannon entropy per character.
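
The rule above can be sketched in a few lines. Shannon entropy per character is computed from the value's own character frequencies, H = -Σ p(c) · log2 p(c). The `CONFIG` dict mirrors the example config.json; the function names `shannon_entropy` and `redact_header` are hypothetical, and this is a reading of the rule as stated rather than hartrace's code.

```python
import math
from collections import Counter

CONFIG = {
    "sensitive_headers": ["authorization", "cookie", "set-cookie", "x-csrf-token", "x-api-key"],
    "entropy_min_len": 24,
    "entropy_bits_min": 3.5,
}


def shannon_entropy(value: str) -> float:
    """Bits of Shannon entropy per character, from the value's character frequencies."""
    if not value:
        return 0.0
    n = len(value)
    return -sum((c / n) * math.log2(c / n) for c in Counter(value).values())


def redact_header(name: str, value: str, config: dict = CONFIG) -> str:
    """Mask a header value if its name is sensitive or it looks like a secret."""
    by_name = name.lower() in config["sensitive_headers"]
    by_entropy = (
        len(value) > config["entropy_min_len"]
        and shannon_entropy(value) > config["entropy_bits_min"]
    )
    return f"<REDACTED len={len(value)}>" if by_name or by_entropy else value


print(redact_header("Authorization", "Bearer abc"))                      # <REDACTED len=10>
print(redact_header("X-Request-Id", "9fK2xQ7bLmZ4tR8vWc1YhN6pDs3JgE0a"))  # <REDACTED len=32>
print(redact_header("Accept", "application/json"))                       # application/json
```

The second value is caught by entropy alone: its 32 characters are all distinct, giving 5.0 bits per character, well above the 3.5 threshold. A long but repetitive value such as forty copies of one letter scores 0 bits and passes through unredacted.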

License

MIT. See LICENSE.

This version

0.1.0

Jun 5, 2026

Download files

Source Distribution

hartrace-0.1.0.tar.gz (32.3 kB)

Uploaded Jun 5, 2026 Source

Built Distribution

hartrace-0.1.0-py3-none-any.whl (25.7 kB)

Uploaded Jun 5, 2026 Python 3

File details

Details for the file hartrace-0.1.0.tar.gz.

File metadata

Download URL: hartrace-0.1.0.tar.gz
Upload date: Jun 5, 2026
Size: 32.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for hartrace-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`a7ca5e61f9cb7d56f7325a6ff6da7fc9c81e1b4bc9b16641d8c53b83bc0c7f23`
MD5	`536181a59daa64b3577420ed9664e630`
BLAKE2b-256	`3bd26a93b695bafbf39472697aaa20787fb05c19f2e1750711a215ae8168e79f`

File details

Details for the file hartrace-0.1.0-py3-none-any.whl.

File metadata

Download URL: hartrace-0.1.0-py3-none-any.whl
Upload date: Jun 5, 2026
Size: 25.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for hartrace-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3112f4c5bd68470124cf9be7b5b8939b517a070f4d7ab5ebd499f0c6521401d7`
MD5	`a65182a868dc144bd1239bef02ed023e`
BLAKE2b-256	`f7e6d5182002cd3059517f62bc4b3c00708411de58345f446ba6529ee3284388`
