Local MCP server that makes performance-profiler output (NVIDIA ncu, AMD rocprof, Linux perf, Apple Metal) token-efficient for LLM coding agents.

perfdigest

Local MCP server that makes performance-profiler output token-efficient for LLM coding agents. It sits between the profiler and the agent: reads the report from disk and returns a small, structured, numeric digest the agent can act on, while keeping a pointer back to the raw report for lazy expansion.

It is a translator/router, not a judge — interpretation ("is this kernel memory-bound?") is the model's job. perfdigest only provides efficient, deterministic access to clean numeric metrics across vendors and languages.
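A minimal sketch of the digest-then-expand idea described above, not perfdigest's actual implementation. The flat CSV layout, the in-memory `RAW_REPORT` string (the real tool reads from disk), and the function signatures are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical flat export: one row per kernel, one column per metric.
RAW_REPORT = """kernel,duration_us,dram_pct_peak,compute_pct_peak,l2_hit_rate
gemm,120.5,71.5,38.0,64.2
reduce,14.0,88.1,12.3,40.0
"""


def _rows(report_text):
    """Parse the hypothetical CSV export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(report_text)))


def digest(report_text, report_ref, kernel, core=("duration_us", "dram_pct_peak")):
    """Return only the core metrics plus a pointer back to the raw report."""
    for row in _rows(report_text):
        if row["kernel"] == kernel:
            return {
                "kernel": kernel,
                "report_ref": report_ref,  # lets the agent expand later
                "metrics": {name: float(row[name]) for name in core},
            }
    raise KeyError(kernel)


def expand(report_text, kernel):
    """Return every raw column for one kernel, only when asked."""
    for row in _rows(report_text):
        if row["kernel"] == kernel:
            return dict(row)
    raise KeyError(kernel)
```

The agent pays for the small `digest` result on every call and for the full `expand` row only when the digest is not enough.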

Two operations — keep them separate

perfdigest deliberately splits what an agent does with a profiler into two tiers:

  1. Digest (read a report) — universal, available on every machine. A report's origin is irrelevant to digesting it. An NVIDIA .ncu-rep (or its CSV export) captured on a CI/remote GPU runner can be pulled to a Mac and digested there — that is how you do CUDA work on a Mac via CI. Tier-1 is never gated by local hardware, only by whether a reader's dependency imports.
  2. Capture (produce a report) — platform-verified. You cannot run ncu on a Mac, Metal on Linux, or hardware PMU counters under WSL2. platform_capabilities and suggest_profile_command gate this so the agent never spends context on a capture that can't run here, and redirect it to "capture elsewhere, digest here."
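The capture gate can be pictured roughly as follows. This is a sketch under stated assumptions: the `CAPTURE_OS` dict, `can_capture_here`, and `suggest_capture` are hypothetical names, the OS sets are copied from the Backends table in this README, and finer checks such as hardware PMU counters under WSL2 are not modeled. Only the `ncu` command line comes from this document.

```python
import platform

# Capture OS per backend, copied from the Backends table.
# The dict and all function names here are hypothetical.
CAPTURE_OS = {
    "nsight": {"Linux", "Windows"},
    "rocm": {"Linux", "Windows"},
    "linux_perf": {"Linux"},
    "metal": {"Darwin"},  # platform.system() reports macOS as "Darwin"
}


def can_capture_here(backend, system=None):
    """True if the backend's profiler can run on this OS."""
    system = system or platform.system()
    return system in CAPTURE_OS.get(backend, set())


def suggest_capture(backend, target, system=None):
    """Return an invocation, or a refusal that redirects to tier 1."""
    if not can_capture_here(backend, system):
        return {"ok": False, "advice": "capture elsewhere, digest here"}
    if backend == "nsight":
        # Command line taken from the invariants section of this README.
        argv = ["ncu", "--set", "full", "-o", "report.ncu-rep", target]
        return {"ok": True, "argv": argv}
    return {"ok": True, "argv": None}  # other invocations are not sketched


print(suggest_capture("nsight", "./app", system="Darwin"))
```

Passing `system` explicitly makes the gate easy to exercise for a machine other than the one running the code.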

Backends (v1.0.0)

| Backend | Format | Domain | Capture tool | Capture OS | Digest anywhere |
|---|---|---|---|---|---|
| nsight | ncu-rep | gpu_kernel | NVIDIA ncu | Linux, Windows | needs ncu-report wheel |
| nsight_csv | ncu-csv | gpu_kernel | (export of ncu) | | ✅ pure Python |
| rocm | rocprof-csv | gpu_kernel | AMD rocprof | Linux, Windows | ✅ pure Python |
| linux_perf | perf-stat-json, perf-report | cpu_function | Linux perf | Linux | ✅ pure Python |
| metal | metal-trace | gpu_pass | Apple xctrace | macOS | ✅ pure Python |

GPU backends share one vocabulary (compute_pct_peak, dram_pct_peak, l2_hit_rate, achieved_occupancy, …); the CPU backend introduces a CPU vocabulary (ipc, cache_miss_rate, llc_miss_rate, branch_mispredict_rate, self_pct, …). Future: Go, Java, and other perf-critical runtimes.

Two load-bearing invariants (read before running)

  1. Suppress profiler stdout: write to a file, e.g. ncu --set full -o report.ncu-rep ./app. If the profiler prints its summary to stdout, that raw table enters the agent's context before perfdigest runs, defeating the entire purpose.
  2. None means "not measured in this export", NEVER zero. A metric the export does not contain is returned as not_available_in_this_export, not a fake 0.0. A genuine 0.0 (e.g. zero branch divergence) is preserved. Silently returning zero = lying to the model = the worst bug this tool can have.
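The second invariant comes down to testing for absence explicitly rather than for falsiness. A minimal sketch, assuming a plain dict of already-parsed metrics; `build_digest` and the metric name `branch_divergence` are hypothetical, while the sentinel string is the one named above.

```python
NOT_AVAILABLE = "not_available_in_this_export"


def build_digest(raw, wanted):
    """Map each wanted metric to its value, or to the sentinel if absent."""
    digest = {}
    for name in wanted:
        value = raw.get(name)
        # Compare against None explicitly: `if not value` would also
        # swallow a genuine 0.0 and report it as missing.
        digest[name] = NOT_AVAILABLE if value is None else float(value)
    return digest


# Hypothetical parsed export: l2_hit_rate was not collected.
raw = {"dram_pct_peak": 71.5, "branch_divergence": 0.0}
digest = build_digest(raw, ["dram_pct_peak", "branch_divergence", "l2_hit_rate"])
print(digest)
```

Here the measured 0.0 survives as a number, and the metric that was never collected is reported as the sentinel instead of a default.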

Tools

Tier 1 — digest (any backend, any host):

  • list_kernels(report_ref, format) → [{name, index, duration_us, domain}]
  • get_metrics(report_ref, format, kernel, metrics=None) → compact digest (metrics=None → the backend's default core set)
  • expand(report_ref, format, kernel, section) → raw vendor metrics (the safety valve)

Tier 2 — capture advisory (platform-verified):

  • platform_capabilities() → machine identity + can_digest (universal) vs can_capture_here (gated)
  • suggest_profile_command(backend, target) → the correct, platform-aware invocation, or a refusal that redirects to the tier-1 path

format is mandatory — the agent passes what it produced; a path says where a file is, not what format it is.
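One way to picture the mandatory format argument is a registry keyed by the format string that never inspects the file name. This is a sketch only: `READERS`, `open_report`, and the placeholder readers are hypothetical, and the two format names are taken from the Backends table.

```python
def _read_ncu_csv(path):
    """Placeholder reader; a real one would parse the file at `path`."""
    return {"reader": "ncu-csv", "report_ref": path}


def _read_rocprof_csv(path):
    """Placeholder reader for the second CSV format."""
    return {"reader": "rocprof-csv", "report_ref": path}


# Hypothetical registry: the explicit format string is the only dispatch key.
READERS = {
    "ncu-csv": _read_ncu_csv,
    "rocprof-csv": _read_rocprof_csv,
}


def open_report(report_ref, fmt):
    """Pick a reader from the declared format, ignoring the path suffix."""
    try:
        reader = READERS[fmt]
    except KeyError:
        known = ", ".join(sorted(READERS))
        raise ValueError(f"unknown format {fmt!r}; expected one of: {known}") from None
    return reader(report_ref)


# Both exports end in .csv, so the suffix alone could not tell them apart.
print(open_report("run1.csv", "rocprof-csv"))
```

Two CSV exports from different vendors share an extension, which is why the path alone cannot stand in for the format.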

Install & connect

uvx perfdigest-mcp                      # run from PyPI (downloadable)
uv tool install "perfdigest-mcp[nvidia]"  # + NVIDIA native binary reader (Linux/Windows)

PyPI/install name is perfdigest-mcp; the command and import package are perfdigest (e.g. uvx perfdigest-mcp, import perfdigest).

Claude Code and OpenAI Codex setup (both stdio MCP): see docs/clients.md.

Status

v1.0.0 — multi-backend registry; NVIDIA (native + CSV), AMD HIP, Linux perf (C++/Rust), and Apple Metal adapters; platform capability gating; cross-client config. Validated on the Linux/macOS/Windows CI matrix (pure-Python readers run hardware-free against committed fixtures). Real binary-capture tests are fixture-gated and skip without the device.

Related / similar projects

perfdigest was authored independently; these adjacent MCP servers occupy a nearby space and likely work well for their narrower scope. The differences are the reason perfdigest exists:

  • nsys-mcp (NVIDIA Nsight Systems) — profiles binaries and aggregates trace timeline stats. perfdigest targets Nsight Compute per-kernel counters, is read-only (does not run the profiler), and spans multiple vendors.
  • pprof-analyzer-mcp / Profiler-MCP — Go (and Go/Python/Java) CPU/memory profiles, often rendering flamegraphs. perfdigest is a numeric digester focused on token/context efficiency rather than visualization.

What is distinct here: the multi-backend matrix (NVIDIA + AMD HIP + CPU perf + Metal under one neutral contract), the token-efficiency thesis with the None ≠ 0.0 honesty rule, and the read/capture split with platform capability gating (digest anywhere, capture only where supported).

License

Apache-2.0 — see LICENSE and NOTICE.
