Performance regression monitor for ML inference projects

perf-guard

perf-guard is a zero-intrusion command-line tool that turns the result directories of your existing benchmark / eval scripts into a versioned performance history. Record snapshots, compare against one or many baselines, and plot long-term trends.

Highlights

  • Zero intrusion. perf-guard never runs your benchmarks; it only parses the output files they already produce.
  • Config-driven. All metric extraction is declared in perf_guard.yaml.
  • Multi-baseline compare. Compare the current run against baseline, last_week, main, or any tag in one command.
  • Trend charts. Render the whole .perf_history/ as an interactive HTML chart, or print ASCII sparklines in the terminal.
  • CI-ready. Exits with code 1 on regression so pipelines fail automatically.

Install

pip install perf-guard

Quick start

  1. Drop a perf_guard.yaml at the project root:

    results_dir: ./results
    
    metrics:
      - name: total_eval_time
        file: summary.txt
        pattern: "Eval time:\\s+([\\d.]+)s"
        threshold_pct: 5
        direction: lower_is_better
    
      - name: pusht_success_rate
        file: pusht/eval.log
        pattern: "success_rate=([\\d.]+)%"
        threshold_pct: 2
        direction: higher_is_better
    
  2. Record your reference run and tag it:

    perf-guard record results/20260421_054042 --tag baseline
    
  3. After your next run, compare and detect regressions:

    perf-guard compare                               # latest vs baseline
    perf-guard compare --base baseline --base last_week
    perf-guard compare --base main --current latest
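
To make the config in step 1 concrete, here is a minimal sketch of how a pattern entry could turn a line of benchmark output into a number. It assumes the pattern is applied with Python's re.search and that the first capture group holds the metric value; the helper name extract_metric, the METRICS list, and the sample strings are hypothetical and are not part of perf-guard's documented API.

```python
import re

# Hypothetical in-memory mirror of the two metric entries in perf_guard.yaml.
# In YAML double quotes "\\s" is an escaped backslash, so the regex is \s.
METRICS = [
    {"name": "total_eval_time", "pattern": r"Eval time:\s+([\d.]+)s"},
    {"name": "pusht_success_rate", "pattern": r"success_rate=([\d.]+)%"},
]


def extract_metric(text, pattern):
    """Return the first capture group as a float, or None if nothing matches."""
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None


# Made-up sample output, standing in for summary.txt and pusht/eval.log.
sample_summary = "Eval time:   12.5s\n"
sample_log = "step 100 success_rate=87.5%\n"

eval_time = extract_metric(sample_summary, METRICS[0]["pattern"])
success_rate = extract_metric(sample_log, METRICS[1]["pattern"])
```

A pattern that matches nothing yields None in this sketch; how perf-guard itself reports a missing metric is not described here.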
    

Commands

Command                                                      Purpose
perf-guard record <dir> [--tag …]                            Extract metrics from a result dir and store a snapshot.
perf-guard compare [--base …] [--current …]                  Compare current against one or more baselines.
perf-guard list                                              List all recorded snapshots.
perf-guard report <ref>                                      Show full metrics for one snapshot.
perf-guard trend [--ref …] [--metric …] [-o path] [--ascii]  Plot metric trends across history.
perf-guard install-hook                                      Install a git post-commit hook.

Ref resolution

Anywhere a <ref> is accepted, the following are valid:

  • latest — the most recently recorded snapshot
  • A user tag (created with record --tag)
  • A dirname (e.g. 20260421_054042)
  • yesterday, last_week, last_month — closest snapshot to that time
  • <N>d_ago or <N>_days_ago — closest snapshot to N days ago
  • latest~<N> — N records before the latest
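
The rules above can be sketched as a small resolver. This is an illustration under stated assumptions, not perf-guard's actual implementation: the snapshot dict layout, the function name resolve_ref, checking tags and dirnames before the special forms, and treating last_month as 30 days are all hypothetical choices.

```python
import re
from datetime import datetime, timedelta

# Named time refs in days; last_month = 30 days is an assumption.
NAMED_OFFSETS = {"yesterday": 1, "last_week": 7, "last_month": 30}


def resolve_ref(ref, snapshots, now):
    """Resolve a ref against snapshots (list of dicts, oldest first).

    Each snapshot dict has 'dirname', 'time' (datetime) and 'tags' (list).
    """
    if not snapshots:
        raise LookupError("no snapshots recorded")
    # Exact tag or dirname match first (this precedence is an assumption).
    for snap in reversed(snapshots):
        if ref in snap["tags"] or ref == snap["dirname"]:
            return snap
    if ref == "latest":
        return snapshots[-1]
    # latest~N: N records before the latest.
    m = re.fullmatch(r"latest~(\d+)", ref)
    if m:
        index = len(snapshots) - 1 - int(m.group(1))
        if index < 0:
            raise LookupError(f"ref not found: {ref}")
        return snapshots[index]
    # <N>d_ago, <N>_days_ago, or a named offset: closest snapshot in time.
    m = re.fullmatch(r"(\d+)(?:d_ago|_days_ago)", ref)
    days = int(m.group(1)) if m else NAMED_OFFSETS.get(ref)
    if days is None:
        raise LookupError(f"ref not found: {ref}")
    target = now - timedelta(days=days)
    return min(snapshots, key=lambda s: abs(s["time"] - target))


# Made-up history for illustration.
now = datetime(2026, 4, 21, 12, 0)
snapshots = [
    {"dirname": "20260414_090000", "time": datetime(2026, 4, 14, 9, 0), "tags": ["baseline"]},
    {"dirname": "20260420_100000", "time": datetime(2026, 4, 20, 10, 0), "tags": []},
    {"dirname": "20260421_054042", "time": datetime(2026, 4, 21, 5, 40, 42), "tags": []},
]
```

With this history, resolve_ref("last_week", snapshots, now) picks the snapshot recorded on 14 April, because it is the closest one to seven days before now.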

Multi-baseline compare

perf-guard compare --base baseline --base last_week --current latest

Prints one table per baseline and exits with code 1 if any baseline shows a regression.
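
As a rough illustration of the "any baseline" rule, the sketch below checks each metric against each baseline using threshold_pct and direction from the config. It assumes threshold_pct is a relative percent change against the baseline value; the exact formula perf-guard uses is not stated here, and the names is_regression and compare_exit_code are hypothetical.

```python
def is_regression(base, current, threshold_pct, direction):
    """True if current is worse than base by more than threshold_pct percent."""
    if base == 0:
        raise ValueError("cannot compute a percent change against a zero baseline")
    change_pct = (current - base) / abs(base) * 100.0
    if direction == "lower_is_better":
        return change_pct > threshold_pct
    if direction == "higher_is_better":
        return change_pct < -threshold_pct
    raise ValueError(f"unknown direction: {direction}")


def compare_exit_code(current, baselines, metrics):
    """Return 1 if any metric regresses against any baseline, else 0."""
    for base_values in baselines.values():
        for spec in metrics:
            name = spec["name"]
            if is_regression(base_values[name], current[name],
                             spec["threshold_pct"], spec["direction"]):
                return 1
    return 0


# Made-up numbers for illustration.
metrics = [{"name": "total_eval_time", "threshold_pct": 5,
            "direction": "lower_is_better"}]
baselines = {
    "baseline": {"total_eval_time": 100.0},
    "last_week": {"total_eval_time": 110.0},
}
```

Here a current total_eval_time of 106.0 is 6% slower than baseline, so the sketch returns 1 even though the same run is faster than last_week.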

Trend chart

# HTML chart (Chart.js via CDN) — default
perf-guard trend -o trend.html

# Restrict to a few snapshots / metrics
perf-guard trend --ref baseline --ref dtype-fix --ref compile-opt \
                 --metric total_eval_time --metric pusht_success_rate

# ASCII sparklines in the terminal
perf-guard trend --ascii
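
For readers unfamiliar with sparklines, the following sketch shows one common way to render a metric series as a single line of block characters. The character set and the min-max scaling are assumptions for illustration; the actual output of perf-guard trend --ascii may look different.

```python
# Eight block characters from lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to a block character, scaled between min and max."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no range to scale; use the lowest bar throughout.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / (hi - lo) * top)] for v in values)


# Made-up eval times across eight snapshots.
line = sparkline([1, 2, 3, 4, 5, 6, 7, 8])
```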

Exit codes

Code  Meaning
0     Success, no regressions
1     Config error, ref not found, or regression detected
2     Result directory does not exist
Development

git clone https://github.com/huxie/perf-guard
cd perf-guard
pip install -e .

Build and publish to PyPI:

pip install build twine
python -m build
python -m twine upload dist/*

License

MIT


Source Distribution

perf_guard-0.2.0.tar.gz (16.3 kB)

Uploaded Source

Built Distribution

perf_guard-0.2.0-py3-none-any.whl (17.6 kB)

Uploaded Python 3

File details

Details for the file perf_guard-0.2.0.tar.gz.

File metadata

  • Download URL: perf_guard-0.2.0.tar.gz
  • Upload date:
  • Size: 16.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for perf_guard-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       0cdd7125da2de124ac4742c4b795b16bff7442038dcd3dfbf8778011788293fa
MD5          05db3124fb7b0ed372369b4118fc17a0
BLAKE2b-256  8ac5ccd02b45324f36ddf1fdcc68aee9bb5455269cf24f17f4b969b5fdf92c68

File details

Details for the file perf_guard-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: perf_guard-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 17.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for perf_guard-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       917df1d86a9ada27f147266ec782f3b077e6b07c71ff4f73f128ee50e3448fd3
MD5          4e651ba18f90a3078a8f8709ef3bb6b0
BLAKE2b-256  bee252a8162de208789751a8be76b1ec1aca76db40183e24e31d9a98e2568bef
