NVIDIA GPU inspection and synthetic benchmark CLI for GPUBoost.

GPUBoost

GPUBoost is a local-first NVIDIA GPU performance assistant for CUDA and PyTorch projects. It inspects hardware, runs lightweight benchmarks, performs static analysis, suggests reviewable patches, tests patches in temporary trial workspaces, prepares dataset/model artifacts, and demonstrates the workflow on small real-world-style examples.

GPUBoost is not affiliated with NVIDIA. It is designed for evidence-based local optimization work, not autonomous production patching.

Current checkpoint version: 0.1.2. The package version source is gpuboost/__init__.py; pyproject.toml reads it through Hatch dynamic versioning.

What GPUBoost Is

  • A Python CLI for GPU/system inspection, synthetic benchmarks, static PyTorch analysis, optimization advice, and review-only patch suggestions.
  • A deterministic agent workflow that can inspect, benchmark, analyze code, plan changes, generate diffs, and optionally test those diffs in a copied trial workspace.
  • A local dataset and model workflow for safe structured features, readiness checks, local training experiments, and advisory-only model artifacts.
  • A demo suite for validating the workflow against lightweight CNN, transformer, and DataLoader examples using synthetic data.

What It Does

  • Detects NVIDIA GPU, CUDA, PyTorch, cuDNN, Tensor Core, CPU, RAM, OS, and Python environment details.
  • Runs CPU-safe benchmark commands, including matrix multiplication, AMP, batch-size sweeps, and DataLoader behavior.
  • Produces rule-based recommendations from benchmark output.
  • Statically analyzes PyTorch scripts without importing or executing user code.
  • Generates unified diffs for selected low-risk changes, such as AMP, DataLoader, inference-mode, and cuDNN benchmark suggestions.
  • Creates trial workspaces where generated diffs can be applied to a temporary copy of a script and syntax-checked.
  • Stores local run history when explicitly requested.
  • Provides local model artifact commands for evaluation, packaging, checking, and advisory prediction.
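
As a rough illustration of static analysis that never imports or executes the target script, the sketch below uses Python's standard ast module to flag DataLoader(...) calls that omit num_workers. The function name find_dataloader_findings, the rule name, and the finding shape are assumptions made for this example; they are not GPUBoost's actual analyzer or output format.

```python
import ast


def find_dataloader_findings(source: str) -> list[dict]:
    """Flag DataLoader(...) calls without num_workers (hypothetical rule)."""
    findings = []
    tree = ast.parse(source)  # parses text only; nothing is imported or run
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `DataLoader(...)` and `torch.utils.data.DataLoader(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "DataLoader":
            continue
        keywords = {kw.arg for kw in node.keywords}
        if "num_workers" not in keywords:
            findings.append({"line": node.lineno, "rule": "dataloader-num-workers"})
    return findings


SAMPLE = """\
from torch.utils.data import DataLoader
loader = DataLoader(dataset, batch_size=32)
tuned = DataLoader(dataset, batch_size=32, num_workers=4)
"""

print(find_dataloader_findings(SAMPLE))
```

Because the source is only parsed, the sample does not need PyTorch installed, and undefined names such as dataset are harmless.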

What It Does Not Do

  • No guaranteed speedup. Results depend on hardware, drivers, thermals, power mode, workload shape, PyTorch/CUDA versions, and background load.
  • No production autonomous patching and no --apply command for original source files.
  • No automatic patch application. Patch suggestions are review-only unless trial mode applies them to a temporary copy.
  • No LLM fine-tuning and no external LLM API calls.
  • No external API dependency for normal local use, tests, or release checks.
  • No model authority over deterministic checks. Model predictions are advisory-only signals.
  • No automatic before/after benchmark execution in the agent.
  • No commitment of raw/generated data or model weights.
  • No CUDA requirement for normal tests.

Quickstart

See Setup for full install notes and Quickstart for the shortest path.

python -m venv .venv
python -m pip install -e ".[dev]"
python -m pip install -e ".[dev,benchmark]"  # optional: benchmark commands
python -m pip install -e ".[dev,model]"      # optional: local model commands
python -m pip install -e ".[dev,all]"        # optional: full Torch-backed set
python -m gpuboost doctor
python -m gpuboost --version
python -m gpuboost --help
python -m gpuboost agent optimize examples/bad_train_sample.txt --json
python -m gpuboost agent optimize examples/bad_train_sample.txt --trial --json
python -m gpuboost model safety-check --json

On Windows PowerShell, activate the environment first if desired:

.\.venv\Scripts\activate

The sample file is intentionally named examples/bad_train_sample.txt so formatters and linters do not treat it as project Python code.

The default install covers inspection, setup checks, static analysis, compare, history, demo discovery, and advisory safety checks. Install the optional benchmark, model, or all extra when you want Torch-backed workflows; those extras include both PyTorch and NumPy while the base install stays lightweight.
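
One way a script could check whether the Torch-backed extras are present before calling benchmark or model commands is sketched below. It relies only on the standard library; the helper name optional_feature_status is hypothetical, and the module list (torch, numpy) is taken from the paragraph above rather than from GPUBoost's packaging metadata.

```python
import importlib.util


def optional_feature_status(modules=("torch", "numpy")) -> dict[str, bool]:
    """Report which optional modules are installed, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


for name, available in optional_feature_status().items():
    print(f"{name}: {'available' if available else 'missing'}")
```

importlib.util.find_spec only locates a top-level module, so the check stays cheap even when PyTorch is installed.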

Core Workflows

Inspect the local system:

python -m gpuboost doctor
python -m gpuboost doctor --json
python -m gpuboost info
python -m gpuboost info --json

Run benchmarks and recommendations:

python -m gpuboost benchmark --quick
python -m gpuboost benchmark --quick --recommend
python -m gpuboost benchmark --quick --json --recommend

Analyze a PyTorch script:

python -m gpuboost analyze train.py
python -m gpuboost analyze train.py --json
python -m gpuboost analyze train.py --patch

Compare saved benchmark JSON files:

python -m gpuboost compare baseline.json optimized.json
python -m gpuboost compare baseline.json optimized.json --json

Comparison reads existing files only. It does not run workloads, apply patches, or decide that a change is production-ready.
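
The read-only nature of a comparison can be sketched as follows. The benchmark JSON layout is not documented in this README, so the key matmul_ms, the numbers, and the function compare_metric are all invented for illustration; the real compare command may report different fields.

```python
import json


def compare_metric(baseline: dict, optimized: dict, key: str) -> dict:
    """Compare one lower-is-better timing metric from two benchmark payloads."""
    before = float(baseline[key])
    after = float(optimized[key])
    return {
        "metric": key,
        "baseline": before,
        "optimized": after,
        # Speedup > 1.0 means the optimized run was faster.
        "speedup": before / after if after > 0 else None,
    }


# Inline stand-ins for baseline.json and optimized.json (made-up numbers).
baseline = json.loads('{"matmul_ms": 12.0}')
optimized = json.loads('{"matmul_ms": 8.0}')
print(compare_metric(baseline, optimized, "matmul_ms"))
```

The function only reads two already-parsed payloads and returns a summary; it runs no workload and changes no files.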

Agent Optimize

agent optimize is the one-shot deterministic workflow:

python -m gpuboost agent optimize
python -m gpuboost agent optimize --json
python -m gpuboost agent optimize train.py --json
python -m gpuboost agent optimize train.py --save-history

Without a script path, the agent inspects the system, runs the quick benchmark, builds advisor recommendations, and reports a summary. With a script path, it runs the lightweight static analysis and patch-planning path, then includes a reviewable diff when safe patch suggestions exist.
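
A reviewable diff of the kind described above can be produced with the standard difflib module. This is a minimal sketch, not GPUBoost's patch generator: the function make_review_diff and the pin_memory change are assumptions chosen only to show the shape of a unified diff.

```python
import difflib


def make_review_diff(original: str, proposed: str, path: str) -> str:
    """Return a unified diff for human review; nothing is written to disk."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


original = "loader = DataLoader(dataset, batch_size=32)\n"
proposed = "loader = DataLoader(dataset, batch_size=32, pin_memory=True)\n"
print(make_review_diff(original, proposed, "train.py"))
```

Returning the diff as text keeps the suggestion review-only: applying it is a separate, explicit step.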

JSON output uses schema version agent.optimize.v1. A partial status is non-fatal and exits with code 0; an error status exits with code 1.
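
A caller that wraps the agent in automation might combine the exit code with the JSON payload roughly as below. The field names schema_version and status are assumptions, since this README does not show the JSON shape; consult the Agent CLI docs for the real keys.

```python
import json

EXPECTED_SCHEMA = "agent.optimize.v1"


def summarize_agent_run(stdout: str, returncode: int) -> str:
    """Classify one agent run from its JSON output and exit code."""
    payload = json.loads(stdout)
    # "schema_version" and "status" are assumed field names.
    if payload.get("schema_version") != EXPECTED_SCHEMA:
        raise ValueError("unexpected schema version")
    status = payload.get("status", "unknown")
    if returncode != 0:
        return f"failed ({status})"
    if status == "partial":
        return "completed with partial results"
    return f"completed ({status})"


sample = '{"schema_version": "agent.optimize.v1", "status": "partial"}'
print(summarize_agent_run(sample, 0))
```

In practice stdout and returncode would come from running python -m gpuboost agent optimize --json with subprocess.run(..., capture_output=True, text=True).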

See Agent CLI for JSON shape, model-artifact behavior, history fields, and CLI examples.

Trial Workspace

Trial mode applies generated patch suggestions only to a copied file in a temporary workspace:

python -m gpuboost agent optimize train.py --trial --json
python -m gpuboost agent optimize train.py --trial --test "pytest"

The original source file is never modified. Syntax checks validate the copied file without importing or running the script. A user-provided test command runs only when --trial --test "<command>" is explicitly passed.
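
The copy-then-check idea can be sketched with the standard library alone. The function trial_syntax_check is a hypothetical stand-in for trial mode, and it takes the already-patched source as a string instead of applying a diff, which keeps the example short.

```python
import ast
import shutil
import tempfile
from pathlib import Path


def trial_syntax_check(script: Path, patched_source: str) -> bool:
    """Write patched source to a temporary copy and syntax-check it."""
    with tempfile.TemporaryDirectory(prefix="trial_") as workspace:
        trial_copy = Path(workspace) / script.name
        shutil.copy2(script, trial_copy)  # the original is only read
        trial_copy.write_text(patched_source, encoding="utf-8")
        try:
            # Parse only: the copied script is never imported or executed.
            ast.parse(trial_copy.read_text(encoding="utf-8"))
        except SyntaxError:
            return False
    return True
```

The temporary directory is removed when the with block exits, so a failed trial leaves nothing behind and the original file is untouched either way.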

See Trial Workspace.

Dataset And Model Workflow

GPUBoost includes a local structured workflow for dataset readiness, baseline evaluation, small PyTorch MLP training, explicit artifact packaging, and advisory agent integration:

python -m gpuboost model evaluate-baselines --json
python -m gpuboost model train-neural --json
python -m gpuboost model train-neural --save-artifact --json
python -m gpuboost model list-artifacts
python -m gpuboost model check-artifact <manifest_path> --min-test-macro-f1 0.75 --require-beats-baseline
python -m gpuboost agent optimize train.py --model-artifact <manifest_path> --json

The trained model remains advisory only. It cannot apply patches, edit files, override deterministic checks, override trials, replace tests, or replace benchmark evidence. Deterministic GPUBoost checks remain authoritative.
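
For readers unfamiliar with the thresholds in the check-artifact example, the sketch below shows how a macro-F1 score and a pass/fail gate could be computed. It is an illustration of the metric, not GPUBoost's implementation; in particular, treating "beats baseline" as a strict inequality is an assumption, and the function names are hypothetical.

```python
def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def passes_gate(model_f1: float, baseline_f1: float, min_f1: float = 0.75) -> bool:
    """Require the minimum score and (assumed) a strict win over the baseline."""
    return model_f1 >= min_f1 and model_f1 > baseline_f1


print(passes_gate(model_f1=0.80, baseline_f1=0.70))
```

Macro averaging weights every class equally, so a model that ignores a rare class is penalized more than plain accuracy would suggest.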

What the model actually predicts. The local model is trained on structured metadata about GPUBoost's own analysis (such as finding counts, recommendation counts, and patch-suggestion counts) together with labels derived from comparing two benchmark JSON files. It does not read your code's runtime behavior and it cannot causally predict whether applying a given patch will speed up your real workload. In practice its signal reflects whether GPUBoost found optimization opportunities, not the measured GPU impact of acting on them. Treat the prediction as a coarse triage hint only, and rely on the trial workspace and before/after benchmarks for any real performance claim. The current model is also trained on a small, partly synthetic dataset, so it should not be expected to generalize to arbitrary user scripts.

Model training uses safe feature extraction and must not train on raw source, raw diffs, stdout, stderr, target-derived verdicts, or comparison labels. Artifacts are saved only when explicitly requested.
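
Safe feature extraction can be pictured as an allowlist of numeric fields. In the sketch below the three feature names mirror the counts mentioned above, but the exact keys, the report dictionary, and the function extract_safe_features are assumptions for illustration.

```python
ALLOWED_FEATURES = ("finding_count", "recommendation_count", "patch_suggestion_count")


def extract_safe_features(report: dict) -> dict[str, int]:
    """Keep only allowlisted numeric counts; drop everything else."""
    return {name: int(report.get(name, 0)) for name in ALLOWED_FEATURES}


report = {
    "finding_count": 3,
    "recommendation_count": 2,
    "patch_suggestion_count": 1,
    "raw_diff": "--- a/train.py ...",  # never copied into the features
}
print(extract_safe_features(report))
```

An allowlist fails closed: a new raw-text field added to the report later is excluded unless someone deliberately lists it.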

See Model Training, Model Interface, and Phase 12 Release Readiness.

Real-World Demo Workflow

GPUBoost includes lightweight realistic demos for validating the surrounding workflow:

python -m gpuboost demo real-world-info --json
python -m gpuboost demo real-world-pairs --json

The demos use synthetic data and provide validation and demo coverage, not universal proof of optimization impact. Hardware variability can change results, so demo outcomes should not be generalized to every model or GPU.

Generated demo output is written under ignored generated paths such as data/gpuboost/generated/demo_real_world/.

See Demo Workflow, Real-World Validation, and Phase 14 Validation Summary.

Safety Guarantees

  • GPUBoost does not apply patches automatically.
  • Reviewable diffs are generated before any trial modification.
  • Trial mode modifies only temporary copies, never the original source file.
  • Static analysis parses user code without importing or executing it.
  • Test commands are opt-in and run the provided program directly (no shell), so shell metacharacters cannot chain or inject additional commands; the named program itself still runs and can execute code.
  • Deterministic GPUBoost checks remain authoritative, alongside measured benchmark evidence.
  • Model predictions are advisory only: they cannot apply patches, edit files, or approve changes.
  • No external LLM APIs are used.
  • User code is not uploaded anywhere.
  • Local run history stays local unless explicitly exported or contributed.
  • Local run history does not store raw source, raw diffs, trial stdout, or trial stderr by default.
  • Model features do not store raw source, raw diffs, stdout, or stderr.
  • Generated artifacts are ignored by default: .gitignore covers data/gpuboost/generated/, raw intake data, model weights, databases, caches, and local reports.
  • Raw/generated data is not committed.
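
The no-shell guarantee for test commands can be illustrated with shlex and subprocess. This is a sketch under stated assumptions, not GPUBoost's runner: the function run_test_command and its timeout default are hypothetical.

```python
import shlex
import subprocess
import sys
from typing import Optional


def run_test_command(command: str, cwd: Optional[str] = None, timeout: float = 300.0) -> int:
    """Run a user-supplied test command as an argument list, never via a shell."""
    argv = shlex.split(command)  # ";" and "&&" stay ordinary argument text
    if not argv:
        raise ValueError("empty test command")
    completed = subprocess.run(argv, cwd=cwd, shell=False, timeout=timeout, check=False)
    return completed.returncode


# Metacharacters are not interpreted; they end up inside plain arguments.
print(shlex.split("pytest -q; echo injected"))
```

With shell=False the first token is executed directly as a program, so chaining operators have no effect, but that program can still run arbitrary code. Note that shlex.split follows POSIX quoting rules, so Windows paths containing backslashes would need different handling.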

Limitations

  • Benchmarks are small local signals, not production performance proof.
  • Synthetic demos provide validation and demo coverage, not universal proof of optimization impact.
  • CPU-only machines skip CUDA benchmark work instead of failing hard.
  • Local model artifacts are experimental advisory aids, not a production optimizer.
  • The model predicts from GPUBoost's own analysis metadata (finding/recommendation counts), not from your code's runtime behavior; it does not causally predict the real GPU speedup of a patch and reflects whether opportunities were found, not their measured impact.
  • The model is trained on a small, partly synthetic dataset and is not expected to generalize to arbitrary user scripts.
  • GPUBoost does not include a bundled/default trained production model.
  • Before/after validation currently compares saved benchmark JSON files; the agent does not run arbitrary before/after benchmark commands automatically.
  • Power mode, thermal throttling, driver versions, CUDA versions, and workload noise can dominate results.
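
Because noise can dominate small benchmarks, it helps to repeat a measurement and report its spread instead of a single number. The sketch below shows one generic way to do that with the standard library; time_callable is a hypothetical helper, not a GPUBoost API.

```python
import statistics
import time


def time_callable(fn, warmup: int = 3, repeats: int = 10) -> dict:
    """Time fn() several times and report median and spread in milliseconds."""
    for _ in range(warmup):
        fn()  # warm caches and lazy initialization before measuring
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


print(time_callable(lambda: sum(i * i for i in range(100_000))))
```

For CUDA workloads, kernel launches are asynchronous, so a wall-clock measurement like this is only meaningful if the device is synchronized (for example with torch.cuda.synchronize()) before each clock read. A wide gap between min_ms and max_ms is a hint that thermal or background effects are influencing the result.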

License And Release Audit

GPUBoost is released under the MIT License; see LICENSE. The release audit docs cover Dependency Review, Security Review, and the Final Project Summary, including generated artifacts, external API requirements, advisory-only model behavior, and ignore rules.

Test And Validation Status

The intended validation commands for this phase are:

python -m ruff check .
python -m pytest

The test suite is designed to run without an NVIDIA GPU. On systems without CUDA, some runtime benchmarks are skipped or fall back to CPU-safe paths.
