NVIDIA GPU inspection and synthetic benchmark CLI for GPUBoost.
GPUBoost
GPUBoost is a local-first NVIDIA GPU performance assistant for CUDA and PyTorch projects. It inspects hardware, runs lightweight benchmarks, performs static analysis, suggests reviewable patches, tests patches in temporary trial workspaces, prepares dataset/model artifacts, and demonstrates the workflow on small real-world-style examples.
GPUBoost is not affiliated with NVIDIA. It is designed for evidence-based local optimization work, not autonomous production patching.
Current checkpoint version: 0.1.2. The package version source is
gpuboost/__init__.py; pyproject.toml reads it through Hatch dynamic
versioning.
What GPUBoost Is
- A Python CLI for GPU/system inspection, synthetic benchmarks, static PyTorch analysis, optimization advice, and review-only patch suggestions.
- A deterministic agent workflow that can inspect, benchmark, analyze code, plan changes, generate diffs, and optionally test those diffs in a copied trial workspace.
- A local dataset and model workflow for safe structured features, readiness checks, local training experiments, and advisory-only model artifacts.
- A demo suite for validating the workflow against lightweight CNN, transformer, and DataLoader examples using synthetic data.
What It Does
- Detects NVIDIA GPU, CUDA, PyTorch, cuDNN, Tensor Core, CPU, RAM, OS, and Python environment details.
- Runs CPU-safe benchmark commands, including matrix multiplication, AMP, batch-size sweeps, and DataLoader behavior.
- Produces rule-based recommendations from benchmark output.
- Statically analyzes PyTorch scripts without importing or executing user code.
- Generates unified diffs for selected low-risk changes, such as AMP, DataLoader, inference-mode, and cuDNN benchmark suggestions.
- Creates trial workspaces where generated diffs can be applied to a temporary copy of a script and syntax-checked.
- Stores local run history when explicitly requested.
- Provides local model artifact commands for evaluation, packaging, checking, and advisory prediction.
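The static-analysis bullet above (parsing a script without importing or executing it) can be illustrated with Python's standard ast module. This is a minimal sketch, not GPUBoost's actual analyzer: the function name find_dataloaders_without_workers and the single rule it checks are assumptions chosen for illustration.

```python
import ast


def find_dataloaders_without_workers(source: str) -> list[int]:
    """Return line numbers of DataLoader(...) calls that omit num_workers.

    Hypothetical rule for illustration only. The source text is parsed
    into a syntax tree; it is never imported or executed.
    """
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `DataLoader(...)` and `torch.utils.data.DataLoader(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "DataLoader":
            continue
        if not any(kw.arg == "num_workers" for kw in node.keywords):
            hits.append(node.lineno)
    return sorted(hits)


SAMPLE = '''
import os
os.system("echo this would run if the file were executed")
loader = DataLoader(dataset, batch_size=32)
tuned = DataLoader(dataset, batch_size=32, num_workers=4)
'''

# Only the call on line 4 of SAMPLE lacks num_workers.
print(find_dataloaders_without_workers(SAMPLE))
```

Because the sample is only parsed, the os.system call inside it never runs, which is the property the bullet describes.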
What It Does Not Do
- No guaranteed speedup. Results depend on hardware, drivers, thermals, power mode, workload shape, PyTorch/CUDA versions, and background load.
- No production autonomous patching and no --apply command for original source files.
- No automatic patch application. Patch suggestions are review-only unless trial mode applies them to a temporary copy.
- No LLM fine-tuning and no external LLM API calls.
- No external API dependency for normal local use, tests, or release checks.
- No model authority over deterministic checks. Model predictions are advisory-only signals.
- No automatic benchmark-command before/after execution in the agent.
- No commitment of raw/generated data or model weights.
- No CUDA requirement for normal tests.
Quickstart
See Setup for full install notes and Quickstart for the shortest path.
python -m venv .venv
python -m pip install -e ".[dev]"
python -m pip install -e ".[dev,benchmark]" # optional: benchmark commands
python -m pip install -e ".[dev,model]" # optional: local model commands
python -m pip install -e ".[dev,all]" # optional: full Torch-backed set
python -m gpuboost doctor
python -m gpuboost --version
python -m gpuboost --help
python -m gpuboost agent optimize examples/bad_train_sample.txt --json
python -m gpuboost agent optimize examples/bad_train_sample.txt --trial --json
python -m gpuboost model safety-check --json
On Windows PowerShell, activate the environment first if desired:
.\.venv\Scripts\activate
The sample file is intentionally named examples/bad_train_sample.txt so
formatters and linters do not treat it as project Python code.
The default install covers inspection, setup checks, static analysis, compare,
history, demo discovery, and advisory safety checks. Install the optional
benchmark, model, or all extra when you want Torch-backed workflows;
those extras include both PyTorch and NumPy while the base install stays
lightweight.
Core Workflows
Inspect the local system:
python -m gpuboost doctor
python -m gpuboost doctor --json
python -m gpuboost info
python -m gpuboost info --json
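For readers curious what this kind of inspection involves, here is a rough sketch of collecting a few of the details listed earlier (OS, Python, PyTorch, CUDA, cuDNN). It is not how doctor or info are implemented; the function name collect_environment and the dictionary keys are assumptions. Torch is treated as optional, in line with the lightweight base install.

```python
import json
import platform
import sys


def collect_environment() -> dict:
    """Collect basic system details; Torch fields are filled only if Torch is installed."""
    info = {
        "os": platform.system(),
        "python": platform.python_version(),
        "executable": sys.executable,
        "torch": None,
        "cuda_available": False,
        "cuda_version": None,
        "cudnn_version": None,
    }
    try:
        import torch
    except ImportError:
        return info  # no Torch installed: report what is known without it
    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    info["cuda_version"] = torch.version.cuda  # None on CPU-only builds
    info["cudnn_version"] = torch.backends.cudnn.version()  # None if unavailable
    return info


print(json.dumps(collect_environment(), indent=2))
```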
Run benchmarks and recommendations:
python -m gpuboost benchmark --quick
python -m gpuboost benchmark --quick --recommend
python -m gpuboost benchmark --quick --json --recommend
Analyze a PyTorch script:
python -m gpuboost analyze train.py
python -m gpuboost analyze train.py --json
python -m gpuboost analyze train.py --patch
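For readers unfamiliar with the unified diff format used for review-only suggestions, the standard difflib module can produce one from two versions of a file. This is a sketch under assumptions: make_review_diff is a hypothetical helper, and the example change is illustrative rather than a suggestion GPUBoost is guaranteed to make.

```python
import difflib


def make_review_diff(original: str, proposed: str, path: str = "train.py") -> str:
    """Build a unified diff for human review; nothing is written to disk."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


original = "loader = DataLoader(dataset, batch_size=32)\n"
proposed = "loader = DataLoader(dataset, batch_size=32, num_workers=4)\n"
print(make_review_diff(original, proposed))
```

Identical inputs yield an empty string, so an empty diff can be read as "no suggestion".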
Compare saved benchmark JSON files:
python -m gpuboost compare baseline.json optimized.json
python -m gpuboost compare baseline.json optimized.json --json
Comparison reads existing files only. It does not run workloads, apply patches, or decide that a change is production-ready.
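The idea of a read-only comparison can be sketched as follows. The flat name-to-seconds layout used here is an assumption made for illustration; GPUBoost's real benchmark JSON schema is not described in this README, and compare_results is a hypothetical helper.

```python
import json


def compare_results(baseline: dict, optimized: dict) -> dict:
    """Return per-benchmark ratios of baseline time to optimized time.

    A ratio above 1.0 means the optimized run was faster. Benchmarks that
    are missing or non-positive in the optimized run are skipped.
    """
    ratios = {}
    for name, base_seconds in baseline.items():
        new_seconds = optimized.get(name)
        if new_seconds is None or new_seconds <= 0:
            continue
        ratios[name] = base_seconds / new_seconds
    return ratios


# Hypothetical file contents, inlined so the sketch is self-contained.
baseline = json.loads('{"matmul": 0.5, "dataloader": 2.0}')
optimized = json.loads('{"matmul": 0.25, "dataloader": 2.0}')
print(compare_results(baseline, optimized))
```

The function only reads two already-parsed results; it runs no workload and makes no decision about readiness.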
Agent Optimize
agent optimize is the one-shot deterministic workflow:
python -m gpuboost agent optimize
python -m gpuboost agent optimize --json
python -m gpuboost agent optimize train.py --json
python -m gpuboost agent optimize train.py --save-history
Without a script path, the agent inspects the system, runs the quick benchmark, builds advisor recommendations, and reports a summary. With a script path, it runs the lightweight static analysis and patch-planning path, then includes a reviewable diff when safe patch suggestions exist.
JSON output uses schema version agent.optimize.v1. partial statuses are
non-fatal and exit with code 0; error exits with code 1.
See Agent CLI for JSON shape, model-artifact behavior, history fields, and CLI examples.
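A script that wraps the agent might combine the exit code and the schema version like this. It is a sketch only: the JSON field names schema_version and status are assumptions (the documented shape is in Agent CLI), and interpret_agent_result is a hypothetical helper.

```python
import json

EXPECTED_SCHEMA = "agent.optimize.v1"


def interpret_agent_result(returncode: int, stdout: str) -> dict:
    """Summarize one agent run from its exit code and JSON text.

    Exit code 0 covers partial statuses as well; a non-zero exit code is
    treated as an error. Field names are assumed.
    """
    if returncode != 0:
        return {"usable": False, "status": "error", "schema_ok": False}
    payload = json.loads(stdout)
    return {
        "usable": True,
        "status": payload.get("status"),
        "schema_ok": payload.get("schema_version") == EXPECTED_SCHEMA,
    }


# Hypothetical output, inlined so the sketch runs without GPUBoost installed.
fake_stdout = '{"schema_version": "agent.optimize.v1", "status": "partial"}'
print(interpret_agent_result(0, fake_stdout))
print(interpret_agent_result(1, ""))
```

In real use, returncode and stdout would come from running python -m gpuboost agent optimize train.py --json through subprocess.run with capture_output=True and text=True.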
Trial Workspace
Trial mode applies generated patch suggestions only to a copied file in a temporary workspace:
python -m gpuboost agent optimize train.py --trial --json
python -m gpuboost agent optimize train.py --trial --test "pytest"
The original source file is never modified. Syntax checks validate the copied
file without importing or running the script. A user-provided test command runs
only when --trial --test "<command>" is explicitly passed.
See Trial Workspace.
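The copy-then-check idea behind trial mode can be sketched with the standard library. This is not GPUBoost's implementation: syntax_check_in_trial_copy is a hypothetical helper, and it takes the already-patched text as input instead of applying a diff.

```python
import ast
import shutil
import tempfile
from pathlib import Path


def syntax_check_in_trial_copy(script: Path, patched_source: str) -> bool:
    """Write patched_source over a temporary copy of script and syntax-check it.

    The original file is only read. The copy is parsed, never imported or run.
    """
    with tempfile.TemporaryDirectory() as workspace:
        trial_file = Path(workspace) / script.name
        shutil.copy2(script, trial_file)  # work on a copy, not the original
        trial_file.write_text(patched_source, encoding="utf-8")
        try:
            ast.parse(trial_file.read_text(encoding="utf-8"), filename=str(trial_file))
        except SyntaxError:
            return False
        return True


with tempfile.TemporaryDirectory() as project:
    script = Path(project) / "train.py"
    script.write_text("x = 1\n", encoding="utf-8")
    print(syntax_check_in_trial_copy(script, "x = 1\ny = 2\n"))  # valid patch
    print(syntax_check_in_trial_copy(script, "def broken(:\n"))  # invalid patch
    print(script.read_text(encoding="utf-8") == "x = 1\n")  # original untouched
```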
Dataset And Model Workflow
GPUBoost includes a local structured workflow for dataset readiness, baseline evaluation, small PyTorch MLP training, explicit artifact packaging, and advisory agent integration:
python -m gpuboost model evaluate-baselines --json
python -m gpuboost model train-neural --json
python -m gpuboost model train-neural --save-artifact --json
python -m gpuboost model list-artifacts
python -m gpuboost model check-artifact <manifest_path> --min-test-macro-f1 0.75 --require-beats-baseline
python -m gpuboost agent optimize train.py --model-artifact <manifest_path> --json
The trained model remains advisory only. It cannot apply patches, edit files, override deterministic checks, override trials, replace tests, or replace benchmark evidence. Deterministic GPUBoost checks remain authoritative.
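The two check-artifact options shown above suggest a simple quality gate. The sketch below is one plausible reading, not the documented behavior: passes_artifact_gate is a hypothetical function, and treating "beats baseline" as strictly greater than the baseline score is an assumption.

```python
def passes_artifact_gate(
    test_macro_f1: float,
    baseline_macro_f1: float,
    min_test_macro_f1: float = 0.75,
    require_beats_baseline: bool = True,
) -> bool:
    """Return True if a model's test macro-F1 clears both gate conditions."""
    if test_macro_f1 < min_test_macro_f1:
        return False  # below the absolute threshold
    if require_beats_baseline and test_macro_f1 <= baseline_macro_f1:
        return False  # does not improve on the baseline (strict, by assumption)
    return True


print(passes_artifact_gate(0.80, 0.70))  # clears both conditions
print(passes_artifact_gate(0.80, 0.85))  # above threshold, but loses to baseline
print(passes_artifact_gate(0.60, 0.50))  # beats baseline, but below threshold
```

Passing such a gate would only qualify an artifact for advisory use; it grants no authority over deterministic checks.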
What the model actually predicts. The local model is trained on structured metadata about GPUBoost's own analysis (such as finding counts, recommendation counts, and patch-suggestion counts) together with labels derived from comparing two benchmark JSON files. It does not read your code's runtime behavior and it cannot causally predict whether applying a given patch will speed up your real workload. In practice its signal reflects whether GPUBoost found optimization opportunities, not the measured GPU impact of acting on them. Treat the prediction as a coarse triage hint only, and rely on the trial workspace and before/after benchmarks for any real performance claim. The current model is also trained on a small, partly synthetic dataset, so it should not be expected to generalize to arbitrary user scripts.
Model training uses safe feature extraction and must not train on raw source, raw diffs, stdout, stderr, target-derived verdicts, or comparison labels. Artifacts are saved only when explicitly requested.
See Model Training, Model Interface, and Phase 12 Release Readiness.
Real-World Demo Workflow
GPUBoost includes lightweight realistic demos for validating the surrounding workflow:
python -m gpuboost demo real-world-info --json
python -m gpuboost demo real-world-pairs --json
The demos use synthetic data and are validation/demo coverage, not universal proof of optimization impact. Hardware variability can change results, and demo outcomes should not be generalized to every model or GPU.
Generated demo output is written under ignored generated paths such as
data/gpuboost/generated/demo_real_world/.
See Demo Workflow, Real-World Validation, and Phase 14 Validation Summary.
Safety Guarantees
- GPUBoost does not apply patches automatically.
- Reviewable diffs are generated before any trial modification.
- Trial mode modifies only temporary copies, never the original source file.
- Static analysis parses user code without importing or executing it.
- Test commands are opt-in and run the provided program directly (no shell), so shell metacharacters cannot chain or inject additional commands; the named program itself still runs and can execute code.
- Deterministic GPUBoost checks remain authoritative, alongside measured benchmark evidence.
- Model predictions are advisory only: they cannot apply patches, edit files, or approve changes.
- No external LLM APIs are used.
- User code is not uploaded anywhere.
- Local run history stays local unless explicitly exported or contributed.
- Local run history does not store raw source, raw diffs, trial stdout, or trial stderr by default.
- Model features do not store raw source, raw diffs, stdout, or stderr.
- Generated artifacts are ignored by default; the paths ignored by .gitignore include data/gpuboost/generated/, raw intake data, model weights, databases, caches, and local reports.
- Raw/generated data is not committed.
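The guarantee about test commands running without a shell can be demonstrated with subprocess. This sketch assumes POSIX-style quoting via shlex.split, and run_test_command is a hypothetical helper, not GPUBoost's own code.

```python
import shlex
import subprocess
import sys


def run_test_command(command: str) -> subprocess.CompletedProcess:
    """Split a command string into argv and run it directly, without a shell."""
    argv = shlex.split(command)
    return subprocess.run(argv, shell=False, capture_output=True, text=True, check=False)


# The probe program prints the extra arguments it received.
probe = f"{shlex.quote(sys.executable)} -c 'import sys; print(sys.argv[1:])'"

# With a shell, "; echo injected" would run a second command. Without one,
# the tokens are passed to the probe as plain arguments.
result = run_test_command(probe + " ; echo injected")
print(result.stdout.strip())
```

The probe itself still executes, which matches the caveat in the list above: the named program runs and can execute code, so the test command should be one you trust.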
Limitations
- Benchmarks are small local signals, not production performance proof.
- Synthetic demos provide validation and demo coverage, not universal proof of optimization impact.
- CPU-only machines skip CUDA benchmark work instead of failing hard.
- Local model artifacts are experimental advisory aids, not a production optimizer.
- The model predicts from GPUBoost's own analysis metadata (finding/recommendation counts), not from your code's runtime behavior; it does not causally predict the real GPU speedup of a patch and reflects whether opportunities were found, not their measured impact.
- The model is trained on a small, partly synthetic dataset and is not expected to generalize to arbitrary user scripts.
- GPUBoost does not include a bundled/default trained production model.
- Before/after validation currently compares saved benchmark JSON files; the agent does not run arbitrary before/after benchmark commands automatically.
- Power mode, thermal throttling, driver versions, CUDA versions, and workload noise can dominate results.
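Because noise can dominate small benchmarks, one common mitigation is to repeat a measurement and report the median together with the spread. The sketch below is a generic, CPU-only illustration and not GPUBoost's benchmark code; time_workload and the toy workload are assumptions.

```python
import statistics
import time


def time_workload(fn, repeats: int = 7, warmup: int = 2) -> dict:
    """Time fn several times after warm-up runs and summarize the samples."""
    for _ in range(warmup):
        fn()  # warm-up runs are not recorded
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
        "samples": len(samples),
    }


def workload() -> int:
    # Toy CPU workload standing in for a real benchmark body.
    return sum(i * i for i in range(50_000))


stats = time_workload(workload)
print(f"median {stats['median_s']:.6f}s (min {stats['min_s']:.6f}s, max {stats['max_s']:.6f}s)")
```

A wide gap between the minimum and maximum is a hint that background load or throttling affected the run. When timing CUDA work with PyTorch, torch.cuda.synchronize() is needed before reading the clock, because kernel launches are asynchronous.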
Docs Index
- Setup
- Quickstart
- Agent CLI
- Agent Core
- Trial Workspace
- Comparison
- Local History
- Model Interface
- Model Training
- Dependency Review
- Security Review
- Final Project Summary
- Demo Workflow
- Real-World Validation
- Demo Report Template
- Phase 12 Release Readiness
- Phase 13 Testing
- Phase 13 Release Readiness
- Phase 14 Validation Summary
- Release Notes
- Release Checklist
- Contributing
- Security Policy
License And Release Audit
GPUBoost is released under the MIT License; see LICENSE. The release audit docs cover Dependency Review, Security Review, and the Final Project Summary, including generated artifacts, external API requirements, advisory-only model behavior, and ignore rules.
Test And Validation Status
The intended validation commands for this phase are:
python -m ruff check .
python -m pytest
The test suite is designed to run without an NVIDIA GPU. On systems without CUDA, some runtime benchmarks are skipped or run in CPU-safe mode.
File details
Details for the file gpuboost-0.1.2.tar.gz.
File metadata
- Download URL: gpuboost-0.1.2.tar.gz
- Upload date:
- Size: 338.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f655d27b83cb76292960c682a23d1d7224d4eba984409703a9002e61473d328 |
| MD5 | bad651dd8045709946a85298703bc8aa |
| BLAKE2b-256 | f38383e844dade0313e76a7beca3d40b27d8632e207385b9bd090a5826864e95 |
File details
Details for the file gpuboost-0.1.2-py3-none-any.whl.
File metadata
- Download URL: gpuboost-0.1.2-py3-none-any.whl
- Upload date:
- Size: 201.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3007162877f197522fbc68e4973d47537ba2c9b9e3ef7f34f208685efd1e56a2 |
| MD5 | 3fa78f069c0df21181d842d27ad04e74 |
| BLAKE2b-256 | a1e6fc513138b1c9d6f2ebe06da8ba7f6ec8ea31bdcf67ea6b41ac7648e0af2c |