Autonomous GPU kernel optimizer
Project description
gitm-labs
Behavioral compiler + intervention runtime for GPU-intensive workloads. Given a workload and a time budget, gitm-labs autonomously profiles the workload, attributes performance deviations to their causes, and applies kernel-level interventions to hit a target performance improvement. It then produces a provenance report showing exactly what it changed and why.
Install
pip install gitm-labs
NVIDIA GPU support:
pip install "gitm-labs[nvidia]"
Optional extras: bench (HFT/biotech/edge benchmark harness), prometheus, otlp, s3.
Requires: Python 3.10+ and either an NVIDIA GPU (NVML + CUPTI) or an AMD GPU (ROCm SMI + rocprof).
Quick start
export GITM_S3_ROOT="s3://your-bucket/gitm" # durable store for datasets + run outputs
export GITM_SCRATCH="/mnt/nvme/gitm" # local ephemeral run dir (defaults to ~/.cache/gitm)
gitm run --workload hft-lob --budget 24h --target 15%
Workloads: hft-lob (HFT order-book), af2 (AlphaFold2 protein inference), kitti (3D LiDAR detection).
--budget is the wall-clock time limit. --target is the fractional performance improvement gitm-labs commits to delivering; if that floor cannot be met, it issues a diagnostic explaining why.
Verify your environment first:
gitm doctor
Embedded API
from gitm import optimize
result = optimize(engine, budget="24h", target=0.15)
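The CLI takes `--budget 24h --target 15%` while the embedded API takes `budget="24h", target=0.15`. As a rough illustration of how the two forms relate, the sketch below converts a duration string to seconds and a percentage to a fraction. The helper names `parse_budget` and `parse_target`, and the accepted unit suffixes, are assumptions made for this example; they are not taken from gitm-labs.

```python
import re

# Hypothetical unit table; gitm-labs may accept a different set of suffixes.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_budget(text: str) -> int:
    """Convert a duration such as '24h' or '90m' to whole seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized budget: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def parse_target(text: str) -> float:
    """Convert a percentage such as '15%' to the fraction 0.15."""
    stripped = text.strip()
    if not stripped.endswith("%"):
        raise ValueError(f"expected a percentage, got {text!r}")
    return float(stripped[:-1]) / 100.0
```

Under these assumptions, `parse_budget("24h")` gives 86400 seconds and `parse_target("15%")` gives the same 0.15 passed to `optimize`.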
The 24-hour loop
gitm-labs runs a five-phase autonomous loop within the allotted budget:
| Phase | Hours | What happens |
|---|---|---|
| 1. Profile | 0–2 | Capture event + state telemetry; fingerprint workload; build predicted execution graph |
| 2. Attribute | 2–6 | Compute residuals against predicted graph; run causal attribution |
| 3. Rank | 6–12 | Query intervention library; rank candidates via counterfactual replay |
| 4. Apply | 12–20 | Apply top-N interventions with rollback gates |
| 5. Report | 20–24 | Stabilize; write provenance report (claim → evidence → intervention → delta) |
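The phase boundaries in the table can be read as a simple lookup from elapsed time to phase. The sketch below encodes them directly. The proportional scaling for budgets other than 24 hours is an assumption of this example; the table only specifies the 24-hour case, and `phase_at` is a hypothetical helper, not part of gitm.scheduler.

```python
# (name, start_hour, end_hour) taken from the phase table for a 24-hour budget.
PHASES = [
    ("Profile", 0, 2),
    ("Attribute", 2, 6),
    ("Rank", 6, 12),
    ("Apply", 12, 20),
    ("Report", 20, 24),
]


def phase_at(elapsed_hours: float, budget_hours: float = 24.0) -> str:
    """Return the phase active at elapsed_hours.

    Assumption: for other budgets, boundaries scale proportionally.
    """
    if not 0 <= elapsed_hours <= budget_hours:
        raise ValueError("elapsed time is outside the budget")
    scale = budget_hours / 24.0
    for name, _start, end in PHASES:
        if elapsed_hours < end * scale:
            return name
    return PHASES[-1][0]  # exactly at the end of the budget
```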
Architecture
gitm-labs separates the empirical half (what happened) from the predicted half (what should have happened). Everything downstream operates on residuals — the difference between the two.
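To make the notion of a residual concrete, here is a minimal sketch that subtracts predicted per-kernel durations from observed ones. The dictionary representation and the name `duration_residuals` are assumptions for illustration; the actual Residuals type in gitm.optimizer.monitor is not described here.

```python
def duration_residuals(
    observed: dict[str, float], predicted: dict[str, float]
) -> dict[str, float]:
    """Observed minus predicted duration for every kernel present in both."""
    shared = observed.keys() & predicted.keys()
    return {name: observed[name] - predicted[name] for name in shared}
```

A positive residual means the kernel ran slower than predicted; a residual near zero means the prediction held.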
Two telemetry planes
State telemetry (gitm.telemetry)
Point-in-time samples of GPU state at ~1 Hz: utilization, memory, power, clocks, temperature, throttle reasons, NVLink throughput, ECC counters.
Source: NVML (NVIDIA) / ROCm SMI (AMD). Cost: ~microseconds per sample.
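A fixed-rate sampling loop of this kind can be sketched as follows. This is not the gitm.telemetry implementation: `sample_state` and its parameters are hypothetical, and the `read` callable stands in for whatever wraps NVML or ROCm SMI. The clock and sleep functions are injectable so the loop can be exercised without a GPU or real waiting.

```python
import time
from typing import Callable

Sample = dict[str, float]


def sample_state(
    read: Callable[[], Sample],
    n_samples: int,
    period_s: float = 1.0,  # ~1 Hz, as described above
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> list[Sample]:
    """Collect n_samples point-in-time readings at a fixed period."""
    samples: list[Sample] = []
    next_tick = clock()
    for _ in range(n_samples):
        sample = dict(read())   # copy so the reader's dict is not mutated
        sample["t"] = clock()   # timestamp each sample
        samples.append(sample)
        next_tick += period_s   # schedule against the start time to avoid drift
        delay = next_tick - clock()
        if delay > 0:
            sleep(delay)
    return samples
```

Scheduling each tick relative to the start time, rather than sleeping a fixed amount after each read, keeps the sampling rate steady even when a read takes a variable amount of time.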
Event telemetry (gitm.tracer)
Per-kernel activity records with start/end timestamps, stream IDs, and memory transfer events.
Source: CUPTI (NVIDIA) / rocprof (AMD). Required for kernel-time invariant checks.
Deviation invariants
The monitor checks observed-vs-predicted against three invariants:
- Kernel-time — per-kernel duration must lie within roofline bounds.
- Memory-traffic — per-kernel bytes-moved must match predicted.
- Stream-concurrency — predicted-concurrent kernels must overlap.
See docs/invariants.md.
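The three invariants can be sketched as small predicates. Everything below is an illustrative assumption rather than the contents of docs/invariants.md: the record types, the tolerance values (`slack`, `rel_tol`), and the function names are hypothetical. The roofline lower bound used here is the standard one, the larger of compute time and memory time.

```python
from dataclasses import dataclass


@dataclass
class KernelRecord:      # hypothetical observed record
    name: str
    start_s: float
    end_s: float
    bytes_moved: float


@dataclass
class Prediction:        # hypothetical predicted cost of one kernel
    flops: float
    bytes_moved: float


def roofline_min_time(pred: Prediction, peak_flops: float, peak_bw: float) -> float:
    """Lower bound on duration: limited by compute or by memory bandwidth."""
    return max(pred.flops / peak_flops, pred.bytes_moved / peak_bw)


def kernel_time_ok(rec: KernelRecord, pred: Prediction, peak_flops: float,
                   peak_bw: float, slack: float = 1.5) -> bool:
    """Kernel-time: duration lies between the bound and slack times the bound."""
    lower = roofline_min_time(pred, peak_flops, peak_bw)
    return lower <= (rec.end_s - rec.start_s) <= slack * lower


def memory_traffic_ok(rec: KernelRecord, pred: Prediction, rel_tol: float = 0.05) -> bool:
    """Memory-traffic: observed bytes match predicted within a relative tolerance."""
    return abs(rec.bytes_moved - pred.bytes_moved) <= rel_tol * pred.bytes_moved


def overlaps(a: KernelRecord, b: KernelRecord) -> bool:
    """Stream-concurrency: the two kernels' time intervals intersect."""
    return a.start_s < b.end_s and b.start_s < a.end_s
```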
Module responsibilities
| Module | Responsibility |
|---|---|
| gitm.telemetry | Vendor-backend autodiscovery, NVML/ROCm SMI samples, pluggable sinks |
| gitm.tracer | Event-telemetry capture (CUPTI/rocprof), trace schema, context manager |
| gitm.planner | Behavioral Compiler: roofline-based predicted execution graph |
| gitm.optimizer.monitor | Deviation monitor: residuals against 3 invariants |
| gitm.optimizer.attribution | Granger + doubly-robust on residual subgraph |
| gitm.optimizer.replay | Counterfactual replay for predicted intervention delta |
| gitm.optimizer.qualification | Workload fingerprint gate (commit / diagnose) |
| gitm.optimizer.report | Provenance chain renderer (claim → evidence → intervention → delta) |
| gitm.kernels | Curated intervention library: 15–20 levers with applicability + safety |
| gitm.agents | Autonomous policy: selects interventions, drives rollback |
| gitm.scheduler | 24-hour loop phase orchestration |
See docs/ARCHITECTURE.md for the full design.
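The Apply phase applies interventions behind rollback gates, and gitm.agents is described as driving rollback. One common way to build such a gate is measure, apply, measure again, and revert if the gain falls short. The sketch below follows that pattern under stated assumptions: `apply_with_rollback` is a hypothetical helper, the metric is assumed to be a latency where lower is better, and the actual gating policy in gitm-labs may differ.

```python
from typing import Callable


def apply_with_rollback(
    apply: Callable[[], None],
    rollback: Callable[[], None],
    measure: Callable[[], float],   # assumed: latency, lower is better
    min_gain: float = 0.0,
) -> tuple[bool, float]:
    """Apply an intervention and keep it only if the relative gain is sufficient.

    Returns (kept, gain), where gain = (baseline - after) / baseline.
    """
    baseline = measure()
    apply()
    after = measure()
    gain = (baseline - after) / baseline
    if gain < min_gain:
        rollback()          # gate failed: restore the previous state
        return False, gain
    return True, gain
```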
Data layout
Two environment variables control where data lives:
export GITM_S3_ROOT="s3://your-bucket/gitm" # canonical store (datasets + run outputs)
export GITM_SCRATCH="/mnt/nvme/gitm" # local ephemeral dir (defaults to ~/.cache/gitm)
Layout under $GITM_S3_ROOT:
datasets/{hft,biotech,edge}/ # benchmark inputs (immutable, sha256-pinned)
runs/ # baseline + run outputs
traces/ # captured event-telemetry traces
telemetry/ # state-telemetry samples
Local scratch is ephemeral and synced to S3 after each run.
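For scripts that need to locate data the same way, the two variables can be resolved as in the sketch below. The helper names `scratch_dir` and `s3_uri` are hypothetical and not part of gitm-labs; only the variable names, the `~/.cache/gitm` default, and the directory names come from the layout above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def scratch_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Local run directory: $GITM_SCRATCH, or ~/.cache/gitm when unset."""
    env = os.environ if env is None else env
    return Path(env.get("GITM_SCRATCH") or Path.home() / ".cache" / "gitm")


def s3_uri(*parts: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Join path components under $GITM_S3_ROOT, e.g. datasets/hft."""
    env = os.environ if env is None else env
    root = env["GITM_S3_ROOT"].rstrip("/")  # raises KeyError if the root is unset
    return "/".join([root, *parts])
```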
Primary interfaces
# gitm.tracer
def capture(out_path: Path) -> ContextManager[Trace]: ...
# gitm.planner
def predict_graph(model: ModelSpec, hw: HardwareSpec, batch: BatchConfig) -> Graph: ...
# gitm.optimizer.monitor
def residuals(trace: Trace, graph: Graph) -> Residuals: ...
def check_invariants(residuals, invariants) -> list[Violation]: ...
# gitm.optimizer.attribution
def attribute(residuals: Residuals, graph: Graph) -> RankedHypotheses: ...
# gitm.optimizer.report
def write(claims: list[Claim], provenance: Provenance) -> str: ...
Project details
Download files
File details
Details for the file gitm_labs-0.0.2.tar.gz.
File metadata
- Download URL: gitm_labs-0.0.2.tar.gz
- Upload date:
- Size: 268.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e073ed3b03eb326cf89f5cb8342818e69cebcd6e53c9c5519547e1b90ab96bbc |
| MD5 | 9509c28856b14c154b0e24ce908c771c |
| BLAKE2b-256 | 1ef45ac0d015d0143a3d9e735a03e158161b474c327c9baf3e2bf4edae009270 |
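A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a generic sketch using `hashlib`; the helper names `sha256_of` and `matches_digest` are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA256 of a file, read in chunks so large files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """True when the file's SHA256 equals the published digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest(Path("gitm_labs-0.0.2.tar.gz"), "e073ed3b...")` with the full digest from the table should return True for an intact download.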
Provenance
The following attestation bundles were made for gitm_labs-0.0.2.tar.gz:
Publisher: workflow.yml on GitM-Labs/runtime
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gitm_labs-0.0.2.tar.gz
- Subject digest: e073ed3b03eb326cf89f5cb8342818e69cebcd6e53c9c5519547e1b90ab96bbc
- Sigstore transparency entry: 1785873048
- Sigstore integration time:
- Permalink: GitM-Labs/runtime@f7d526941fbb9615cd54c1980cc503cf1d3aa057
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/GitM-Labs
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@f7d526941fbb9615cd54c1980cc503cf1d3aa057
- Trigger Event: release
File details
Details for the file gitm_labs-0.0.2-py3-none-any.whl.
File metadata
- Download URL: gitm_labs-0.0.2-py3-none-any.whl
- Upload date:
- Size: 98.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 41f05c91bf7b05ba1d190eb7d6087f6246cf611dc1ad635f3583feb96d4bc1ec |
| MD5 | 2fdad63bf45f2a94113a4acc5b86bacb |
| BLAKE2b-256 | b93fe83eb854518ba1d59b8035a20cd9f691148b9fe1ba96281e181e61a57920 |
Provenance
The following attestation bundles were made for gitm_labs-0.0.2-py3-none-any.whl:
Publisher: workflow.yml on GitM-Labs/runtime
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gitm_labs-0.0.2-py3-none-any.whl
- Subject digest: 41f05c91bf7b05ba1d190eb7d6087f6246cf611dc1ad635f3583feb96d4bc1ec
- Sigstore transparency entry: 1785873164
- Sigstore integration time:
- Permalink: GitM-Labs/runtime@f7d526941fbb9615cd54c1980cc503cf1d3aa057
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/GitM-Labs
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@f7d526941fbb9615cd54c1980cc503cf1d3aa057
- Trigger Event: release