Behavioral compiler and intervention runtime for vLLM decode workloads.

runtime

pip install -e .
gitm run --workload vllm-decode --budget 24h --target 15%

Embedded:

from gitm import optimize
optimize(engine, budget="24h", target=0.15)
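
The CLI takes the budget and target as strings ("24h", "15%"), while the embedded call takes target as a fraction (0.15). A minimal sketch of how the two forms could be related is below. The helper names parse_budget_hours and parse_target are hypothetical and not part of gitm, and only the hour suffix shown above is assumed to be supported.

```python
import re


def parse_budget_hours(text: str) -> float:
    """Convert a budget string such as "24h" into a number of hours.

    Hypothetical helper: only the "<number>h" form seen in the README is handled.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)h", text.strip())
    if match is None:
        raise ValueError(f"unrecognized budget: {text!r}")
    return float(match.group(1))


def parse_target(text: str) -> float:
    """Convert a percentage string such as "15%" into a fraction such as 0.15."""
    stripped = text.strip()
    if not stripped.endswith("%"):
        raise ValueError(f"unrecognized target: {text!r}")
    return float(stripped[:-1]) / 100.0
```

Under these assumptions, the CLI arguments `--budget 24h --target 15%` would correspond to 24.0 hours and a 0.15 target in the embedded call.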

Data layout

Two roots — set both before running anything:

export GITM_S3_ROOT="s3://gitm-data/prod"    # canonical store (datasets + archives)
export GITM_SCRATCH="/mnt/nvme/gitm"         # local ephemeral run dir (defaults to ~/.cache/gitm)

Canonical layout under $GITM_S3_ROOT (S3):

datasets/{hft,biotech,edge}/    # benchmark inputs (immutable, sha256-pinned)
runs/                            # durable baseline + pilot outputs
traces/                          # captured event-telemetry traces
telemetry/                       # state-telemetry samples (1Hz GPU state)

Local layout under $GITM_SCRATCH (ephemeral, synced to S3 after a run):

staging/    # datasets staged in from S3 for the active run, then evicted
runs/       # this run's outputs (small) before archival
traces/     # this run's trace before archival
telemetry/  # this run's samples before archival
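
As an illustration of the two-root layout, the sketch below resolves both environment variables and derives the local scratch subdirectories. The function names resolve_roots and local_dirs are hypothetical, and treating an unset GITM_S3_ROOT as an error is an assumption; the ~/.cache/gitm default for scratch comes from the comment above.

```python
import os
from pathlib import Path

# Local subdirectories listed under $GITM_SCRATCH in the README.
LOCAL_SUBDIRS = ("staging", "runs", "traces", "telemetry")


def resolve_roots(env=None):
    """Return (s3_root, scratch_dir) from the environment.

    Assumption: GITM_S3_ROOT is mandatory, GITM_SCRATCH falls back to ~/.cache/gitm.
    """
    env = os.environ if env is None else env
    s3_root = env.get("GITM_S3_ROOT")
    if not s3_root:
        raise RuntimeError("GITM_S3_ROOT is not set")
    scratch = Path(env.get("GITM_SCRATCH") or Path.home() / ".cache" / "gitm")
    return s3_root.rstrip("/"), scratch


def local_dirs(scratch: Path) -> dict:
    """Map each local subdirectory name to its path under the scratch root."""
    return {name: scratch / name for name in LOCAL_SUBDIRS}
```

Passing an explicit mapping instead of reading os.environ directly keeps the sketch easy to exercise without touching the real environment.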

Architecture

See docs/ARCHITECTURE.md. The runtime is structured as seven subpackages mirroring the data flow:

gitm/
  telemetry/   # state telemetry: NVML / ROCm SMI, 1Hz GPU state samples
  tracer/      # event telemetry: CUPTI / rocprof per-kernel records
  planner/     # behavioral compiler: predicted execution graph (roofline)
  optimizer/   # deviation monitor, attribution, replay, qualification, report
  kernels/     # curated intervention library (the levers)
  scheduler/   # 24-hour loop phase orchestration
  agents/      # autonomous decision policy (intervention selection)

Invariants

The deviation monitor checks observed-vs-predicted against three invariants:

Kernel-time invariant — per-kernel duration must lie within roofline.
Memory-traffic invariant — per-kernel bytes-moved must match predicted.
Stream-concurrency invariant — predicted-concurrent kernels must overlap.

See docs/invariants.md.
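
To make the three invariants concrete, here is a rough sketch of what each check could look like. The Kernel and Prediction types, the field names, and the tolerance values are all assumptions made for illustration; the actual definitions live in docs/invariants.md and may differ. In particular, "within roofline" is read here as "observed duration does not exceed the roofline-predicted duration by more than a tolerance".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    """Observed per-kernel record (hypothetical shape)."""
    name: str
    start_us: float
    end_us: float
    bytes_moved: int


@dataclass(frozen=True)
class Prediction:
    """Predicted values for one kernel (hypothetical shape)."""
    roofline_us: float
    bytes_moved: int


def kernel_time_ok(obs: Kernel, pred: Prediction, tol: float = 0.10) -> bool:
    """Kernel-time: observed duration stays within the roofline bound plus tolerance."""
    return (obs.end_us - obs.start_us) <= pred.roofline_us * (1.0 + tol)


def memory_traffic_ok(obs: Kernel, pred: Prediction, tol: float = 0.05) -> bool:
    """Memory-traffic: observed bytes moved match the prediction within tolerance."""
    return abs(obs.bytes_moved - pred.bytes_moved) <= tol * pred.bytes_moved


def overlaps(a: Kernel, b: Kernel) -> bool:
    """Stream-concurrency: two predicted-concurrent kernels share some time span."""
    return a.start_us < b.end_us and b.start_us < a.end_us
```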

The 24-hour loop

Phase	Hours	Module
1. Capture trace, fingerprint workload, predict graph	0–2	`tracer`, `telemetry`, `planner`
2. Compute residuals + causal attribution	2–6	`optimizer.monitor`, `optimizer.attribution`
3. Query library, rank via counterfactual replay	6–12	`kernels`, `optimizer.replay`
4. Apply top-N interventions with rollback gates	12–20	`agents`, `optimizer`
5. Stabilize, write provenance report	20–24	`optimizer.report`
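
The schedule above can be written down as a small lookup table. The sketch below is only an illustration of that table, not the gitm.scheduler implementation; the PHASES tuple and the phase_at function are hypothetical names, and treating each window as half-open (start included, end excluded) is an assumption.

```python
# (phase number, start hour, end hour, modules), copied from the table above.
PHASES = (
    (1, 0, 2, ("tracer", "telemetry", "planner")),
    (2, 2, 6, ("optimizer.monitor", "optimizer.attribution")),
    (3, 6, 12, ("kernels", "optimizer.replay")),
    (4, 12, 20, ("agents", "optimizer")),
    (5, 20, 24, ("optimizer.report",)),
)


def phase_at(elapsed_hours: float):
    """Return (phase number, modules) for a point in the 24-hour budget.

    Assumption: windows are half-open, so hour 2.0 already belongs to phase 2.
    """
    for number, start, end, modules in PHASES:
        if start <= elapsed_hours < end:
            return number, modules
    raise ValueError(f"outside the 24-hour loop: {elapsed_hours}")
```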

Architecture

GITM separates the empirical half (what happened) from the predicted half (what should have happened). Everything downstream operates on residuals — the difference between the two.
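
As a minimal sketch of the residual idea, assuming per-kernel durations keyed by kernel name: the residual is the observed value minus the predicted one. The function name duration_residuals and the sign convention (positive means slower than predicted) are assumptions for illustration, not the gitm.optimizer.monitor API.

```python
def duration_residuals(observed_us: dict, predicted_us: dict) -> dict:
    """Per-kernel residual: observed duration minus predicted duration.

    Kernels that appear in only one of the two inputs are skipped here;
    a real monitor would presumably report them separately.
    """
    return {
        name: observed_us[name] - predicted_us[name]
        for name in predicted_us
        if name in observed_us
    }
```

For example, an observed 120 µs against a predicted 100 µs gives a residual of 20 µs for that kernel, and a kernel that matches its prediction gives 0.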

Two telemetry planes

GITM never conflates these.

State telemetry (`gitm.telemetry`)

Point-in-time samples of GPU state at ~1 Hz:

Utilization, memory used, power, clocks, temperature
Throttle reasons (canonical bitmask across vendors)
NVLink throughput, ECC counters

Source: NVML on NVIDIA, ROCm SMI on AMD. Cost: ~microseconds per sample. Shape: summary, not trace.
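
One way to picture a canonical throttle bitmask is a single flag type that each vendor's raw mask is translated into. The sketch below uses enum.IntFlag from the standard library. The Throttle members, the to_canonical function, and both vendor tables are hypothetical: the bit values are placeholders chosen for the example and are not the real NVML or ROCm SMI constants.

```python
from enum import IntFlag


class Throttle(IntFlag):
    """Hypothetical vendor-neutral throttle reasons."""
    NONE = 0
    POWER_CAP = 1
    THERMAL = 2
    HW_SLOWDOWN = 4


def to_canonical(vendor_mask: int, table: dict) -> Throttle:
    """Translate a vendor-specific bitmask into the canonical one via a lookup table."""
    result = Throttle.NONE
    for vendor_bit, canonical in table.items():
        if vendor_mask & vendor_bit:
            result |= canonical
    return result


# Placeholder tables: several vendor bits may fold into one canonical reason.
VENDOR_A = {0x04: Throttle.POWER_CAP, 0x20: Throttle.THERMAL, 0x40: Throttle.THERMAL}
VENDOR_B = {0x01: Throttle.THERMAL, 0x02: Throttle.POWER_CAP}
```

With tables like these, downstream code compares Throttle values only and never needs to know which backend produced the sample.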

Event telemetry (`gitm.tracer`)

Per-kernel activity records with start/end timestamps, stream IDs, memory transfer events.

Source: CUPTI on NVIDIA, rocprof on AMD. Cost: per-kernel callbacks. Shape: structurally a trace — required for the kernel-time invariant.

Components

Module responsibilities

Module	Responsibility
`gitm.telemetry`	Vendor-backend autodiscovery, NVML/ROCm SMI samples, pluggable sinks
`gitm.tracer`	Event-telemetry capture (CUPTI/rocprof), trace schema, context manager
`gitm.planner`	Behavioral Compiler — roofline-based predicted execution graph
`gitm.optimizer.monitor`	Deviation monitor — residuals against 3 invariants
`gitm.optimizer.attribution`	Granger + doubly-robust on residual subgraph
`gitm.optimizer.replay`	Counterfactual replay for predicted intervention delta
`gitm.optimizer.qualification`	Workload fingerprint gate (commit / diagnose)
`gitm.optimizer.report`	Provenance chain renderer (claim → evidence → intervention → delta)
`gitm.kernels`	Curated intervention library — 15–20 levers with applicability + safety
`gitm.agents`	Autonomous policy — selects interventions, drives rollback
`gitm.scheduler`	24-hour loop phase orchestration

Interfaces are contracts

The five primary interfaces below are the load-bearing contracts. W2 swarm extends behind these without rewriting upstream code.

# gitm.tracer (used as: with gitm.tracer.capture(path) as trace)
def capture(out_path: Path) -> ContextManager[Trace]: ...

# gitm.planner
def predict_graph(model: ModelSpec, hw: HardwareSpec, batch: BatchConfig) -> Graph: ...

# gitm.optimizer.monitor
def residuals(trace: Trace, graph: Graph) -> Residuals: ...
def check_invariants(residuals: Residuals, invariants) -> list[Violation]: ...

# gitm.optimizer.attribution
def attribute(residuals: Residuals, graph: Graph) -> RankedHypotheses: ...

# gitm.optimizer.report
def write(claims: list[Claim], provenance: Provenance) -> str: ...
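
To illustrate the context-manager shape of the tracer contract, here is a small stand-in built on contextlib. It is a sketch only: the Trace class below is a toy container, the JSON output format is an assumption, and no CUPTI or rocprof capture takes place. The point is that the caller receives a Trace inside the with block and the output is written when the block exits.

```python
import json
import tempfile
from contextlib import contextmanager
from dataclasses import dataclass, field
from pathlib import Path
from typing import Iterator


@dataclass
class Trace:
    """Toy stand-in for the real trace type: a list of per-kernel records."""
    records: list = field(default_factory=list)


@contextmanager
def capture(out_path: Path) -> Iterator[Trace]:
    """Yield a Trace for the duration of the block, then persist it to out_path."""
    trace = Trace()
    try:
        yield trace
    finally:
        # Written even if the block raises, so partial traces are not lost.
        out_path.write_text(json.dumps(trace.records))


# Usage: records appended inside the block end up in the output file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "trace.json"
    with capture(path) as trace:
        trace.records.append({"kernel": "gemm", "stream": 0, "start_us": 0.0, "end_us": 12.5})
    saved = json.loads(path.read_text())
```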

Onboarding

This document is load-bearing for Day 6 — the six benchmark interns rotate onto the skeleton using these steps. Every command here is expected to work on a clean checkout.

1. Environment

git clone git@github.com:gitm-labs/runtime.git
cd runtime
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

NVIDIA box additionally:

pip install -e ".[nvidia]"

Point at the canonical S3 store and a local scratch dir (see Data layout: datasets live in S3 and are only staged to local disk for the active run, then evicted):

export GITM_S3_ROOT="s3://gitm-data/prod"    # canonical store
export GITM_SCRATCH="/mnt/nvme/gitm"         # local scratch (defaults to ~/.cache/gitm)

gitm doctor reports both, plus discovered GPUs. Scratch subdirs are created on first run.

2. Smoke test

gitm --help
gitm run --help
pytest -q

All three should pass on a clean checkout.

3. The 24-hour loop

The CLI entry point composes seven subpackages in order. Read the source in this order; it mirrors the data flow:

gitm/telemetry — state telemetry (1 Hz GPU samples)
gitm/tracer — event telemetry (per-kernel records)
gitm/planner — Behavioral Compiler (predicted graph)
gitm/optimizer — monitor, attribution, replay, report
gitm/kernels — intervention library
gitm/agents — selection policy
gitm/scheduler — phase orchestration

4. Where things live

Concern	Path
Code	`gitm/`
Tests	`tests/`
Docs	`docs/`
Datasets, traces, runs	`$GITM_S3_ROOT/` (S3, canonical) · `$GITM_SCRATCH/` (local, ephemeral)
Intervention library	`gitm/kernels/library.yaml`
Report template	`gitm/optimizer/templates/report.md.j2`
Trace schema	`gitm/tracer/schema.py` (pydantic)
Telemetry schema	`gitm/telemetry/schema.py` (pydantic)

5. Contributing a new component

Every new module hangs off one of the seven subpackages and exposes its public surface through __init__.py. The five primary interfaces in ARCHITECTURE.md are contracts: extend behind them, and do not change them without Adit's sign-off.

Project details

Release history

0.0.2, Jun 11, 2026

0.0.1, Jun 10, 2026 (this version)

Download files

Source Distribution

gitm_labs-0.0.1.tar.gz (263.5 kB view details)

Uploaded Jun 10, 2026 Source

Built Distribution

gitm_labs-0.0.1-py3-none-any.whl (96.5 kB view details)

Uploaded Jun 10, 2026 Python 3

File details

Details for the file gitm_labs-0.0.1.tar.gz.

File metadata

Download URL: gitm_labs-0.0.1.tar.gz
Upload date: Jun 10, 2026
Size: 263.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gitm_labs-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`3b6e1e6f8a59dc38e875869e36ac24b223590f8bf19e296a183c6a4ea7931804`
MD5	`8e909c1c1fe51819acb3e65d3f1ed575`
BLAKE2b-256	`61014d0700d9e1fefc1389cac5450aa2232f91857773507486661f46e0634084`

Provenance

The following attestation bundles were made for gitm_labs-0.0.1.tar.gz:

Publisher: workflow.yml on GitM-Labs/runtime

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gitm_labs-0.0.1.tar.gz
- Subject digest: 3b6e1e6f8a59dc38e875869e36ac24b223590f8bf19e296a183c6a4ea7931804
- Sigstore transparency entry: 1783561297
- Sigstore integration time: Jun 10, 2026
Source repository:
- Permalink: GitM-Labs/runtime@54b2f7cc8c2a8e4dca28379504be64d047b421df
- Branch / Tag: refs/heads/main
- Owner: https://github.com/GitM-Labs
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@54b2f7cc8c2a8e4dca28379504be64d047b421df
- Trigger Event: workflow_dispatch

File details

Details for the file gitm_labs-0.0.1-py3-none-any.whl.

File metadata

Download URL: gitm_labs-0.0.1-py3-none-any.whl
Upload date: Jun 10, 2026
Size: 96.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gitm_labs-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ed825e770c38d2f6171401f15032dcc94cf8d1bed6f4816464c07a5d7e2d22b3`
MD5	`1f8c83f6ee2ae912cbe7458a176409c8`
BLAKE2b-256	`321b75ad3b97a77d3ab79e18c8abe9fbf5cb514bbe5e8c109f5b4086f3d9bf57`

Provenance

The following attestation bundles were made for gitm_labs-0.0.1-py3-none-any.whl:

Publisher: workflow.yml on GitM-Labs/runtime

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gitm_labs-0.0.1-py3-none-any.whl
- Subject digest: ed825e770c38d2f6171401f15032dcc94cf8d1bed6f4816464c07a5d7e2d22b3
- Sigstore transparency entry: 1783561613
- Sigstore integration time: Jun 10, 2026
Source repository:
- Permalink: GitM-Labs/runtime@54b2f7cc8c2a8e4dca28379504be64d047b421df
- Branch / Tag: refs/heads/main
- Owner: https://github.com/GitM-Labs
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@54b2f7cc8c2a8e4dca28379504be64d047b421df
- Trigger Event: workflow_dispatch

gitm-labs 0.0.1
