Scientific instrumentation for LLM inference memory trace collection and MRM research
mrm-trace
A Python research package for collecting, parsing, labelling, and analysing LLM inference memory access traces. Designed as scientific instrumentation for Managed-Retention Memory (MRM) research - it characterises how model weights, KV cache, activations, and runtime allocations are actually accessed during inference.
Primary metrics: retention duration · write-once ratio · read frequency · working set size
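To make two of these metrics concrete, here is a minimal sketch on a toy trace. The `(timestamp_s, address, op)` row layout and all function names are assumptions for illustration; they are not mrm-trace's schema or API, and the package's own definitions may differ.

```python
from collections import Counter

# Hypothetical toy trace: (timestamp_s, address, op) with op in {"r", "w"}.
trace = [
    (0.0, 0x1000, "w"),
    (0.5, 0x1000, "r"),
    (1.0, 0x2000, "w"),
    (1.5, 0x2000, "w"),
    (2.0, 0x1000, "r"),
    (2.5, 0x3000, "w"),
]


def write_once_ratio(rows):
    """Fraction of written addresses that were written exactly once."""
    writes = Counter(addr for _, addr, op in rows if op == "w")
    if not writes:
        return 0.0
    return sum(1 for n in writes.values() if n == 1) / len(writes)


def reads_per_address(rows):
    """Read count per address (addresses never read are omitted)."""
    return dict(Counter(addr for _, addr, op in rows if op == "r"))


def working_set_size(rows, start_s, end_s):
    """Number of distinct addresses touched in the half-open window [start_s, end_s)."""
    return len({addr for t, addr, _ in rows if start_s <= t < end_s})
```

In this toy trace two of the three written addresses are written once, so the write-once ratio is 2/3.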
Install from PyPI
pip install mrm-trace
Linux (or WSL2) is required for perf mem collection. See Requirements below.
Requirements
| Requirement | Notes |
|---|---|
| Linux (WSL2 supported) | perf mem requires Linux PMU; WSL2 works |
| Python ≥ 3.11 | Tested on 3.11 and 3.12 |
| sudo / CAP_PERFMON | Required for perf mem collection |
Install
# Clone and set up a virtual environment
git clone https://github.com/DhiSys-AI/MRM-Trace
cd MRM-Trace
python -m venv venv
source venv/bin/activate # Windows WSL: same command
# Install package + test dependencies
pip install -e ".[test]"
# Optional: install matplotlib/seaborn for figures
pip install -e ".[test,plots]"
Quick start
# Validate a config file
mrm-trace validate --config config/default_experiment.yaml
# Preview what a run would do (dry run)
mrm-trace plan --config config/default_experiment.yaml
# Run a full experiment (requires model files + sudo for perf)
mrm-trace run --config config/default_experiment.yaml
Running tests
# Every commit - fast, no I/O
pytest -m unit
# Pre-merge - includes integration tests
pytest -m "unit or integration"
# Before dataset release - scientific correctness checks
pytest -m validity
# Property-based invariant tests (Hypothesis)
pytest tests/property/
# Performance benchmarks (excluded from default run)
pytest -m performance --benchmark-only
# Full suite (excludes slow + performance)
pytest
The test suite has three tiers:
| Tier | Marker | Purpose |
|---|---|---|
| 1 | unit | Individual functions behave correctly |
| 2 | integration | Components work together |
| 3 | validity | Measurements are scientifically sound |
Tier-3 validity tests are the most important: they verify that known synthetic inputs produce
known metric outputs (e.g. a 30s weight retention window must yield retention_p99_s ≈ 30.0).
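The idea behind a validity test can be sketched as follows. This is not the project's test code: the retention definition (first write to last access, per address), the nearest-rank percentile, and every name here are assumptions chosen to keep the example self-contained.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def retention_durations(rows):
    """Seconds between the first write and the last access of each written address.

    Assumes rows are (timestamp_s, address, op) tuples in time order.
    """
    first_write, last_access = {}, {}
    for t, addr, op in rows:
        if op == "w":
            first_write.setdefault(addr, t)
        last_access[addr] = max(t, last_access.get(addr, t))
    return [last_access[a] - first_write[a] for a in first_write]


def make_synthetic_trace(n_addresses=100, window_s=30.0):
    """Each address is written at t=0 and read once at t=window_s."""
    rows = []
    for i in range(n_addresses):
        addr = 0x1000 + 64 * i
        rows.append((0.0, addr, "w"))
        rows.append((window_s, addr, "r"))
    return rows


def test_retention_p99_matches_known_window():
    durations = retention_durations(make_synthetic_trace(window_s=30.0))
    assert math.isclose(percentile(durations, 99), 30.0, abs_tol=1e-9)
```

Because the input is constructed, the expected value is known in advance, so a failure points at the metric code rather than at the data.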
Output layout
Each run writes to results/<model_id>/<run_id>/:
results/llama-7b/run_20240101_120000/
├── trace.parquet ← labelled memory access trace
├── region_map.parquet ← one row per region (weight, kv_cache, …)
├── kv_block_lifecycle.parquet ← per-block write / read / eviction timestamps
├── metrics.csv ← per-region-type summary (human-readable)
├── metadata.json ← hardware, software, observer effect, run validity
├── manifest.json ← SHA-256 checksums for all files
└── raw/
    ├── perf.data
    ├── perf_script.txt
    └── memray.bin
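A downstream consumer may want to re-check the SHA-256 checksums before using a run. The sketch below assumes the checksums have already been loaded into a flat `{relative_path: hex_digest}` mapping; the actual layout of manifest.json is not documented here, so `verify_manifest` and that mapping shape are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large traces are never read into RAM at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(run_dir, checksums):
    """Return the relative paths that are missing or whose digest does not match.

    `checksums` is an assumed {relative_path: sha256_hex} mapping.
    """
    run_dir = Path(run_dir)
    bad = []
    for rel, expected in checksums.items():
        path = run_dir / rel
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(rel)
    return bad
```

An empty return value means every listed file is present and unmodified.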
Run validity classification
Every run is automatically classified based on observer overhead:
| Class | Criteria |
|---|---|
| clean | observer CPU < 10 %, observer mem < 5 % of target RSS, no throttle, baseline CPU < 15 % |
| marginal | observer CPU < 20 %, observer mem < 15 % of target RSS, ≤ 2 throttle events |
| contaminated | anything worse than marginal |
Contaminated runs are archived but excluded from aggregated metrics and paper figures.
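The thresholds in the table translate into a small decision function. This is a sketch of the rule as written above, not the package's classifier; the function and parameter names are hypothetical, and it assumes the marginal class places no condition on baseline CPU because the table lists none.

```python
def classify_run(observer_cpu_pct, observer_mem_pct_of_rss,
                 throttle_events, baseline_cpu_pct):
    """Map observer-overhead measurements to a validity class."""
    if (observer_cpu_pct < 10
            and observer_mem_pct_of_rss < 5
            and throttle_events == 0
            and baseline_cpu_pct < 15):
        return "clean"
    # Assumption: no baseline CPU criterion for marginal (none is listed).
    if (observer_cpu_pct < 20
            and observer_mem_pct_of_rss < 15
            and throttle_events <= 2):
        return "marginal"
    return "contaminated"
```

The checks run from strictest to loosest, so a run that satisfies the clean criteria is never reported as marginal.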
Architecture
mrm_trace/
├── cli.py CLI (typer)
├── api.py Python API (Experiment class)
├── schema_version.py Schema version registry and compatibility checking
├── engines/ llama.cpp / vLLM wrappers
├── collector/ perf mem / memray / process_monitor
├── parser/ perf script + memray parsers → trace.parquet
├── labeller/ symbol + address-range region classification
├── analyser/ retention / write-once / read-freq / working-set / IAI / suitability
├── telemetry/ baseline capture / thermal / observer effect / validity classifier
├── reporter/ CSV + Parquet export / figures / manifest / RunExporter
└── utils/ logging / IDs / file helpers
Key design decisions:
- Streaming parser - generators throughout; never loads full trace into RAM (ADR-2)
- Phase-aware tracing - weight_load/generation/teardown phases distinguish weight from KV (ADR-3)
- Observer effect as mandatory output - every run records overhead and validity class (ADR-4)
- Parquet + zstd - column-oriented, ~3× better compression than gzip (ADR-8)
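The streaming design can be illustrated with plain generators. The three-column `timestamp address op` line format below is invented for the example and is not perf script output; `Access`, `parse_lines`, and `count_writes` are likewise hypothetical names.

```python
from typing import Iterable, Iterator, NamedTuple


class Access(NamedTuple):
    timestamp_s: float
    address: int
    op: str


def parse_lines(lines: Iterable[str]) -> Iterator[Access]:
    """Lazily yield one Access per well-formed line; nothing is buffered."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        ts, addr, op = parts
        yield Access(float(ts), int(addr, 16), op)


def count_writes(accesses: Iterable[Access]) -> int:
    """Consume the stream once, holding only a running count."""
    return sum(1 for a in accesses if a.op == "w")
```

Passing an open file object as `lines` keeps memory use roughly constant regardless of trace size, since each stage pulls one row at a time from the stage before it.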
MRM suitability labels
| Label | Criteria |
|---|---|
| high_mrm | write-once ratio ≥ 0.8 and retention p99 ≥ 10 s |
| medium_mrm | write-once ratio ≥ 0.5 and retention p50 ≥ 1 s |
| low_mrm | everything else |
In practice: model weights → high_mrm, short-lived KV blocks → low_mrm.
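The labelling rule is easy to restate as code. This is a sketch of the table above with hypothetical names, not the function the package exposes.

```python
def suitability_label(write_once_ratio, retention_p50_s, retention_p99_s):
    """Apply the thresholds from the suitability table, strictest first."""
    if write_once_ratio >= 0.8 and retention_p99_s >= 10.0:
        return "high_mrm"
    if write_once_ratio >= 0.5 and retention_p50_s >= 1.0:
        return "medium_mrm"
    return "low_mrm"
```

Note that the two tiers test different percentiles: high_mrm looks at the p99 retention, medium_mrm at the p50.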
Schema versioning
All output files carry a mrm_trace_schema_version in their Parquet metadata.
The version registry is in mrm_trace/schema_version.py. Readers validate
major-version compatibility on load; a major bump is a breaking change.
from mrm_trace.schema_version import check_parquet_schema
check_parquet_schema("results/.../trace.parquet", "trace") # raises on incompatibility
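The major-version rule itself can be sketched independently of Parquet. The version string format ("MAJOR.MINOR.PATCH") and both function names below are assumptions; `check_parquet_schema` may implement the check differently.

```python
def major_of(version: str) -> int:
    """Extract the major component from an assumed 'MAJOR.MINOR.PATCH' string."""
    return int(version.split(".", 1)[0])


def require_compatible(file_version: str, reader_version: str) -> None:
    """Raise ValueError when the major versions differ; minor/patch drift is allowed."""
    if major_of(file_version) != major_of(reader_version):
        raise ValueError(
            f"schema {file_version} is incompatible with reader {reader_version}"
        )
```

Under this rule a reader at 1.4.0 accepts a file written at 1.0.0 but rejects one written at 2.0.0.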
Python API
from mrm_trace.labeller import TraceLabeller
from mrm_trace.analyser import compute_all
from mrm_trace.reporter import RunExporter
# Label a stream of raw trace rows
labeller = TraceLabeller()
labelled = list(labeller.label(raw_rows))
region_map = labeller.region_map() # call after consuming label()
kv_lifecycle = labeller.kv_lifecycle()
# Analyse
import pandas as pd
trace = pd.DataFrame(labelled)
results = compute_all(trace)
# results keys: retention_per_region, retention_summary, write_once,
# read_freq, working_set_per_region, working_set_summary,
# locality_per_region, locality_summary, iai, suitability
# Export a publication-ready run directory
exporter = RunExporter("results/llama-7b/run_001")
exporter.export(trace, region_map, kv_lifecycle, results,
                metadata={"run_id": "run_001"}, run_id="run_001")
Collector hierarchy
- perf mem - primary; requires Linux PMU + root/sudo; WSL2 supported
- memray - fallback; Python-level allocations; no root needed
- process_monitor - always runs in parallel as coarse baseline (psutil)
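One way to picture the fallback logic is a small selection function. This is a sketch only: the names are hypothetical, the privilege check covers root but not CAP_PERFMON, and the package's actual selection logic is not shown in this README.

```python
import os
import shutil
import sys


def choose_collectors(is_linux, perf_available, has_privilege):
    """Pick the primary collector, then always append the coarse baseline."""
    use_perf = is_linux and perf_available and has_privilege
    primary = "perf_mem" if use_perf else "memray"
    return [primary, "process_monitor"]


def detect_collectors():
    """Probe the current host. Only checks for root, not CAP_PERFMON (simplification)."""
    is_linux = sys.platform.startswith("linux")
    perf_available = shutil.which("perf") is not None
    has_privilege = hasattr(os, "geteuid") and os.geteuid() == 0
    return choose_collectors(is_linux, perf_available, has_privilege)
```

Keeping the decision in a pure function makes it testable without a Linux host or elevated privileges.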