Edit‑agnostic robustness evaluation reports for weight edits (InvarLock framework)
Edit‑agnostic robustness reports for weight edits
Catch silent quality regressions from quantization, pruning, and weight edits before they ship.
Quantizing, pruning, or otherwise editing a model’s weights can silently degrade quality.
InvarLock compares an edited subject checkpoint against a fixed baseline with paired
evaluation windows, enforces the canonical guard chain (invariants → spectral → RMT
→ variance → invariants), and produces a machine-readable evaluation report you can gate
in CI.
Why InvarLock?
- Quality gates for edited checkpoints: catch regressions before deployment.
- Paired statistical evidence: primary metrics with confidence intervals.
- Auditable evidence: deterministic pairing metadata + policy digests in evaluation.report.json.
- CI/CD-friendly: stable exit codes, --json outputs, and portable “evidence packs”.
- Offline-first: network is disabled by default; enable downloads per command.
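The CI/CD point above can be sketched as a small gate function. This is a minimal sketch, not an official integration: it assumes `invarlock verify` exits with 0 on success and non-zero on failure (the README promises stable exit codes but does not list them here), and `gate_report` and the `INVARLOCK_BIN` override are hypothetical names introduced for illustration.

```shell
# Hypothetical CI gate around `invarlock verify --json <report>`.
# Assumption: exit status 0 means the report verified; anything else fails the gate.
# INVARLOCK_BIN is an illustrative override (handy for dry runs), not an InvarLock setting.
INVARLOCK_BIN="${INVARLOCK_BIN:-invarlock}"

gate_report() {
  local report="$1"
  if [ ! -f "$report" ]; then
    echo "gate: report not found: $report" >&2
    return 2
  fi
  # JSON from verify stays on stdout; gate messages go to stderr.
  if "$INVARLOCK_BIN" verify --json "$report"; then
    echo "gate: PASS" >&2
    return 0
  fi
  echo "gate: FAIL" >&2
  return 1
}
```

A CI job could call `gate_report reports/eval/evaluation.report.json` as its final step, so that the job status follows the function's return code.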
Who is this for?
- ML engineers shipping edited model checkpoints, including quantized, pruned, fine-tuned, or otherwise weight-modified variants.
- MLOps and platform teams building CI gates, runtime-provenance verification, and reviewable evaluation artifacts.
- Researchers validating weight-edit, compression, and model-comparison methods with reproducible paired evaluation across text and image-text workflows supported here.
How it works
┌───────────────────────┐     ┌────────────────────────────────────────────┐
│ Baseline (checkpoint) │────►│                                            │
└───────────────────────┘     │  invarlock evaluate                        │
                              │  ├─► Paired windows (deterministic)        │
┌───────────────────────┐     │  ├─► GuardChain pipeline                   │
│ Subject (checkpoint)  │────►│  │    └─► invariants → spectral → RMT → VE │
└───────────────────────┘     │  └─► Emit: evaluation.report.json          │
                              │                                            │
                              └────────────────────────────────────────────┘
                                                    │
                                    ┌───────────────┴───────────────┐
                                    ▼                               ▼
                                 ✅ PASS                         ❌ FAIL
                                 (ship)                        (rollback)
Quick start
The public front door is evaluate -> verify -> report html, but the repo now
splits onboarding by user type:
- Wheel user / reviewer: install invarlock, inspect an existing evaluation.report.json, and render HTML without cloning the repository.
- Evaluator: install invarlock[hf] when you want evaluate to load Hugging Face models and emit a fresh evaluation bundle.
- Repo maintainer: clone the repo and build the local runtime image when you need maintainer smokes, repo presets, or local container-image iteration.
The default evaluate path runs model-loading commands inside the runtime
container and expects an OCI engine such as podman or docker. Host-side
workflows can opt into --execution-mode host, but the default verification
path below expects a container-backed report with sibling runtime provenance.
# Evaluator path: create a fresh bundle
pip install "invarlock[hf]"
invarlock --version
# Compare baseline vs subject (downloads require explicit network enable)
invarlock evaluate --allow-network \
--baseline gpt2 \
--subject distilgpt2 \
--adapter auto \
--profile ci \
--report-out reports/eval \
--quiet
# Validate the container-backed evaluation report
test -f reports/eval/runtime.manifest.json
invarlock verify --json reports/eval/evaluation.report.json
# Render HTML for sharing
invarlock report html -i reports/eval/evaluation.report.json -o reports/eval/evaluation.html
Wheel-only review path:
pip install invarlock, invarlock doctor,
invarlock verify /path/to/evaluation.report.json, and
invarlock report html -i /path/to/evaluation.report.json -o /path/to/evaluation.html.
Repo maintainers can build the local runtime image once with make runtime-image;
InvarLock automatically prefers invarlock-runtime:local when it is present.
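The wheel-only review path can be chained so that a failing step stops the rest. The following is a minimal sketch using only the commands listed above; `review_report` and `INVARLOCK_BIN` are hypothetical names, and it assumes each command signals failure with a non-zero exit status.

```shell
# Hypothetical wrapper for the wheel-only review path: doctor, verify, then render HTML.
# Assumption: each command returns non-zero on failure, so `&&` stops at the first problem.
# INVARLOCK_BIN is an illustrative override, not an InvarLock setting.
INVARLOCK_BIN="${INVARLOCK_BIN:-invarlock}"

review_report() {
  local report="$1" html_out="$2"
  "$INVARLOCK_BIN" doctor &&
    "$INVARLOCK_BIN" verify "$report" &&
    "$INVARLOCK_BIN" report html -i "$report" -o "$html_out"
}
```

For example, `review_report /path/to/evaluation.report.json /path/to/evaluation.html` would only render HTML once the environment check and verification have both succeeded.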
Artifact model:
| Artifact | Produced by | Primary consumers |
|---|---|---|
| evaluation.report.json | invarlock evaluate, invarlock report generate --format report | invarlock verify, invarlock report html, invarlock report validate, invarlock report explain --evaluation-report, invarlock advanced runtime-verify |
| report.json | Baseline/subject run directories under runs/... | invarlock report generate, invarlock report explain --subject-report ... --baseline-report ... |
Example output (abridged; counts vary by profile/config):
INVARLOCK v<version> · EVALUATE
Baseline: gpt2 -> Subject: gpt2 · Profile: dev
Status: PASS · Gates: <passed>/<total> passed
Primary metric ratio: <ratio>
Output: reports/eval/evaluation.report.json
Runtime provenance: reports/eval/runtime.manifest.json
Command Surface
- First touch in a fresh install: invarlock --help, invarlock --version, invarlock report --help, and invarlock advanced --help.
- Core workflow: invarlock evaluate → invarlock verify → invarlock report html.
- Follow-on report analysis after the core loop: invarlock report generate, invarlock report explain, and invarlock report validate.
- Environment and release checks: invarlock doctor plus the JSON surfaces emitted by doctor --json and advanced plugins ... --json.
- Runtime-manifest verifier: invarlock advanced runtime-verify --report <evaluation.report.json> --manifest <runtime.manifest.json>.
- The public contract catalog exposed by those JSON surfaces includes validation_keys, console_labels, and metric_kinds.
- Advanced workflows: invarlock advanced evidence-pack, invarlock advanced policy, invarlock advanced plugins, invarlock advanced calibrate, and invarlock advanced runtime-verify.
- Host execution for the core evaluate path uses --execution-mode host.
- Optional adapter/backend installs use normal Python extras such as pip install "invarlock[hf]" rather than CLI install commands.
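For the runtime-manifest verifier, the manifest path can be derived from the report path when both sit in the same directory, as in the quick start's `reports/eval/` layout. This is a minimal sketch: `verify_runtime` and `INVARLOCK_BIN` are hypothetical names, and the sibling file name `runtime.manifest.json` is taken from the quick start, not from a documented lookup rule.

```shell
# Hypothetical helper: locate the sibling runtime.manifest.json and run the verifier.
# INVARLOCK_BIN is an illustrative override, not an InvarLock setting.
INVARLOCK_BIN="${INVARLOCK_BIN:-invarlock}"

verify_runtime() {
  local report="$1"
  local manifest
  manifest="$(dirname "$report")/runtime.manifest.json"
  if [ ! -f "$manifest" ]; then
    echo "no sibling runtime manifest next to $report" >&2
    return 2
  fi
  "$INVARLOCK_BIN" advanced runtime-verify --report "$report" --manifest "$manifest"
}
```

Failing early with a distinct return code (2 here, chosen arbitrarily) keeps a missing manifest distinguishable from a verifier failure in CI logs.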
Evidence packs (portable evidence bundles)
Evidence packs bundle reports + verification metadata into a distributable artifact.
- Guide: https://invarlock.github.io/invarlock/0.8.0/user-guide/evidence-packs/
- Verify from an installed wheel: invarlock advanced evidence-pack verify <dir> --strict
- Repo harness alternative: scripts/evidence_packs/verify_pack.sh --pack <dir> --strict
Note: configs/ and most scripts/ remain repo resources and are not included in
wheels. Installed wheels include the public contracts and the
invarlock advanced evidence-pack verify verifier, so installed packages can
check bundles without cloning the repository.
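When several packs arrive together, the installed-wheel verifier can be looped over them. The sketch below assumes `invarlock advanced evidence-pack verify <dir> --strict` exits non-zero when a pack fails verification; `verify_packs` and `INVARLOCK_BIN` are hypothetical names used only for illustration.

```shell
# Hypothetical batch check: verify every pack directory given as an argument.
# Assumption: the verifier exits non-zero when a pack does not verify.
# INVARLOCK_BIN is an illustrative override, not an InvarLock setting.
INVARLOCK_BIN="${INVARLOCK_BIN:-invarlock}"

verify_packs() {
  local failed=0 pack
  for pack in "$@"; do
    if "$INVARLOCK_BIN" advanced evidence-pack verify "$pack" --strict; then
      echo "ok: $pack"
    else
      echo "failed: $pack" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only when every pack verified.
  [ "$failed" -eq 0 ]
}
```

The loop deliberately continues after a failure so that one run reports every bad pack instead of stopping at the first.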
Installation
# Minimal CLI (no torch/transformers)
pip install invarlock
# HF workflows (torch/transformers)
pip install "invarlock[hf]"
Optional extras: invarlock[probes], invarlock[gpu], invarlock[awq,gptq].
On Python 3.13+ stacks, gptq may still require a vendor wheel or a
supported older interpreter because upstream auto-gptq packaging is narrower
than the core InvarLock support matrix. Full setup:
https://invarlock.github.io/invarlock/0.8.0/user-guide/getting-started/.
The minimal install covers the core verification and reporting flows. Add
invarlock[hf] only for model-loading evaluate runs, and use the installed
wheel's evidence-pack verifier when you need to inspect a bundle without cloning
the repository.
Documentation
- Docs home: https://invarlock.github.io/invarlock/0.8.0/
- Quickstart: https://invarlock.github.io/invarlock/0.8.0/user-guide/quickstart/
- Compare & evaluate (BYOE): https://invarlock.github.io/invarlock/0.8.0/user-guide/compare-and-evaluate/
- Reading a report: https://invarlock.github.io/invarlock/0.8.0/user-guide/reading-report/
- CLI reference: https://invarlock.github.io/invarlock/0.8.0/reference/cli/
- Assurance case: https://invarlock.github.io/invarlock/0.8.0/assurance/00-assurance-case/ (repo source: docs/assurance/00-assurance-case.md)
- Threat model: https://invarlock.github.io/invarlock/0.8.0/security/threat-model/
Community
- Questions/ideas: https://github.com/invarlock/invarlock/discussions
- Bug reports: https://github.com/invarlock/invarlock/issues
- Contact: mailto:support@invarlock.dev
Citation
If you use InvarLock in scientific work, please cite it (canonical metadata is in CITATION.cff):
@software{invarlock,
title = {InvarLock: Edit-agnostic robustness evaluation reports for weight edits},
author = {{InvarLock}},
url = {https://github.com/invarlock/invarlock},
}
Limitations
- InvarLock evaluates an edited model relative to a baseline under a specific configuration; results are not “global” guarantees.
- Not a content-safety/alignment tool.
- Native Windows is not supported (use WSL2 or Linux).
Support matrix
| Platform | Status | Notes |
|---|---|---|
| Python 3.12+ | ✅ Required | |
| Linux | ✅ Full | Primary dev target |
| macOS (Intel/M-series) | ✅ Full | MPS supported (default on Apple Silicon) |
| Windows | ❌ Not supported | Use WSL2 or a Linux container if required |
| CUDA | ✅ Recommended | For larger models |
| CPU | ✅ Fallback | Slower but functional |
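The Python and platform rows of the matrix can be turned into a preflight check before installing. This is a minimal sketch under stated assumptions: `python_supported` and `platform_supported` are hypothetical helper names, and the platform check relies on `uname -s` printing `Linux` (also under WSL2) or `Darwin` (macOS).

```shell
# Hypothetical preflight based on the support matrix: Python 3.12+ on Linux or macOS.
# Both functions take their input as arguments, so they can be checked without touching the host.

# Succeeds when a "major.minor[.patch]" version string is at least 3.12.
python_supported() {
  local major rest minor
  major="${1%%.*}"
  rest="${1#*.}"
  minor="${rest%%.*}"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 12 ]; }
}

# Succeeds for kernel names as printed by `uname -s` on Linux (including WSL2) and macOS.
platform_supported() {
  case "$1" in
    Linux | Darwin) return 0 ;;
    *) return 1 ;;
  esac
}
```

A typical call would be `python_supported "$(python3 -c 'import platform; print(platform.python_version())')" && platform_supported "$(uname -s)"`.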
Project status
InvarLock is pre‑1.0. Until 1.0, minor releases may include breaking changes. See CHANGELOG.md.
For guidance on where to ask questions, how to report bugs, and what to expect in terms of response times, see
SUPPORT.md.
Contributing
- Contributing guide: https://github.com/invarlock/invarlock/blob/v0.8.0/CONTRIBUTING.md
- Fast local checks (repo clone): make targets auto-select Python 3.12+, preferring an active 3.12 env, python3.12, then the Conda env invarlock-py312 when present.
  - make dev-install
  - make test
  - make lint
  - make docs-live
License
Apache-2.0 — see LICENSE.