Architecture-agnostic matching principle: estimate Sigma_task (D1-D7) and train any encoder with matched PMH penalties

matching-pmh

Deployment geometry in. Matched robustness out.
Estimate Σ_task (D1–D7) · train any encoder with matched PMH · falsify with controls

PyPI · GitHub · Walkthroughs · Theory · Integration · Quickstart

matching-pmh is a research-grade PyTorch library for the Matching Principle. The workflow has three steps: name what changes at deployment without changing the label, estimate that nuisance geometry $\Sigma_{\mathrm{task}}$, and add a matched Jacobian penalty on your representations $h=\phi_\theta(x)$. The representation can come from a ResNet, ViT, GNN, Whisper-style encoder, causal LM with LoRA, or frozen features + sklearn.

Design goal: two phases, one hook tensor h, no framework lock-in. The paper’s thirteen task blocks are validation examples; this repo is built so a new lab member can integrate in an afternoon.

30-second start

pip install matching-pmh
python examples/01_domain_shift_d4.py          # minimal PyTorch loop
pmh-train list-methods                         # D1–D7 catalog

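import pmh
from pmh import SigmaTaskConfig, PMHConfig, PMHLoss, collect_features, estimate_from_config

# Phase A: estimate once, with the encoder frozen.
# h_source and h_target are encoder features for source and target data
# (for example, gathered with the optional collect_features helper).
artifact = estimate_from_config(SigmaTaskConfig.for_domain(rank=32), h_source, h_target)
artifact.save("artifacts/sigma")

# Phase B: train inside your own loop.
# task_loss is your usual loss for the batch; h is the hooked representation.
pmh_loss = PMHLoss(artifact, PMHConfig(weight=0.3, cap_ratio=0.3, warmup_epochs=2))
total, _ = pmh_loss.capped_total(task_loss, h)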

→ Full path: docs/QUICKSTART.md · Pick your stack: walkthroughs
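The quickstart passes `weight`, `cap_ratio`, and `warmup_epochs` without explaining them. The sketch below shows one plausible reading, and it is an assumption rather than the library's documented behavior: the penalty weight ramps up linearly over the warmup epochs, and the weighted penalty is clipped so it never exceeds `cap_ratio` times the task loss. The helper names `warmup_scale` and `capped_total_sketch` are hypothetical and are not part of `pmh`.

```python
def warmup_scale(epoch: int, warmup_epochs: int) -> float:
    """Linear ramp from 0 to 1 over warmup_epochs (assumed schedule)."""
    if warmup_epochs <= 0:
        return 1.0
    return min(1.0, epoch / warmup_epochs)


def capped_total_sketch(task_loss, penalty, weight=0.3, cap_ratio=0.3,
                        epoch=0, warmup_epochs=2):
    """Return (total, applied_penalty) for scalar losses.

    Assumption: the weighted penalty is clipped at cap_ratio * task_loss,
    so the regularizer cannot dominate the task objective.
    """
    weighted = weight * warmup_scale(epoch, warmup_epochs) * penalty
    applied = min(weighted, cap_ratio * task_loss)
    return task_loss + applied, applied


# A large raw penalty after warmup is clipped to 30% of the task loss.
total_after_warmup, applied_after_warmup = capped_total_sketch(1.0, 10.0, epoch=2)
# At epoch 0 the ramp is zero, so only the task loss remains.
total_at_start, applied_at_start = capped_total_sketch(1.0, 10.0, epoch=0)
```

With tensors instead of Python floats, the same clipping would use an elementwise minimum such as `torch.minimum`. Check docs/QUICKSTART.md for the actual semantics before relying on this reading.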

Problem → object → repair → unification

Problem. Empirical risk minimization (ERM) uses every input direction that predicts training labels, including nuisances that are harmful at deployment (lighting, site, sensor noise, answer formatting, renameable identifiers, …).

Object.

$$ \Sigma_{\mathrm{task}} = \mathrm{Cov}_{Q_n}(n) $$

for label-preserving deployment nuisance $n \sim Q_n$.
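As a concrete reading of this definition, the sketch below estimates a low-rank $\hat\Sigma_{\mathrm{task}}$ from samples of the nuisance $n$ using NumPy. It assumes you already have nuisance displacements as rows of an array (for example, differences between paired inputs that share a label). The function name `estimate_sigma` and the toy data are illustrative and do not correspond to the library's D1–D7 estimators.

```python
import numpy as np


def estimate_sigma(nuisance_samples: np.ndarray, rank: int):
    """Rank-truncated sample covariance of nuisance samples, shape (num_samples, dim)."""
    centered = nuisance_samples - nuisance_samples.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(centered) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:rank]           # indices of the largest `rank`
    basis = eigvecs[:, top]                          # (dim, rank) nuisance directions
    scales = np.clip(eigvals[top], 0.0, None)        # (rank,) variances along them
    sigma_hat = (basis * scales) @ basis.T           # low-rank PSD reconstruction
    return sigma_hat, basis, scales


rng = np.random.default_rng(0)
# Toy nuisance: most variance lives in the first two of six coordinates.
true_std = np.array([3.0, 2.0, 0.1, 0.1, 0.1, 0.1])
n = rng.normal(size=(5000, 6)) * true_std
sigma_hat, basis, scales = estimate_sigma(n, rank=2)
```

Keeping the `(basis, scales)` factorization rather than the dense matrix is a common choice when the dimension is large, since a matched penalty only needs the top directions and their variances.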

Repair. Matched PMH shrinks the encoder Jacobian along $\Sigma_{\mathrm{task}}$ rather than uniformly in all directions, as isotropic PMH or generic VAT would:

$$ \mathcal{L} = \mathcal{L}_{\mathrm{task}} + \lambda \, \mathbb{E}_x\left[\mathrm{Tr}\left(J_\phi(x)^\top J_\phi(x)\,\Sigma'\right)\right], \quad \mathrm{range}(\Sigma') \supseteq \mathrm{range}(\Sigma_{\mathrm{task}}). $$
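The trace term can be computed without forming the full Jacobian. If $\Sigma' = \sum_k s_k u_k u_k^\top$, then $\mathrm{Tr}(J^\top J\,\Sigma') = \sum_k s_k \lVert J u_k \rVert^2$, so only one directional derivative per nuisance direction is needed. The NumPy sketch below illustrates this identity on a toy `tanh` encoder with finite differences and cross-checks it against the analytic Jacobian. The encoder, the function names, and the choice of finite differences are assumptions made for illustration; this is not how `PMHLoss` is implemented.

```python
import numpy as np


def encoder(x, W):
    """Toy encoder h = tanh(W x); stands in for phi_theta."""
    return np.tanh(W @ x)


def directional_derivative(f, x, u, eps=1e-5):
    """Central finite-difference approximation of J_f(x) @ u."""
    return (f(x + eps * u) - f(x - eps * u)) / (2 * eps)


def matched_penalty(f, xs, basis, scales):
    """Mean over xs of sum_k scales[k] * ||J_f(x) @ basis[:, k]||^2."""
    total = 0.0
    for x in xs:
        for k in range(basis.shape[1]):
            ju = directional_derivative(f, x, basis[:, k])
            total += scales[k] * float(ju @ ju)
    return total / len(xs)


rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6)) * 0.1
xs = rng.normal(size=(8, 6))
basis = np.eye(6)[:, :2]            # nuisance spans the first two input coordinates
scales = np.array([9.0, 4.0])
sigma = (basis * scales) @ basis.T  # dense Sigma' for the cross-check only

penalty = matched_penalty(lambda x: encoder(x, W), xs, basis, scales)


def exact_trace(x):
    """Tr(J^T J Sigma') using the analytic Jacobian of tanh(W x)."""
    J = (1.0 - np.tanh(W @ x) ** 2)[:, None] * W
    return float(np.trace(J.T @ J @ sigma))


exact = float(np.mean([exact_trace(x) for x in xs]))
```

In a PyTorch training loop the directional derivatives would typically come from automatic differentiation (for example `torch.func.jvp`) so that the penalty stays differentiable with respect to the encoder parameters.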

Unification. CORAL, domain Grams, augmentation stacks, metric-learning directions, adversarial subspaces, and style Grams are estimators of the same object (D1–D7); matched PMH is one loss with $\Sigma' \approx \hat\Sigma_{\mathrm{task}}$.

How it fits your codebase

 Phase A (once)              Phase B (every step)
 ───────────────              ────────────────────
 source/target data    →      x, y ~ your loader
       ↓                            ↓
 encoder (eval)        →      encoder (train) → h
       ↓                            ↓
 estimate D1–D7        →      L_task(h, y) + PMHLoss(h, Σ̂)
       ↓
 artifact.pt

You keep	Library adds
Model, optimizer, task loss	`SigmaTaskConfig`, `estimate_from_config`
Data loaders	`collect_features` (optional)
Training loop / Trainer	`PMHLoss.capped_total` or `PMHTrainer`

Walkthroughs (16 guides)

#	Guide	Paper block	Run
1	PyTorch + D4	Generic	`examples/01_domain_shift_d4.py`
2	ResNet + D4	Vision	`examples/12_resnet_hook_d4.py`
3	Office-31 + sklearn	T1	`examples/06_office31_sklearn.py`
4	Multi-layer CNN	T2	`examples/07_vision_multilayer.py`
5	Compositional D5	T5	`examples/13_compositional_train_d5.py`
6	LLM style D7	T7A	`examples/08_hf_style_d7.py`
7	HF Trainer + DPO	T7A	`examples/11_dpo_lora_style_pmh.py`
8	Falsification controls	All	`examples/04_falsification_controls.py`
9	CLI JSON jobs	Repro	`pmh-train estimate --config …`
10	Lightning	—	`examples/09_lightning_module.py`
11	Temporal D6	T6B	API in guide
12	ViT / CLS + D4	T2 ViT	`examples/14_vit_cls_d4.py`
13	Speech encoder + D4	T6A	`examples/15_speech_encoder_d4.py`
14	QM9 / molecules D5	T5A	`examples/16_qm9_molecule_d5.py`
15	Code / tokens D5	T5B	`examples/17_code_tokens_d5.py`
16	Augmentations D3	T2 aug	`examples/18_augmentation_d3.py`

Index: docs/walkthroughs/index.md · Example catalog: examples/README.md

Estimators at a glance (D1–D7)

Story	Method	`SigmaTaskConfig`
Domain / site; $P(y\mid x)$ stable	D4	`for_domain(rank=…)`
Low-rank shift + labels	D1	`for_subspace(rank=…)`
Unstructured noise	D2	`for_isotropic(dim, noise_level)`
Known aug modes	D3	`for_augmentation()` + `aug_deltas`
Nuisance coordinates (atoms, tokens)	D5	`for_compositional(indices)`
Temporal drift in window	D6	`for_temporal()`
LLM style vs fixed content	D7	`for_alignment(rank=…)`

pmh-train list-methods
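To make the "same object" idea concrete, the sketch below builds toy covariance matrices for three of the stories in the table and checks that each is symmetric positive semidefinite, which is all a matched penalty needs from its $\Sigma'$. These constructions are guesses at what the rows mean (isotropic noise for D2, second moment of augmentation displacements for D3, selected coordinates for D5). They are not the library's estimators, and every function name here is hypothetical.

```python
import numpy as np


def sigma_isotropic(dim, noise_level):
    """D2-style guess: unstructured noise, Sigma = noise_level^2 * I."""
    return (noise_level ** 2) * np.eye(dim)


def sigma_from_deltas(deltas):
    """D3-style guess: second moment of augmentation displacements (num_samples, dim)."""
    return deltas.T @ deltas / len(deltas)


def sigma_coordinates(dim, indices):
    """D5-style guess: unit variance on the listed nuisance coordinates only."""
    sigma = np.zeros((dim, dim))
    sigma[indices, indices] = 1.0
    return sigma


def is_psd(sigma, tol=1e-10):
    """True if sigma is symmetric with no eigenvalue below -tol."""
    symmetric = np.allclose(sigma, sigma.T)
    return bool(symmetric and np.linalg.eigvalsh(sigma).min() >= -tol)


rng = np.random.default_rng(0)
candidates = {
    "D2": sigma_isotropic(5, 0.5),
    "D3": sigma_from_deltas(rng.normal(size=(100, 5))),
    "D5": sigma_coordinates(5, [1, 3]),
}
all_valid = all(is_psd(s) for s in candidates.values())
```

Run `pmh-train list-methods` and see docs/THEORY.md for the definitions the library actually uses.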

Install

pip install matching-pmh

Extra	Use case
`[vision]`	ResNet / ViT walkthroughs
`[hf]`	D7 style Gram (Transformers)
`[hf-lora]`	LoRA + DPO example
`[sklearn,vision]`	Office-31 pipeline
`[lightning]`	`LightningModule` callback
`[all]`	Development + docs

From source (contributors):

git clone https://github.com/vishalstark512/matching-pmh.git
cd matching-pmh
pip install -e ".[dev,all]"
pytest -q

Documentation

Document	Purpose
QUICKSTART.md	First successful run in 10 minutes
THEORY.md	$\Sigma_{\mathrm{task}}$, recipe, falsification
ARCHITECTURES.md	Hook points per stack
PHILOSOPHY.md	Design principles for integrators
walkthroughs/	End-to-end guides
nuisance_types.md	Data formats
cli.md	`pmh-train` reference

Citation

If you use this software, cite the Grand Unification / Matching Principle manuscript (CITATION.cff).

@software{matching_pmh,
  title  = {matching-pmh: Matched PMH training from estimated deployment nuisance geometry},
  author = {Rajput, Vishal},
  year   = {2026},
  url    = {https://github.com/vishalstark512/matching-pmh}
}

Contributing

We welcome issues, walkthrough improvements, and estimator integrations. See CONTRIBUTING.md.

License

MIT — see LICENSE.

Release history

Version	Date
1.5.3	May 20, 2026
1.5.1	May 20, 2026
1.5.0	May 20, 2026
1.4.1	May 19, 2026
1.3.0	May 19, 2026
1.2.0	May 19, 2026
0.8.0	May 19, 2026
0.7.2	May 19, 2026
0.7.1	May 19, 2026
0.7.0 (this version)	May 19, 2026
0.6.0	May 19, 2026

Download files

Source Distribution

matching_pmh-0.7.0.tar.gz (82.0 kB)

Uploaded May 19, 2026 Source

Built Distribution

matching_pmh-0.7.0-py3-none-any.whl (43.9 kB)

Uploaded May 19, 2026 Python 3

File details

Details for the file matching_pmh-0.7.0.tar.gz.

File metadata

Download URL: matching_pmh-0.7.0.tar.gz
Upload date: May 19, 2026
Size: 82.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for matching_pmh-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`b7488a1335d6da9492fd25ae58ca86bbe6ff4cbebbe1c645ca44745518562834`
MD5	`2b100fe093af6135aee4ee38c8c2ceb6`
BLAKE2b-256	`5671d1bdeb0e53ce577d81363b5ba848df42bbcf5e61f587ea8f6c6e2db7daed`
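A downloaded file can be checked against the published digest before installing from it. The sketch below uses only the Python standard library. The helper names are illustrative, and the file path in the usage comment assumes the source distribution was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for matching_pmh-0.7.0.tar.gz
EXPECTED_SDIST_SHA256 = "b7488a1335d6da9492fd25ae58ca86bbe6ff4cbebbe1c645ca44745518562834"


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks and return its SHA256 hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published hex string, ignoring case and padding."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the sdist is in the current directory):
# matches_published_hash("matching_pmh-0.7.0.tar.gz", EXPECTED_SDIST_SHA256)
```

pip can enforce the same check at install time through hash-checking mode, where each requirement line carries a `--hash=sha256:...` entry.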

File details

Details for the file matching_pmh-0.7.0-py3-none-any.whl.

File metadata

Download URL: matching_pmh-0.7.0-py3-none-any.whl
Upload date: May 19, 2026
Size: 43.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for matching_pmh-0.7.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2e2a2305ada5c4fabcddf1b357a5ac412c052547289ca464e443bbba312b121e`
MD5	`f18e00e56fe592d167256d71762e0895`
BLAKE2b-256	`d3c7b48c50a396cad8392ce4a4be3cbc251dee84408bbeaddee6a06f46dc6a0d`
