Matching Principle for ML: estimate deployment nuisance geometry (Sigma_task, D1-D7) and train any encoder with matched PMH on your representations

matching-pmh

Deployment geometry in. Matched robustness out.

Estimate Sigma_task (D1–D7) · train any encoder with matched PMH · falsify with controls

matching-pmh implements the Matching Principle: make your encoder robust along the directions that actually shift between train and deploy—not every input direction that happens to correlate with labels.

Step	What you do	What the library does
1. Nuisance	Name what can change at deployment without changing the label (site, lighting, sensor, format, …).	Registry + `suggest_nuisance` / `nuisance="auto"` to pick an estimator family.
2. Geometry	Collect source/target (or augmentation) batches that expose that variation.	Estimate Σ_task — covariance of label-preserving deployment nuisance — via D1–D7 (shift, augment, sequence, style Gram, …).
3. Training	Keep your task loss; point a hook at representations `h = φ_θ(x)`.	Add matched PMH: a penalty that shrinks Jacobian sensitivity along Σ_task, rather than relying on isotropic VAT or CORAL-on-weights alone.

Your stack, your hook. ResNet / ViT (timm, torchvision), GNN mean-pool, Whisper-style encoders, causal LMs with LoRA, or frozen features + PMHMatcher / sklearn. Two phases (estimate once → train every step), one tensor h, no framework lock-in.

Theory: definitions, lemmas, and loss forms in docs/THEORY.md (LaTeX-friendly).

Not a paper reproduction kit — adapt your own pipeline.
Start here: Getting started → Choose your setup → Gallery

30-second start

pip install matching-pmh
python examples/01_domain_shift_d4.py  # run from a clone of the repository (see "From source")
pmh-train list-methods

from pmh import PMHMatcher, PMHTrainer, PMHConfig

# NumPy / sklearn frozen features: x_source and x_target are your own feature arrays
matcher = PMHMatcher(nuisance="domain_shift", rank=32).fit(x_source, x_target)

# PyTorch: estimate + train in one call (model and the three loaders are your own objects)
trainer = PMHTrainer(model, hook="backbone", nuisance="auto", pmh_config=PMHConfig.balanced())
trainer.fit(train_loader, source_batches=src_loader, target_batches=tgt_loader, epochs=20)

Problem, object, repair, unification


Problem	ERM uses every input direction that predicts labels, including nuisances that are harmful at deployment (lighting, site, sensor noise, formatting, renameable identifiers, …).
Object	Σ_task = covariance of the label-preserving deployment nuisance `n` (under law Q_n).
Repair	Matched PMH shrinks encoder sensitivity along Σ_task rather than uniformly in every direction, as isotropic PMH or generic VAT would.
Unification	CORAL, domain Grams, augmentation stacks, metric-learning directions, adversarial subspaces, and style Grams are different estimators of the same Σ_task (Lemmas D1–D7).

Matched loss (schematic): L = L_task + λ · Tr(J_φ^T J_φ Σ'), with range(Σ') covering range(Σ_task). Details: THEORY.md.
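The trace term in the schematic loss can be read as an expected squared sensitivity. The identity below is a short worked step, assuming J_φ is the Jacobian of the encoder at a fixed input and n is a zero-mean perturbation with covariance Σ'; these assumptions are ours, and the full statement is in THEORY.md.

```latex
\operatorname{Tr}\!\left(J_\phi^\top J_\phi \Sigma'\right)
  = \operatorname{Tr}\!\left(J_\phi \Sigma' J_\phi^\top\right)
  = \mathbb{E}_{n}\!\left[\lVert J_\phi n \rVert^2\right],
  \qquad \mathbb{E}[n] = 0,\quad \operatorname{Cov}(n) = \Sigma' .
```

With Σ' = I the same expression reduces to the squared Frobenius norm of J_φ, which is the isotropic penalty that the matched form is contrasted with.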

How it fits your codebase

 Phase A (once)              Phase B (every step)
 ----------------              --------------------
 source/target data    ->      x, y from your loader
       |                            |
 encoder (eval)        ->      encoder (train) -> h
       |                            |
 estimate D1-D7        ->      L_task(h, y) + PMHLoss(h, Sigma_hat)
       |
 artifact.pt

You keep	Library adds
Model, optimizer, task loss	`SigmaTaskConfig`, `estimate_from_config`
Data loaders	`collect_features` (optional)
Training loop / Trainer	`PMHLoss.capped_total` or `PMHTrainer`
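The Phase A / Phase B split above can be sketched in dependency-free Python. This is an illustration under stated assumptions, not the library's implementation: the toy `encoder`, the hand-written nuisance factors, and the helper name `matched_penalty` are all hypothetical, and Σ is assumed to be given in factored form Σ = Σ_k u_k u_kᵀ so that Tr(JᵀJΣ) = Σ_k ‖J u_k‖². Each term is approximated with a forward finite difference, so no explicit Jacobian is needed.

```python
import math

def encoder(x):
    """Toy encoder h = phi(x): two tanh units over a 3-d input (illustrative only)."""
    weights = [[0.5, 1.0, 0.0],
               [2.0, 0.0, 1.0]]
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def matched_penalty(phi, x, factors, eps=1e-4):
    """Approximate Tr(J^T J Sigma) at x for Sigma = sum_k u_k u_k^T.

    Each ||J u_k||^2 is estimated by a forward finite difference along u_k.
    """
    h0 = phi(x)
    total = 0.0
    for u in factors:
        shifted = [xi + eps * ui for xi, ui in zip(x, u)]
        h1 = phi(shifted)
        total += sum(((a - b) / eps) ** 2 for a, b in zip(h1, h0))
    return total

# Phase A (once): nuisance factors, written by hand here instead of estimated.
sigma_factors = [[1.0, 0.0, 0.0]]        # nuisance moves only input coordinate 0
isotropic_factors = [[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]]    # Sigma = I, i.e. the isotropic penalty

# Phase B (every step): task loss plus the weighted penalty.
x = [0.1, -0.2, 0.3]
task_loss = 0.25                         # placeholder value standing in for L_task(h, y)
lam = 0.1
matched = matched_penalty(encoder, x, sigma_factors)
isotropic = matched_penalty(encoder, x, isotropic_factors)
total = task_loss + lam * matched
```

The matched penalty only charges the encoder for sensitivity along the nuisance direction, so it is never larger than the isotropic one computed from the full identity factors. In a real training loop the finite difference would normally be replaced by automatic differentiation.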

Walkthroughs (18 templates)

#	Guide	Run
1	PyTorch + D4	`examples/01_domain_shift_d4.py`
2	ResNet + D4	`examples/12_resnet_hook_d4.py`
3	Office-31 + sklearn	`examples/06_office31_sklearn.py`
4	Multi-layer CNN	`examples/07_vision_multilayer.py`
5	Compositional D5	`examples/13_compositional_train_d5.py`
6	LLM style D7	`examples/08_hf_style_d7.py`
7	HF Trainer + DPO	`examples/11_dpo_lora_style_pmh.py`
8	Falsification controls	`examples/04_falsification_controls.py`
9	CLI JSON jobs	`pmh-train estimate --config ...`
10	Lightning	`examples/09_lightning_module.py`
11	Temporal D6	API in guide
12	ViT / CLS + D4	`examples/14_vit_cls_d4.py`
13	Speech encoder + D4	`examples/15_speech_encoder_d4.py`
14	QM9 / molecules D5	`examples/16_qm9_molecule_d5.py`
15	Code / tokens D5	`examples/17_code_tokens_d5.py`
16	Augmentations D3	`examples/18_augmentation_d3.py`
17	Compare arms on your pipeline	`examples/20_compare_training_arms.py`
18	PMHTrainer quickstart	`examples/01_domain_shift_d4.py`

Estimators at a glance (D1–D7)

Deployment story	Method	`SigmaTaskConfig`
Different site / camera / corpus; P(y given x) stable	D4	`SigmaTaskConfig.for_domain(rank=32)`
Low-rank shift; labels on both domains	D1	`SigmaTaskConfig.for_subspace(rank=32)`
Unstructured sensor / acquisition noise	D2	`SigmaTaskConfig.for_isotropic(dim, noise_level)`
Known augmentation modes (color, blur, crop, …)	D3	`SigmaTaskConfig.for_augmentation()` + `aug_deltas`
Nuisance on specific coordinates (atoms, tokens)	D5	`SigmaTaskConfig.for_compositional(indices)`
Drift along time within a sequence	D6	`SigmaTaskConfig.for_temporal()`
LLM style / format; semantics fixed	D7	`SigmaTaskConfig.for_alignment(rank=32)`
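To make the idea of "estimating Σ_task" concrete, here is a minimal sketch in the spirit of the augmentation row (D3). It assumes paired originals and label-preserving augmented copies, and it uses the uncentered second moment of their differences as the estimate. Both choices are simplifying assumptions for illustration; the function name `covariance_from_deltas` and the data are hypothetical, and the library's own estimators and the exact contents of `aug_deltas` may differ.

```python
def covariance_from_deltas(deltas):
    """Second-moment matrix (1/N) * sum_i d_i d_i^T of label-preserving deltas."""
    n = len(deltas)
    dim = len(deltas[0])
    sigma = [[0.0] * dim for _ in range(dim)]
    for d in deltas:
        for i in range(dim):
            for j in range(dim):
                sigma[i][j] += d[i] * d[j] / n
    return sigma

# Hypothetical paired data: each original and its augmented copy share a label.
originals = [[1.0, 2.0, 0.5], [0.0, 1.0, 1.5], [2.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
augmented = [[1.4, 2.0, 0.5], [-0.2, 1.0, 1.5], [2.6, 0.0, 1.1], [0.8, 1.0, 0.9]]

# Differences isolate what the augmentation changed while the label stayed fixed.
deltas = [[a - o for a, o in zip(aug, orig)]
          for aug, orig in zip(augmented, originals)]
sigma_hat = covariance_from_deltas(deltas)
```

In this toy data the augmentation never touches coordinate 1, so the corresponding diagonal entry of `sigma_hat` is zero and a matched penalty built from it would leave sensitivity along that coordinate unpenalized.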

pmh-train list-methods

Hybrid nuisances: estimate a separate Σ matrix for each nuisance and add one PMHLoss term per matrix.
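Why separate terms compose cleanly can be seen from the schematic trace form, which is linear in Σ. The identity below assumes both terms use the same hook (the same Jacobian J_φ) and that each PMHLoss term has the schematic form Tr(J_φᵀ J_φ Σ); the symbols λ₁, λ₂, Σ₁, Σ₂ are illustrative and not library names.

```latex
\lambda_1 \operatorname{Tr}\!\left(J_\phi^\top J_\phi \Sigma_1\right)
+ \lambda_2 \operatorname{Tr}\!\left(J_\phi^\top J_\phi \Sigma_2\right)
= \operatorname{Tr}\!\left(J_\phi^\top J_\phi \,(\lambda_1 \Sigma_1 + \lambda_2 \Sigma_2)\right)
```

Under these assumptions, keeping the terms separate is equivalent to a single penalty against the weighted sum of the matrices, while still letting each nuisance carry its own weight.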

Install

pip install matching-pmh

Extra	Use case
`pip install "matching-pmh[vision]"`	ResNet / ViT examples
`pip install "matching-pmh[hf]"`	D7 style Gram (Transformers)
`pip install "matching-pmh[hf-lora]"`	LoRA + DPO example
`pip install "matching-pmh[sklearn,vision]"`	Office-31 pipeline
`pip install "matching-pmh[lightning]"`	Lightning callback
`pip install "matching-pmh[all]"`	Development + docs

From source:

git clone https://github.com/vishalstark512/matching-pmh.git
cd matching-pmh && pip install -e ".[dev]" && pytest -q

Documentation

Document	Purpose
GETTING_STARTED.md	Main adoption guide (start here)
CHOOSE_YOUR_SETUP.md	Pick API by stack and data
TROUBLESHOOTING.md	Errors, preflight, hook dim
gallery/	Copy-paste: vision / tabular / NLP
hooks.md	ResNet, timm, HF hooks
ADAPT_YOUR_PIPELINE.md	Integration checklist
walkthroughs/	18 stack-specific tutorials
THEORY.md	Mathematics

Citation

Cite the Grand Unification / Matching Principle manuscript. See CITATION.cff in the repository.

@software{matching_pmh,
  title  = {matching-pmh: Matched PMH training from estimated deployment nuisance geometry},
  author = {Rajput, Vishal},
  year   = {2026},
  url    = {https://github.com/vishalstark512/matching-pmh}
}

Contributing

See CONTRIBUTING.md.

License

MIT — see LICENSE.

Release history

Version	Released
1.5.3	May 20, 2026
1.5.1	May 20, 2026
1.5.0	May 20, 2026
1.4.1	May 19, 2026
1.3.0 (this version)	May 19, 2026
1.2.0	May 19, 2026
0.8.0	May 19, 2026
0.7.2	May 19, 2026
0.7.1	May 19, 2026
0.7.0	May 19, 2026
0.6.0	May 19, 2026

Download files

Source Distribution

matching_pmh-1.3.0.tar.gz (126.2 kB), uploaded May 19, 2026 (Source)

Built Distribution

matching_pmh-1.3.0-py3-none-any.whl (77.3 kB), uploaded May 19, 2026 (Python 3)

File details

Details for the file matching_pmh-1.3.0.tar.gz.

File metadata

Download URL: matching_pmh-1.3.0.tar.gz
Upload date: May 19, 2026
Size: 126.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for matching_pmh-1.3.0.tar.gz
Algorithm	Hash digest
SHA256	`a5aa3577b813cb573b809bb94137f8a33b324727186a27178369b5bca7f6e42b`
MD5	`c5af57b9b627192459b564fdeb705865`
BLAKE2b-256	`eea2070db7214b7de0eda93464417c12ef96cda325e2ab230bb59a87a4353933`
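A downloaded archive can be checked against the SHA256 digest listed above using only the Python standard library. The sketch below assumes the file sits in the current directory; `sha256_of` and `matches_published` are hypothetical helper names, not part of matching-pmh or pip.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()

# Digest copied from the table above for matching_pmh-1.3.0.tar.gz.
EXPECTED_SDIST_SHA256 = "a5aa3577b813cb573b809bb94137f8a33b324727186a27178369b5bca7f6e42b"

# Usage, assuming the archive was downloaded to the current directory:
# matches_published("matching_pmh-1.3.0.tar.gz", EXPECTED_SDIST_SHA256)
```

A `False` result means the local file differs from the published one and should not be installed.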

File details

Details for the file matching_pmh-1.3.0-py3-none-any.whl.

File metadata

Download URL: matching_pmh-1.3.0-py3-none-any.whl
Upload date: May 19, 2026
Size: 77.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for matching_pmh-1.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a59d3b759a018e6a4c6761514709d6e1ac37b7e86ad6b157639c1186352fab5d`
MD5	`36f95d934c48c8a6b9188b3efabbb715`
BLAKE2b-256	`1fc6a24ca0fcbc43995fb05624179c50d110c801308236fe8c276b36056b6724`
