Intrinsic Green Learning: task-conditioned intrinsic-dimensionality discovery via a learned encoder and a multi-scale Green's-function kernel.
Intrinsic Green Learning
High-dimensional inputs — pixel grids, EEG channels, embedding vectors — almost never use all the dimensions they appear to. The handful that actually matter depends on the question you ask: a binary classifier may need only one or two latent axes, a regressor to a continuous target may need a few more, and a full reconstruction needs whatever dimension the data manifold genuinely has.
Intrinsic Green Learning (IGL) discovers that task-conditioned effective dimension while it fits the model. A learned encoder maps the ambient input to a low-dimensional latent space; a multi-scale Green's-function kernel computes a structured design matrix on that latent space; and Variable Projection with random Matryoshka truncation trains the encoder and reads off the smallest dimension that still solves the task. There's no separate "dimensionality reduction" step and no fixed bottleneck — the dimension you should use falls out of training.
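The pipeline described above can be illustrated with a small, dependency-free sketch. It is not the library's implementation. The latents are taken as given (no encoder is trained), a Gaussian bump stands in for the Green's-function kernel, and the helper names `truncate`, `gaussian_features`, `ridge_solve`, and `dimension_curve` are hypothetical. What it does show is the closed-form linear readout that Variable Projection relies on, evaluated at each nested truncation level to produce a loss-per-dimension curve.

```python
import math
import random


def truncate(z, k):
    """Keep the first k latent coordinates and zero the rest (a nested prefix)."""
    return [v if i < k else 0.0 for i, v in enumerate(z)]


def gaussian_features(z, anchors, scales):
    """One Gaussian bump per (anchor, scale) pair: a multi-scale design-matrix row."""
    row = []
    for a in anchors:
        sq_dist = sum((zi - ai) ** 2 for zi, ai in zip(z, a))
        row.extend(math.exp(-sq_dist / (2.0 * s * s)) for s in scales)
    return row


def ridge_solve(phi, y, lam=1e-3):
    """Closed-form readout w = (Phi^T Phi + lam I)^-1 Phi^T y via Gaussian elimination."""
    m = len(phi[0])
    a = [[sum(r[i] * r[j] for r in phi) + (lam if i == j else 0.0) for j in range(m)]
         for i in range(m)]
    b = [sum(r[i] * t for r, t in zip(phi, y)) for i in range(m)]
    for c in range(m):  # forward elimination with partial pivoting
        p = max(range(c, m), key=lambda r: abs(a[r][c]))
        a[c], a[p], b[c], b[p] = a[p], a[c], b[p], b[c]
        for r in range(c + 1, m):
            f = a[r][c] / a[c][c]
            for j in range(c, m):
                a[r][j] -= f * a[c][j]
            b[r] -= f * b[c]
    w = [0.0] * m
    for i in reversed(range(m)):  # back substitution
        w[i] = (b[i] - sum(a[i][j] * w[j] for j in range(i + 1, m))) / a[i][i]
    return w


def dimension_curve(latents, targets, anchors, scales, max_dim):
    """Training MSE of the closed-form readout at each truncation level k = 1..max_dim."""
    curve = []
    for k in range(1, max_dim + 1):
        trunc_anchors = [truncate(a, k) for a in anchors]
        phi = [gaussian_features(truncate(z, k), trunc_anchors, scales) for z in latents]
        w = ridge_solve(phi, targets)
        preds = [sum(wi * fi for wi, fi in zip(w, row)) for row in phi]
        curve.append(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets))
    return curve


random.seed(0)
latents = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(60)]
targets = [math.sin(2.0 * z[0]) for z in latents]  # depends on the first coordinate only
anchors = random.sample(latents, 6)
curve = dimension_curve(latents, targets, anchors, scales=(0.5, 1.5), max_dim=3)
print([round(v, 4) for v in curve])
```

Because the toy target depends only on the first latent coordinate, the first truncation level already has access to everything the readout needs. In the real method the encoder is trained so that the task-relevant information concentrates in the leading coordinates.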
The key difference from PCA, UMAP, t-SNE, or any other purely-geometric
manifold-learning method: the effective dimension IGL reports is a
property of (input, task), not of the input alone. The same dataset
will resolve into different d_eff values for a classifier, a
regressor, and an autoencoder — and the hierarchy
$d_{\text{cls}} \le d_{\text{reg}} \le d_{\text{recon}}$ holds out of
the box.
What ships:
- scikit-learn-compatible estimators (`IGLClassifier`, `IGLRegressor`, `IGLAutoencoder`) for drop-in use in existing pipelines.
- Bare PyTorch building blocks (`IGLModule`, `GreenKernel`, `MatryoshkaTrainer`, …) for custom training loops, novel kernels, and research extensions.
- Spectral formulation (`SpectralKernel` + closed-form Fourier / Chebyshev / Legendre / Hermite / Laguerre bases, plus learned Laplace–Beltrami and user-supplied graph bases) with kernel-agnostic null-space augmentation for operators with non-trivial $\ker(L)$.
- Riemannian / SPD extension (`igl.spd`) for covariance-valued data (EEG, fMRI-derived connectivity, financial covariances) with an AIRM-based loss plugged in through the same `ExtraLoss` seam used by every other training-time regulariser.
Note on the import name. The distribution is `intrinsic-green-learning`; the import name is `igl`. This collides with `libigl`, whose Python bindings are also imported as `igl`; if you need both in the same environment, install one of them under a different module name.
Why IGL?
For the same input data, a classifier usually needs fewer latent dimensions than a regressor, which in turn needs fewer than a full autoencoder. IGL discovers this hierarchy automatically:
$$ d_{\text{eff}}(\text{classification}) \;\le\; d_{\text{eff}}(\text{regression}) \;\le\; d_{\text{eff}}(\text{reconstruction}) $$
The library ships an examples/synthetic/moons_xor.py script that fits
all three estimators on the same data and reports the discovered
dimensions — the hierarchy holds out of the box.
Installation
```bash
pip install intrinsic-green-learning
```
Optional extras:
| Extra | Adds | Use case |
|---|---|---|
| `[viz]` | matplotlib | Plot dimension curves via `igl.viz.plot_dimension_curve`. |
| `[eeg]` | mne + moabb + pyriemann | Future EEG / clinical loaders (placeholder for v0.2). |
| `[nlp]` | transformers + datasets | Future NLP loaders. |
| `[elbow]` | kneed | Alternative elbow detector. |
| `[all]` | all of the above | One-shot install for development. |
Quickstart
The library exposes three sklearn-compatible estimators plus an SPD extension. All accept numpy arrays at the API boundary.
Classification
```python
import numpy as np
import igl
from igl.data import embed_in_high_dim, make_moons

x_2d, y = make_moons(400, noise=0.1, seed=0)
x = embed_in_high_dim(x_2d, target_dim=16, seed=0).numpy()

clf = igl.IGLClassifier(max_dim=8, random_state=0).fit(x, y.numpy())
print(f"accuracy = {clf.score(x, y.numpy()):.3f}")
print(f"discovered d_eff = {clf.effective_dimension_}")  # ~ 1 on moons
```
Regression and reconstruction
```python
from igl.data import make_swiss_roll

x, params = make_swiss_roll(800, seed=0)
x_np = x.numpy()
params_np = params.numpy()

reg = igl.IGLRegressor(max_dim=8, random_state=0).fit(x_np, params_np)
ae = igl.IGLAutoencoder(max_dim=8, random_state=0).fit(x_np)
print(reg.effective_dimension_)  # ~ 2 on swiss roll (intrinsic dim)
print(ae.effective_dimension_)   # ~ 2 on swiss roll
```
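The `~ 2` in the comments above matches the number of parameters needed to generate a swiss roll: an angle along the spiral and a height along the roll axis. The sketch below makes that explicit with a common parameterisation. It is an illustration only; `swiss_roll_points` is a hypothetical helper, and the sampling ranges are assumptions that may differ from what `igl.data.make_swiss_roll` uses.

```python
import math
import random


def swiss_roll_points(n, seed=0):
    """Sample n points on a swiss roll in R^3 together with their 2-D parameters (t, h)."""
    rng = random.Random(seed)
    points, params = [], []
    for _ in range(n):
        t = 1.5 * math.pi * (1.0 + 2.0 * rng.random())  # angle along the spiral
        h = 21.0 * rng.random()                          # height along the roll axis
        points.append((t * math.cos(t), h, t * math.sin(t)))
        params.append((t, h))
    return points, params


points, params = swiss_roll_points(5)
for (px, py, pz), (t, h) in zip(points, params):
    # The radius in the x-z plane equals t, so (t, h) fully determines the point.
    print(round(math.hypot(px, pz) - t, 12), py == h)
```

Three ambient coordinates are produced from two free parameters, which is why both the regressor onto `(t, h)` and the autoencoder can be expected to settle near two latent dimensions.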
Cross-task hierarchy check
```python
report = igl.compare_d_eff(
    cls=clf.dimension_curve_,
    reg=reg.dimension_curve_,
    recon=ae.dimension_curve_,
)
print(report.d_effs)           # {'cls': 1, 'reg': 2, 'recon': 2}
print(report.hierarchy_holds)  # True
```
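The hierarchy check itself amounts to testing that the discovered dimensions are non-decreasing from classification to regression to reconstruction. A minimal sketch of that test follows; `hierarchy_holds` here is a hypothetical standalone function written for illustration, not the code behind `report.hierarchy_holds`.

```python
def hierarchy_holds(d_effs, order=("cls", "reg", "recon")):
    """True when d_eff is non-decreasing along the given task order."""
    dims = [d_effs[name] for name in order]
    return all(a <= b for a, b in zip(dims, dims[1:]))


print(hierarchy_holds({"cls": 1, "reg": 2, "recon": 2}))  # True
print(hierarchy_holds({"cls": 3, "reg": 2, "recon": 2}))  # False
```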
SPD / Riemannian extension
For covariance-valued data (EEG, clinical signals, …), igl.spd ships
an AIRM-based reconstruction classifier:
```python
from igl.data import make_spd_dataset
from igl.spd import IGLReconSPDClassifier, LogEigVectorizer

spd, y = make_spd_dataset(400, d=8, n_classes=3, seed=0)
x = LogEigVectorizer().fit(spd.numpy()).transform(spd.numpy())

clf = IGLReconSPDClassifier(
    latent_dim=8, max_dim=12,
    orthogonality_weight=0.1,  # plug-in via the ExtraLoss seam
    random_state=0,
).fit(x, y.numpy())
print(clf.effective_dimension_)
```
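For readers unfamiliar with AIRM (the affine-invariant Riemannian metric), the distance between two SPD matrices A and B is the square root of the sum of squared logarithms of the eigenvalues of A⁻¹B. The sketch below evaluates it for 2×2 matrices using only the standard library. It illustrates the metric, not the loss implemented in `igl.spd`; the function names are hypothetical.

```python
import math


def _check_spd_2x2(m, name):
    """Raise if m is not a symmetric positive-definite 2x2 matrix."""
    symmetric = math.isclose(m[0][1], m[1][0], rel_tol=1e-9, abs_tol=1e-12)
    if not symmetric or m[0][0] <= 0 or m[0][0] * m[1][1] - m[0][1] * m[1][0] <= 0:
        raise ValueError(f"{name} must be symmetric positive definite")


def airm_distance_2x2(a, b):
    """AIRM distance: sqrt(sum_i log^2(lambda_i)), lambda_i the eigenvalues of A^-1 B."""
    _check_spd_2x2(a, "a")
    _check_spd_2x2(b, "b")
    det_a = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a_inv = [[a[1][1] / det_a, -a[0][1] / det_a],
             [-a[1][0] / det_a, a[0][0] / det_a]]
    # M = A^-1 B has real, positive eigenvalues when A and B are SPD.
    m = [[sum(a_inv[i][k] * b[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam_max = (trace + disc) / 2.0
    lam_min = det / lam_max  # avoids cancellation in (trace - disc) / 2
    return math.sqrt(math.log(lam_max) ** 2 + math.log(lam_min) ** 2)


identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[math.e, 0.0], [0.0, math.e ** 2]]
print(round(airm_distance_2x2(identity, scaled), 3))  # 2.236, i.e. sqrt(1^2 + 2^2)
```

The distance is symmetric in its arguments and zero only when the two matrices coincide, which is what makes it usable as a reconstruction loss on covariance-valued data.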
Custom training loop
If sklearn's surface is too high-level, use the bare PyTorch entry points directly:
```python
import torch
import igl

# x_train_t, y_train_t, x_val_t, y_val_t are torch tensors prepared by the caller.
module = igl.IGLModule(
    input_dim=16, max_dim=8, output_dim=2,
    config=igl.IGLConfig(
        encoder=igl.EncoderConfig(hidden=(128, 64)),  # pyramidal MLP
        kernel=igl.KernelConfig(n_anchors=64, operator=igl.OperatorName.GAUSSIAN),
    ),
)
trainer = igl.MatryoshkaTrainer(
    loss=igl.CrossEntropyLoss(n_classes=2),
    config=igl.MatryoshkaConfig(epochs=500),
)
history = trainer.fit(module, x_train_t, y_train_t, x_val=x_val_t, y_val=y_val_t)

curve = igl.eval_dimension_curve(module, x_val_t, y_val_t, loss=igl.CrossEntropyLoss(n_classes=2))
print("d_eff =", igl.detect_elbow(curve))
```
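Reading an effective dimension off a dimension curve can be done in several ways. One simple rule, sketched below, picks the smallest dimension whose loss is within a tolerance of the best loss on the curve. This is an assumption made for illustration and not necessarily the rule `igl.detect_elbow` applies; the function name, the dict representation of the curve, and the loss values are all made up for the example.

```python
def smallest_sufficient_dim(curve, rel_tol=0.05):
    """Smallest dimension whose loss is within rel_tol of the best, relative to the curve's range.

    `curve` maps a truncation dimension to the validation loss measured at that dimension.
    """
    best = min(curve.values())
    worst = max(curve.values())
    threshold = best + rel_tol * (worst - best)
    return min(k for k, loss in curve.items() if loss <= threshold)


# Toy values: the loss drops sharply at k = 2 and is nearly flat afterwards.
toy_curve = {1: 0.69, 2: 0.12, 3: 0.11, 4: 0.11, 5: 0.10}
print(smallest_sufficient_dim(toy_curve))  # 2
```

A larger `rel_tol` favours smaller dimensions at the cost of accepting a slightly worse loss; a value of zero returns the smallest dimension that attains the minimum.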
Documentation
Local build:
```bash
uv sync --group doc
uv run mkdocs serve
```
Published at https://hotherio.github.io/intrinsic-green-learning/latest/ after the first release.
Examples
Three runnable scripts under examples/synthetic/:
| Script | Manifold | Tasks | Expected d_eff |
|---|---|---|---|
| `torus_classification.py` | T² ⊂ R⁴ → R³² | XOR cls + sin/cos reg | ≈ 2 |
| `moons_xor.py` | Moons ⊂ R² → R¹⁶ | cls + reg + recon | d_cls ≤ d_reg ≤ d_recon |
| `swiss_roll_recon.py` | Swiss roll ⊂ R³ | autoencoder + reg | ≈ 2 |
Run with python -m examples.synthetic.<name>; outputs land in
results/<name>/<git_short_sha>/. Install [viz] for PNG plots.
Development
```bash
uv sync --all-groups
uv run lefthook install
```
Verify your environment:
```bash
uv run pytest                                # tests + 100% coverage
uv run basedpyright src                      # strict type check
uv run lefthook run pre-commit --all-files   # full pre-commit pass
```
Conventions
Style, typing, exceptions, commit-message format, and the rest are
documented in CONTRIBUTING.md and
docs/guidelines/.
Release process
Releases are fully automated by
python-semantic-release
on every push to main via .github/workflows/semantic-release.yml. See
docs/security.md for the supply-chain posture
(OIDC, sigstore attestations, GPG-signed checksums, pip-audit).
Bibliography
If you use IGL in academic work, please cite the paper this library implements:
Quemy, A. (2026). Intrinsic Green's Learning: Supervised Learning on Manifolds via Inverse PDE. ICLR 2026 Workshop on AI and PDE. https://openreview.net/forum?id=Y6RpdS98l8
```bibtex
@inproceedings{quemy2026igl,
  title     = {{Intrinsic Green's Learning: Supervised Learning on Manifolds via Inverse PDE}},
  author    = {Quemy, Alexandre},
  booktitle = {ICLR 2026 Workshop on AI and PDE},
  year      = {2026},
  month     = {3},
  url       = {https://openreview.net/forum?id=Y6RpdS98l8}
}
```
For a citation to this exact software version, GitHub's "Cite this
repository" widget reads CITATION.cff; its
preferred-citation block points back to the paper above.
License
MIT. See LICENSE and REUSE.toml.