Intrinsic Green Learning: task-conditioned intrinsic-dimensionality discovery via a learned encoder and a multi-scale Green's-function kernel.
Intrinsic Green Learning
Task-conditioned intrinsic-dimensionality discovery for high-dimensional data. IGL pairs a learned encoder with a multi-scale Green's-function kernel and trains the system end-to-end via Variable Projection with random Matryoshka truncation. The model fits the task and simultaneously reveals how many dimensions the task actually needs — usually far fewer than the ambient input.
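To make "random Matryoshka truncation" concrete, here is a minimal sketch. It assumes truncation means keeping the first k latent coordinates and zeroing the rest, with k drawn uniformly at each training step. The names `truncate_latent` and `sample_truncation` are illustrative and not part of the `igl` API, and the library's actual sampling scheme may differ.

```python
import random


def truncate_latent(z: list[float], k: int) -> list[float]:
    """Keep the first k coordinates of z and zero out the rest."""
    if not 0 <= k <= len(z):
        raise ValueError("k must be between 0 and len(z)")
    return z[:k] + [0.0] * (len(z) - k)


def sample_truncation(max_dim: int, rng: random.Random) -> int:
    """Draw a prefix length k uniformly from 1..max_dim (assumed scheme)."""
    return rng.randint(1, max_dim)


rng = random.Random(0)
z = [0.9, -0.4, 0.2, 0.05]          # a hypothetical 4-dimensional latent code
k = sample_truncation(len(z), rng)  # a fresh k would be drawn at every step
z_k = truncate_latent(z, k)         # nested prefix: coordinates beyond k are zeroed
```

Under this scheme every prefix of the latent code has to be useful on its own, so the loss as a function of the prefix length is a natural way to read off how many dimensions the task needs.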
Note on the import name. The distribution is `intrinsic-green-learning`; the import name is `igl`. This collides with `libigl`; if you need both in the same environment, install one of them under a different module name.
Why IGL?
For the same input data, a classifier typically needs no more latent dimensions than a regressor, which in turn needs no more than a full autoencoder. IGL discovers this hierarchy automatically:
$$ d_{\text{eff}}(\text{classification}) \;\le\; d_{\text{eff}}(\text{regression}) \;\le\; d_{\text{eff}}(\text{reconstruction}) $$
The library ships an examples/synthetic/moons_xor.py script that fits
all three estimators on the same data and reports the discovered
dimensions — the hierarchy holds out of the box.
Installation
pip install intrinsic-green-learning
Optional extras:
| Extra | Adds | Use case |
|---|---|---|
| `[viz]` | matplotlib | Plot dimension curves via `igl.viz.plot_dimension_curve`. |
| `[eeg]` | mne + moabb + pyriemann | Future EEG / clinical loaders (placeholder for v0.2). |
| `[nlp]` | transformers + datasets | Future NLP loaders. |
| `[elbow]` | kneed | Alternative elbow detector. |
| `[all]` | all of the above | One-shot install for development. |
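To see which extras are already satisfied in the current environment, a short check along these lines may help. It assumes each package in the table is importable under the same name it is listed with; `EXTRA_MODULES` and `missing_modules` are illustrative names, not part of `igl`.

```python
from importlib.util import find_spec

# Extra name -> modules it is expected to make importable (taken from the table).
EXTRA_MODULES = {
    "viz": ["matplotlib"],
    "eeg": ["mne", "moabb", "pyriemann"],
    "nlp": ["transformers", "datasets"],
    "elbow": ["kneed"],
}


def missing_modules(extra: str) -> list[str]:
    """List the modules of an extra that cannot be found in this environment."""
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


status = {extra: missing_modules(extra) for extra in EXTRA_MODULES}
for extra, missing in status.items():
    label = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"[{extra}] {label}")
```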
Quickstart
The library exposes three sklearn-compatible estimators plus an SPD extension. All accept NumPy arrays at the API boundary.
Classification
import numpy as np
import igl
from igl.data import embed_in_high_dim, make_moons
x_2d, y = make_moons(400, noise=0.1, seed=0)
x = embed_in_high_dim(x_2d, target_dim=16, seed=0).numpy()
clf = igl.IGLClassifier(max_dim=8, random_state=0).fit(x, y.numpy())
print(f"accuracy = {clf.score(x, y.numpy()):.3f}")
print(f"discovered d_eff = {clf.effective_dimension_}") # ~ 1 on moons
Regression and reconstruction
from igl.data import make_swiss_roll
x, params = make_swiss_roll(800, seed=0)
x_np = x.numpy()
params_np = params.numpy()
reg = igl.IGLRegressor(max_dim=8, random_state=0).fit(x_np, params_np)
ae = igl.IGLAutoencoder(max_dim=8, random_state=0).fit(x_np)
print(reg.effective_dimension_) # ~ 2 on swiss roll (intrinsic dim)
print(ae.effective_dimension_) # ~ 2 on swiss roll
Cross-task hierarchy check
report = igl.compare_d_eff(
    cls=clf.dimension_curve_,
    reg=reg.dimension_curve_,
    recon=ae.dimension_curve_,
)
print(report.d_effs) # {'cls': 1, 'reg': 2, 'recon': 2}
print(report.hierarchy_holds) # True
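The ordering test behind `hierarchy_holds` can be pictured as a chain of pairwise comparisons. The sketch below is an assumption about what such a check computes, not the implementation of `igl.compare_d_eff`; the function name and the `order` parameter are hypothetical.

```python
def hierarchy_holds(
    d_effs: dict[str, int],
    order: tuple[str, ...] = ("cls", "reg", "recon"),
) -> bool:
    """True if d_eff is non-decreasing along the given task order."""
    dims = [d_effs[name] for name in order]
    return all(a <= b for a, b in zip(dims, dims[1:]))


d_effs = {"cls": 1, "reg": 2, "recon": 2}  # the values printed by the report
ok = hierarchy_holds(d_effs)
```

Ties count as holding, which matches the non-strict inequalities in the hierarchy stated earlier.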
SPD / Riemannian extension
For covariance-valued data (EEG, clinical signals, …), igl.spd ships
an AIRM-based reconstruction classifier:
from igl.data import make_spd_dataset
from igl.spd import IGLReconSPDClassifier, LogEigVectorizer
spd, y = make_spd_dataset(400, d=8, n_classes=3, seed=0)
x = LogEigVectorizer().fit(spd.numpy()).transform(spd.numpy())
clf = IGLReconSPDClassifier(
    latent_dim=8, max_dim=12,
    orthogonality_weight=0.1,  # plug-in via the ExtraLoss seam
    random_state=0,
).fit(x, y.numpy())
print(clf.effective_dimension_)
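For intuition about the vectorization step, the following sketch shows the common log-Euclidean construction on a 2×2 SPD matrix: take the matrix logarithm, then flatten the upper triangle with a √2 weight on the off-diagonal entry so that Euclidean norms match Frobenius norms. Whether `LogEigVectorizer` uses exactly this layout and weighting is an assumption; the functions below are illustrative and use only the standard library.

```python
import math


def log_spd_2x2(a: float, b: float, c: float) -> tuple[float, float, float]:
    """Matrix logarithm of the SPD matrix [[a, b], [b, c]] as (l00, l01, l11)."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam_hi, lam_lo = mean + radius, mean - radius  # the two eigenvalues
    if lam_lo <= 0.0:
        raise ValueError("matrix is not positive definite")
    if radius == 0.0:
        # The matrix is a multiple of the identity: log(lam * I) = log(lam) * I.
        return (math.log(mean), 0.0, math.log(mean))
    # For distinct eigenvalues, log(M) = alpha * M + beta * I.
    gap = lam_hi - lam_lo
    alpha = (math.log(lam_hi) - math.log(lam_lo)) / gap
    beta = (lam_hi * math.log(lam_lo) - lam_lo * math.log(lam_hi)) / gap
    return (alpha * a + beta, alpha * b, alpha * c + beta)


def vectorize_log_spd_2x2(a: float, b: float, c: float) -> list[float]:
    """Upper triangle of log(M), off-diagonal scaled by sqrt(2)."""
    l00, l01, l11 = log_spd_2x2(a, b, c)
    return [l00, math.sqrt(2.0) * l01, l11]


features = vectorize_log_spd_2x2(2.0, 0.5, 1.0)  # a hypothetical covariance
```

After this mapping, SPD inputs live in an ordinary vector space, which is why the estimator can consume them through the same NumPy interface as the other examples.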
Custom training loop
If sklearn's surface is too high-level, use the bare PyTorch entry points directly:
import torch
import igl
module = igl.IGLModule(
    input_dim=16, max_dim=8, output_dim=2,
    config=igl.IGLConfig(
        encoder=igl.EncoderConfig(hidden=(128, 64)),  # pyramidal MLP
        kernel=igl.KernelConfig(n_anchors=64, operator=igl.OperatorName.GAUSSIAN),
    ),
)
trainer = igl.MatryoshkaTrainer(
    loss=igl.CrossEntropyLoss(n_classes=2),
    config=igl.MatryoshkaConfig(epochs=500),
)
# x_train_t, y_train_t, x_val_t and y_val_t are not defined in this snippet;
# supply your own training and validation data (presumably as torch tensors).
history = trainer.fit(module, x_train_t, y_train_t, x_val=x_val_t, y_val=y_val_t)
curve = igl.eval_dimension_curve(module, x_val_t, y_val_t, loss=igl.CrossEntropyLoss(n_classes=2))
print("d_eff =", igl.detect_elbow(curve))
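As a rough picture of what elbow detection on a dimension curve does, the sketch below picks the smallest dimension whose loss comes within a small fraction of the curve's range of the best loss. This is one simple heuristic, not the rule used by `igl.detect_elbow`; the curve is assumed here to be a mapping from dimension to validation loss, and the numbers are made up.

```python
def detect_elbow_sketch(curve: dict[int, float], rel_tol: float = 0.05) -> int:
    """Smallest dimension whose loss is within rel_tol of the range above the best loss."""
    best = min(curve.values())
    worst = max(curve.values())
    threshold = best + rel_tol * (worst - best)
    return min(d for d, loss in curve.items() if loss <= threshold)


# Made-up validation losses per truncation dimension.
curve = {1: 0.70, 2: 0.12, 3: 0.11, 4: 0.11, 5: 0.10}
d_eff = detect_elbow_sketch(curve)
print("d_eff =", d_eff)  # prints "d_eff = 2"
```

Raising `rel_tol` makes the heuristic more willing to accept a smaller dimension at the cost of a slightly higher loss.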
Documentation
Local build:
uv sync --group doc
uv run mkdocs serve
Published at https://hotherio.github.io/intrinsic-green-learning/latest/ after the first release.
Examples
Three runnable scripts under examples/synthetic/:
| Script | Manifold | Tasks | Expected d_eff |
|---|---|---|---|
| `torus_classification.py` | T² ⊂ R⁴ → R³² | XOR cls + sin/cos reg | ≈ 2 |
| `moons_xor.py` | Moons ⊂ R² → R¹⁶ | cls + reg + recon | d_cls ≤ d_reg ≤ d_recon |
| `swiss_roll_recon.py` | Swiss roll ⊂ R³ | autoencoder + reg | ≈ 2 |
Run with python -m examples.synthetic.<name>; outputs land in
results/<name>/<git_short_sha>/. Install [viz] for PNG plots.
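To locate the output of a run programmatically, the directory layout described above can be rebuilt from the script name and the current short commit hash. This is a sketch of that layout only; `git_short_sha` and `results_dir` are hypothetical helpers, and the example scripts may compute the path differently.

```python
import subprocess
from pathlib import Path


def git_short_sha(default: str = "unknown") -> str:
    """Short hash of HEAD, or a fallback when git or a repository is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return default
    return out.stdout.strip() or default


def results_dir(name: str, sha: str, root: str = "results") -> Path:
    """Build results/<name>/<git_short_sha>/ as a Path."""
    return Path(root) / name / sha


print(results_dir("moons_xor", git_short_sha()))
```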
Development
uv sync --all-groups
uv run lefthook install
Verify your environment:
uv run pytest # tests + 100% coverage
uv run basedpyright src # strict type check
uv run lefthook run pre-commit --all-files # full pre-commit pass
Conventions
The library follows the Hother Python guidelines under
docs/guidelines/:
- basedpyright strict type checking; `Any` is not allowed in public signatures.
- `__all__` exhaustive at every module surface.
- Google-style docstrings on every public symbol.
- Single base exception `igl.IGLError`, one level deep.
- Conventional Commits: commit subjects drive `python-semantic-release` (`feat:` → minor, `fix:` / `perf:` / `refactor:` → patch, `BREAKING CHANGE:` → major).
- String-valued type aliases are `enum.StrEnum` classes with a companion `Literal` mirror; public APIs accept either form.
Release process
Releases are fully automated by
python-semantic-release
on every push to main via .github/workflows/semantic-release.yml. See
docs/security.md for the supply-chain posture
(OIDC, sigstore attestations, GPG-signed checksums, pip-audit).
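To show how commit subjects translate into a version bump, here is a simplified sketch of the mapping listed under Conventions. It is not the parser used by python-semantic-release: it covers only the prefixes named there, and the function names are illustrative.

```python
import re

_LEVELS = {"none": 0, "patch": 1, "minor": 2, "major": 3}
_TYPE_TO_BUMP = {"feat": "minor", "fix": "patch", "perf": "patch", "refactor": "patch"}
# Matches "type: subject" or "type(scope): subject" at the start of the message.
_SUBJECT = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?: ")


def bump_for(message: str) -> str:
    """Bump level implied by a single commit message."""
    if "BREAKING CHANGE:" in message:
        return "major"
    match = _SUBJECT.match(message)
    if match is None:
        return "none"
    return _TYPE_TO_BUMP.get(match.group("type"), "none")


def release_bump(messages: list[str]) -> str:
    """Highest bump level across all commits since the last release."""
    return max((bump_for(m) for m in messages), key=_LEVELS.__getitem__, default="none")


commits = ["fix: handle empty batch", "feat(kernel): add operator", "docs: typo"]
print(release_bump(commits))  # prints "minor"
```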
License
MIT. See LICENSE and REUSE.toml.