# reionemu

Machine-learning emulator pipeline for the kinetic Sunyaev-Zel'dovich (kSZ) angular power spectrum.
A modular Python package for building machine-learning emulators of the kinetic Sunyaev-Zel'dovich (kSZ) angular power spectrum from kSZ 2LPT reionization simulations. It includes tools to condense simulation outputs, compute flat-sky power spectra, assemble training datasets, train neural networks that predict binned rescaled kSZ power spectra from reionization parameters, and save lightweight experiment artifacts for reproducibility.
The goal is to learn a fast surrogate model that maps reionization parameters → binned kSZ power spectrum, enabling rapid exploration of cosmological parameter space without re-running expensive simulations.
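The flat-sky power spectrum step mentioned above can be illustrated with a generic estimator: FFT a square map, convert Fourier frequencies to multipoles, and average the per-mode power in annular ell bins. This is a minimal NumPy sketch, not the package's actual implementation; the normalization convention and binning choices here are illustrative.

```python
import numpy as np

def flat_sky_cl(tmap, pix_rad, n_bins=5):
    """Crude flat-sky power spectrum: bin |FFT|^2 of a square map in ell annuli."""
    n = tmap.shape[0]
    fmap = np.fft.fft2(tmap)
    # Flat-sky multipoles: ell = 2*pi * Fourier frequency
    freq = 2 * np.pi * np.fft.fftfreq(n, d=pix_rad)
    lx, ly = np.meshgrid(freq, freq, indexing="ij")
    ell = np.hypot(lx, ly)
    # Per-mode power with one common flat-sky normalization (pixel area / N^2)
    power = np.abs(fmap) ** 2 * pix_rad**2 / n**2
    edges = np.linspace(ell[ell > 0].min(), ell.max(), n_bins + 1)
    idx = np.digitize(ell.ravel(), edges) - 1
    cl = np.array([power.ravel()[idx == b].mean() for b in range(n_bins)])
    return 0.5 * (edges[1:] + edges[:-1]), cl

# White-noise map at ~3.4 arcmin pixels; white noise gives a roughly flat C_ell
centers, cl = flat_sky_cl(np.random.default_rng(0).normal(size=(64, 64)), 1e-3)
```

The five-bin default loosely mirrors the five-bin output dimension used later in the quick start, but the package's real binning is defined by its `ClConfig`.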
## Installation

```
pip install reionemu
```

Or from source (editable):

```
git clone https://github.com/RobertxPearce/reionization-emulator.git
cd reionization-emulator
python -m pip install -e .
```

Requirements: Python 3.10+, NumPy, HDF5, PyTorch, and Ray Tune.
## Quick start

After installing, you can load a processed HDF5 training dataset, create dataloaders, and train the baseline deterministic 4-parameter emulator:
```python
from pathlib import Path

import torch

import reionemu

# Path to a condensed HDF5 that already has /training (X, Y, ell)
h5_path = Path("path/to/condensed.h5")

# Dataloaders with train/val split and optional normalization
loaders, normalizers, ell = reionemu.make_dataloaders(
    h5_path,
    split={"train": 0.8, "val": 0.2},
    config=reionemu.DataLoaderConfig(batch_size=32, seed=42),
)

# Baseline 4-parameter model, optimizer, loss
model = reionemu.FourParamEmulator()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

# Train for a few epochs
history = reionemu.fit(
    model,
    loaders["train"],
    loaders["val"],
    optimizer,
    loss_fn,
    config=reionemu.FitConfig(epochs=10, device="cpu"),
)

# Validation loss per epoch
print(history["val_loss"])

# Save a lightweight experiment artifact
artifact_dir = reionemu.save_artifact(
    "baseline_four_param",
    Path("artifacts"),
    dataset_path=h5_path,
    dataloader_config=reionemu.DataLoaderConfig(batch_size=32, seed=42),
    fit_config=reionemu.FitConfig(epochs=10, device="cpu"),
    model_config={
        "class_name": "FourParamEmulator",
        "input_dim": 4,
        "output_dim": 5,
    },
    optimizer_config={"name": "Adam", "lr": 1e-3},
    history=history,
    normalizers=normalizers,
    checkpoint=model.state_dict(),
)
```
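The optional normalization that `make_dataloaders` applies (via `Normalizer`) is typically a standard-score transform whose statistics are fit on the training split only and then reused everywhere else. The class below is a hand-rolled NumPy equivalent for illustration, not the package's actual `Normalizer` API:

```python
import numpy as np

class ZScore:
    """Fit mean/std on training data; apply the same transform everywhere else."""

    def fit(self, x):
        self.mean = x.mean(axis=0)
        self.std = x.std(axis=0) + 1e-12  # guard against zero-variance columns
        return self

    def transform(self, x):
        return (x - self.mean) / self.std

    def inverse_transform(self, z):
        return z * self.std + self.mean

rng = np.random.default_rng(0)
X_train, X_val = rng.normal(2.0, 3.0, (80, 4)), rng.normal(2.0, 3.0, (20, 4))
norm = ZScore().fit(X_train)       # statistics come from the training split only
Z_train = norm.transform(X_train)  # ~zero mean, unit variance per column
Z_val = norm.transform(X_val)      # same parameters reused for validation
```

Fitting on the full dataset instead of the training split would leak validation statistics into training, which is why the fit/transform split matters.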
For MC-dropout experiments, use `MCDropoutEmulator` with the MC evaluation path:

```python
model = reionemu.MCDropoutEmulator(dropout_rate=0.2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
history = reionemu.fit(
    model,
    loaders["train"],
    loaders["val"],
    optimizer,
    torch.nn.MSELoss(),
    config=reionemu.FitConfig(epochs=10, device="cpu"),
    evaluation="evaluate_mc_metrics",
    n_mc_samples=50,
)
print(history["val_mean_predictive_std"])
```
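Under MC dropout, dropout stays active at inference time and each input is evaluated many times, so the prediction becomes a distribution rather than a point. The `val_mean_predictive_std` metric plausibly summarizes the spread across those passes; the exact definition lives in `evaluate_mc_metrics`, and this NumPy sketch simply fakes the stochastic passes to show the shape of the computation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for n_mc_samples=50 stochastic forward passes over 8 inputs, 5 ell bins
samples = rng.normal(loc=1.0, scale=0.1, size=(50, 8, 5))

pred_mean = samples.mean(axis=0)              # (8, 5): point prediction per input/bin
pred_std = samples.std(axis=0)                # (8, 5): per-bin predictive spread
mean_predictive_std = float(pred_std.mean())  # one scalar summary per evaluation
```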
If you want to tune the four-parameter architecture with Ray Tune before training a final model, you can work directly with the loaded arrays:

```python
from pathlib import Path

import reionemu
from ray import tune

h5_path = Path("path/to/condensed.h5")
X, Y, ell = reionemu.load_training_arrays(h5_path)
split_idx = int(0.8 * len(X))
X_train, X_val = X[:split_idx], X[split_idx:]
Y_train, Y_val = Y[:split_idx], Y[split_idx:]

param_space = {
    "hidden_dim": tune.choice([20, 32, 64]),
    "num_hidden_layers": tune.choice([1, 2, 3]),
    "activation": tune.choice(["relu", "silu", "tanh"]),
    "optimizer": tune.choice(["adam", "adamw"]),
    "lr": tune.loguniform(3e-4, 2e-3),
    "weight_decay": tune.loguniform(1e-8, 1e-4),
    "batch_size": tune.choice([16, 32, 64]),
    "epochs": 150,
    "early_stopping_patience": tune.choice([10, 15]),
    "gradient_clipping": tune.choice([None, 0.5, 1.0]),
    "normalize_X": True,
    "normalize_Y": False,
}

results = reionemu.run_tune_four_param(
    X_train=X_train,
    Y_train=Y_train,
    X_val=X_val,
    Y_val=Y_val,
    param_space=param_space,
    num_samples=20,
    max_concurrent_trials=2,
    device="cpu",
    storage_path="ray_results",
    experiment_name="four_param_search",
)

best = results.get_best_result(metric="val_loss", mode="min")
print(best.config)
print(best.metrics["best_val_loss"])
```
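One caveat on the sequential 80/20 split above: it assumes the rows of `X` are already in random order. If the dataset was written in a structured parameter order (e.g. along a sampling grid), a seeded shuffled split avoids a biased validation set. A small sketch, independent of the package:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
X = rng.normal(size=(n, 4))   # stand-ins for the loaded training arrays
Y = rng.normal(size=(n, 5))

perm = rng.permutation(n)     # random but reproducible row order
n_train = int(0.8 * n)
train_idx, val_idx = perm[:n_train], perm[n_train:]
X_train, X_val = X[train_idx], X[val_idx]
Y_train, Y_val = Y[train_idx], Y[val_idx]
```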
For a full pipeline example (condense → compute power spectra → build training data → tune/train/evaluate), scientific context, and complete usage examples, see the full documentation: Homepage
## Scientific context

The kinetic Sunyaev-Zel'dovich (kSZ) effect arises from the scattering of CMB photons by free electrons with bulk motion, generating secondary temperature anisotropies. The kSZ angular power spectrum carries information about the timing, duration, and structure of reionization. This emulator provides a fast surrogate that maps reionization parameters (`zmean_zre`, `alpha_zre`, `kb_zre`, `b0_zre`) to binned, rescaled kSZ power spectra, making parameter-space exploration much faster than rerunning the full simulations.
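For reference, the standard line-of-sight expression for the kSZ temperature anisotropy (a textbook form from the literature, not code from this package) is

```latex
\frac{\Delta T_{\mathrm{kSZ}}}{T_{\mathrm{CMB}}}(\hat{n})
  = -\,\sigma_T \int \mathrm{d}l \; e^{-\tau}\, n_e \,\frac{\mathbf{v}\cdot\hat{n}}{c},
```

where \(\sigma_T\) is the Thomson cross-section, \(\tau\) the optical depth along the line of sight, \(n_e\) the free-electron number density, and \(\mathbf{v}\) the electrons' peculiar velocity. The signal thus depends on when and where ionized, moving gas appears, which is why the spectrum constrains reionization's timing and morphology.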
## Repository structure

| Path | Description |
|---|---|
| `src/reionemu/` | Core library (pip-installable package) |
| `src/reionemu/simio/` | Simulation I/O, power spectrum computation, training-array building |
| `src/reionemu/data/` | Dataloaders, normalization |
| `src/reionemu/artifact/` | JSON experiment manifests, config/results saving, normalizer and checkpoint sidecars |
| `src/reionemu/models/` | Baseline and experimental emulator architectures |
| `src/reionemu/training/` | Training loop, K-fold cross-validation, metrics, and model builders |
| `src/reionemu/tuning/` | Ray Tune integration for hyperparameter search |
| `scripts/` | Dataset builder, HPC runners, sampling (environment-specific) |
| `notebooks/` | Analysis and training examples |
| `docs/` | Documentation source |
| `datasets/` | Raw and processed datasets (not tracked) |
| `results/` | Visualizations for simulation checks, parameter-space validation, and model evaluation |

The core API is in `src/reionemu/`. Scripts under `scripts/hpc/` and `scripts/sampling/` are for cluster and sampling workflows and may use machine-specific paths; the library itself is portable.
## Main public API

Import from the top-level package after `pip install reionemu`:

- Simulation I/O: `condense_sim_root`, `CondenseConfig`, `add_cl_to_condensed_h5`, `ClConfig`, `build_and_write_training`, `build_training_arrays`, `BuildXYConfig`, `BuildStats`, `CondenseStats`
- Data: `make_dataloaders`, `load_training_arrays`, `DataLoaderConfig`, `Normalizer`
- Artifacts: `create_artifact_dir`, `save_artifact`, `save_configs`, `save_results`, `save_info`, `save_normalizers`, `load_normalizers`, `save_model_checkpoint`, `dataset_summary`, `file_fingerprint`, `read_json`
- Models: `FourParamEmulator`, `MCDropoutEmulator` (experimental variants live in `reionemu.models.experimental`)
- Training: `fit`, `FitConfig`, `train_one_epoch`, `evaluate`, `evaluate_metrics`, `evaluate_mc_metrics`, `kfold_cross_validate`, `KFoldConfig`
- Training helpers: `build_four_param_model`, `build_mc_dropout_model`, `build_optimizer`, `mse`, `rmse`, `mean_relative_error`, `physical_mean_relative_error`
- Tuning: `train_four_param_tune`, `default_param_space`, `run_tune_four_param`
For full API reference, module documentation, and usage guides, visit: Homepage
## Typical workflow

1. Parameter sampling - Latin Hypercube Sampling over the 4D reionization parameter space.
2. Simulation (HPC) - Run Zreion (or compatible) simulations; outputs are written per simulation in HDF5.
3. Dataset construction - Use `condense_sim_root` → `add_cl_to_condensed_h5` → `build_and_write_training` to produce a single condensed HDF5 with `/sims` and `/training`.
4. Hyperparameter search (optional) - Use `load_training_arrays` and `run_tune_four_param` to search over model and optimizer settings with Ray Tune.
5. Training and evaluation - Use `make_dataloaders` and `fit` (or `kfold_cross_validate`) to train and evaluate the selected emulator configuration.
6. Artifact saving - Use `save_artifact` to record JSON configs/results plus optional `.npz` normalizers and `.pt` model checkpoints.
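Step 1, Latin Hypercube Sampling, can be sketched with NumPy alone. The 4D parameter bounds below are purely illustrative placeholders, not the project's actual priors:

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=0):
    """One sample per stratum in each dimension, with strata shuffled per column."""
    rng = np.random.default_rng(seed)
    d = len(bounds)
    # Stratified points in [0, 1): sample i lands in stratum i, then shuffle columns
    u = (rng.random((n_samples, d)) + np.arange(n_samples)[:, None]) / n_samples
    for j in range(d):
        rng.shuffle(u[:, j])
    lo, hi = np.array(bounds).T
    return lo + u * (hi - lo)  # rescale unit cube to the parameter box

# Hypothetical ranges for (zmean_zre, alpha_zre, kb_zre, b0_zre)
bounds = [(6.0, 10.0), (0.1, 2.0), (0.5, 1.5), (0.1, 1.0)]
samples = latin_hypercube(32, bounds)
```

Each column ends up with exactly one sample per stratum, which covers the box more evenly than independent uniform draws for the same budget of simulations.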
## Acknowledgments
This research is conducted in the LEADS Lab at the University of Nevada, Las Vegas, under Dr. Paul La Plante, with computing resources from the Pittsburgh Supercomputing Center (Bridges-2).