A reproducible stability-selection pipeline for scientific machine learning

RobustModelMaker

A reproducible model-building pipeline for small-to-medium scientific datasets.

RobustModelMaker (ROBUST) combines bootstrap stability selection with leakage-safe nested cross-validation to identify a stable, minimal feature subset and to estimate performance without optimistic bias. It is designed for scientific datasets where reproducibility, interpretability, and honest generalisation estimates matter as much as raw predictive performance.

Why RobustModelMaker?

Standard machine learning pipelines applied to scientific data suffer from two problems that ROBUST addresses directly:

Optimistic performance estimates. When feature selection, hyperparameter tuning, and model evaluation share the same data, the reported score describes how well the model fits the data it was built on, not how it will perform on future data. ROBUST uses strict nested cross-validation in which each of those steps is performed entirely on the training partition of each fold. The test partition is used only to evaluate the final fold model, never to inform any modelling decision.

Unstable feature selection. Single-run feature selection produces a feature set that can change substantially with small changes in the data. ROBUST runs bootstrap stability selection: features are ranked by how consistently they are selected across hundreds of random subsamples of the training data. Only features that exceed a stability threshold (70% of bootstrap runs by default) are retained.
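The stability-selection idea described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not ROBUST's internal implementation: the `stability_selection` helper, the Lasso base selector, the half-size subsamples drawn without replacement, and the synthetic data are all choices made for this example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso


def stability_selection(X, y, n_bootstrap=100, threshold=0.7, alpha=1.0, seed=0):
    """Hypothetical helper: selection frequency per feature over random subsamples."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    counts = np.zeros(n_features)
    for _ in range(n_bootstrap):
        # Draw a random half of the rows without replacement (an assumption).
        idx = rng.choice(n_samples, size=n_samples // 2, replace=False)
        model = Lasso(alpha=alpha, max_iter=10_000).fit(X[idx], y[idx])
        # A feature counts as selected when its coefficient is non-zero.
        counts += model.coef_ != 0
    freq = counts / n_bootstrap
    return freq, np.flatnonzero(freq >= threshold)


# Synthetic data: with shuffle=False the 5 informative features are columns 0-4.
X, y = make_regression(n_samples=200, n_features=30, n_informative=5,
                       noise=5.0, shuffle=False, random_state=0)
freq, selected = stability_selection(X, y)
print("selection frequency of first 8 features:", np.round(freq[:8], 2))
print("stable features:", selected)
```

Features that are selected only occasionally fall below the threshold and are dropped, which is what makes the retained set less sensitive to small changes in the data.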

The result is a model built on a smaller, more reproducible feature set, with a performance estimate that is not inflated by information leaking from the test folds.
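The nested cross-validation pattern described above can be sketched with plain scikit-learn. This is an illustrative sketch, not ROBUST's own code: the univariate `SelectKBest` selector stands in for stability selection, and the parameter grid and synthetic data are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=40, n_informative=5, random_state=0)

# Every data-dependent step lives inside the pipeline, so it is refit on the
# training partition of each fold and never sees that fold's test partition.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])
param_grid = {"select__k": [5, 10, 20], "clf__C": [0.1, 1.0, 10.0]}

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # tuning
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # evaluation

# The inner search is itself the estimator that the outer loop evaluates.
search = GridSearchCV(pipe, param_grid, cv=inner, scoring="roc_auc")
outer_scores = cross_val_score(search, X, y, cv=outer, scoring="roc_auc")
print(f"nested CV AUC: {outer_scores.mean():.3f} +/- {outer_scores.std():.3f}")
```

Because the hyperparameter search runs inside each outer training partition, the outer scores are computed on data that played no part in selection or tuning.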

Installation

pip install robustmodelmaker

For XGBoost support:

pip install robustmodelmaker[xgb]

Requirements: Python >= 3.9, numpy, pandas, scikit-learn, scipy

Quick start

import pandas as pd
from robustmodelmaker import RobustModelMaker

X = pd.read_csv("features.csv")
y = pd.read_csv("labels.csv").squeeze()

maker = RobustModelMaker(
    alg="eln",           # elastic net: interpretable and fast
    task_type="binary",  # always set explicitly: "binary", "multiclass", or "regression"
    outer_cv=5,
    inner_cv=5,
    n_bootstrap=100,
    stability_threshold=0.7,
    random_state=42,
).fit(X, y)

result = maker.result_
print(f"Selected {len(result.selected_features)} of {len(result.feature_names)} features")
print(f"Nested CV AUC: {result.mean_score:.4f} +/- {result.std_score:.4f}")

# Predict on new data (preprocessing and feature selection applied automatically).
# X_new is a DataFrame of unseen samples with the same columns as X.
predictions = maker.predict(X_new)
probabilities = maker.predict_proba(X_new)

The functional API is also available:

from robustmodelmaker import run_pipeline

result = run_pipeline(X, y, alg="eln", task_type="binary",
                      outer_cv=5, inner_cv=5, random_state=42)

Algorithms

Code	Model	Tasks	Notes
`eln`	Elastic net	all	Fastest; coefficient-based importance; auto-scales
`rdg`	Ridge (L2)	all	Stable; good default for many scientific datasets
`las`	Lasso (L1)	all	Sparse coefficients; strong feature selector
`log`	L2 logistic regression	classification only	Reliable baseline
`svm`	Linear SVM	all	Effective in high-dimensional spaces
`rf`	Random forest	all	Non-linear; no scaling needed; class_weight="balanced"
`xgb`	XGBoost	all	Highest raw performance; requires `pip install robustmodelmaker[xgb]`
`mlp`	Multi-layer perceptron	all	Neural baseline; slower on small datasets
`lin`	Linear regression (OLS)	regression only	Interpretable; no regularisation

Key capabilities

Capability	Detail
Task types	Binary classification, multiclass classification, regression
Feature selection	Bootstrap stability selection with configurable threshold and bootstrap count
Performance estimation	Nested CV (outer + inner), repeated nested CV, grouped CV
Preprocessing	Median imputation + optional standard scaling, fitted inside each fold
Missing data	NaN-tolerant by default; optional data-driven missingness filter
Cutoff determination	Bootstrap specificity-targeted threshold for binary classification
Probability calibration	Platt scaling (sigmoid) or isotonic regression
Post-hoc analysis	Permutation importance, SHAP-ready export, feature stability plots
External validation	One-call evaluation on a held-out set with full metric suite
Reproducibility	Fully deterministic given a fixed random seed, verified by test suite
Save/load	JSON metadata, CSV tables, and pickle of the fitted result
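As an illustration of the cutoff-determination row above, the following NumPy sketch shows one way a specificity-targeted threshold could be estimated with bootstrap resampling. The `specificity_cutoff` helper, the median aggregation, and the synthetic scores are assumptions made for this example; ROBUST's actual procedure may differ.

```python
import numpy as np


def specificity_cutoff(y_true, scores, target_specificity=0.9, n_bootstrap=200, seed=0):
    """Hypothetical helper: median bootstrap cutoff that reaches the target specificity."""
    rng = np.random.default_rng(seed)
    neg_scores = scores[y_true == 0]
    cutoffs = []
    for _ in range(n_bootstrap):
        # Resample the negative-class scores with replacement.
        sample = rng.choice(neg_scores, size=neg_scores.size, replace=True)
        # Calling "positive" when score > cutoff leaves about
        # target_specificity of the negatives at or below the cutoff.
        cutoffs.append(np.quantile(sample, target_specificity))
    return float(np.median(cutoffs))


# Synthetic scores: negatives centred at 0, positives centred at 2.
rng = np.random.default_rng(1)
y_true = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])
scores = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])

cutoff = specificity_cutoff(y_true, scores)
pred = scores > cutoff
specificity = np.mean(~pred[y_true == 0])
sensitivity = np.mean(pred[y_true == 1])
print(f"cutoff={cutoff:.3f} specificity={specificity:.3f} sensitivity={sensitivity:.3f}")
```

Aggregating over resamples makes the chosen cutoff less dependent on a handful of extreme negative-class scores than a single quantile would be.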

Saving results

# Save at fit time
maker = RobustModelMaker(
    alg="eln", task_type="binary",
    save_results=True,
    output_dir="results/",
    output_prefix="my_model",
    random_state=42,
).fit(X, y)

# Or save afterwards
maker.save_results(output_dir="results/", output_prefix="my_model")

Saves: JSON metadata, full pickle, per-fold score CSVs, stability selection table, and a formatted text summary.

External validation

maker = RobustModelMaker(alg="eln", task_type="binary", random_state=42)
maker.fit(X_train, y_train, X_validation=X_val, y_validation=y_val)

val = maker.result_.validation_result
print(val.metrics)   # auc, accuracy, sensitivity, specificity, ...
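For readers unfamiliar with the metrics named in the comment above, this small standalone sketch shows how accuracy, sensitivity, and specificity follow from paired labels. The `binary_metrics` helper and the toy labels are illustrative only and are not part of the RobustModelMaker API.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from paired 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy held-out labels and predictions (illustrative values only).
y_val = [1, 1, 1, 0, 0, 0, 0, 1]
y_hat = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_val, y_hat)
print(metrics)  # all three metrics equal 0.75 for this toy example
```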

Permutation importance

pi = maker.permutation_importance(X_val, y_val, n_repeats=20, random_state=42)
print(pi.summary().head(10))
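The underlying technique can be tried directly with scikit-learn's `permutation_importance`. This is a sketch on synthetic data with a plain logistic regression, assumed here for illustration; it does not show how ROBUST's own `permutation_importance` method is implemented.

```python
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# With shuffle=False the 3 informative features are columns 0-2.
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, n_clusters_per_class=1, class_sep=2.0,
                           shuffle=False, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Shuffle one column at a time on held-out data and record the drop in AUC.
result = permutation_importance(model, X_val, y_val, scoring="roc_auc",
                                n_repeats=20, random_state=42)
ranking = result.importances_mean.argsort()[::-1]
for i in ranking[:3]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```

Computing the importances on held-out data, as in the call above, measures how much the model relies on each feature for generalisation instead of for fitting the training set.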

SHAP integration

import shap

# Export the fitted model and the matching feature matrix
shap_data = maker.result_.export_shap_ready(X)

# LinearExplainer suits linear models such as eln
explainer = shap.LinearExplainer(shap_data["model"], shap_data["X"])
shap_values = explainer.shap_values(shap_data["X"])
shap.summary_plot(shap_values, shap_data["X"])

Grouped cross-validation

maker = RobustModelMaker(alg="eln", task_type="regression", random_state=42)
maker.fit(X, y, groups=sample_ids)   # prevents leakage across experimental units
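The effect of grouping can be checked with scikit-learn's `GroupKFold`, which is used here purely as an illustration; whether ROBUST uses this splitter internally is not stated above. The sample identifiers and data are made up for the example.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
y = rng.normal(size=12)
# Three replicate measurements per experimental unit (hypothetical IDs).
groups = np.repeat(["s1", "s2", "s3", "s4"], 3)

overlaps = []
for train_idx, test_idx in GroupKFold(n_splits=4).split(X, y, groups=groups):
    # No group may appear on both sides of a split.
    shared = set(groups[train_idx]) & set(groups[test_idx])
    overlaps.append(shared)
    print("test groups:", sorted(set(groups[test_idx])), "shared:", shared)
```

Every fold reports an empty set of shared groups, so replicates of the same experimental unit never straddle the train/test boundary.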

Benchmark results

Three real scientific datasets were evaluated against a full-feature nested-CV baseline that uses the same algorithm and fold structure. All three benchmarks use random forest (rf) for both ROBUST and the baseline, which isolates the effect of bootstrap stability selection from any algorithm differences:

Dataset	Task	Training samples x features	ROBUST features	Reduction	Baseline score	ROBUST score	Wilcoxon p-value	Outcome
SECOM Manufacturing	binary	1253 x 590	301	49.0%	0.6814 AUC	0.6835 AUC	0.770	preserved
Urban Land Cover	multiclass	540 x 147	66	55.1%	0.9827 AUC	0.9849 AUC	0.432	preserved
Graphene Oxide Bulk	regression	1293 x 309	150	51.5%	0.0266 RMSE	0.0343 RMSE	0.193	preserved

Classification metrics are AUC-ROC (binary) and weighted OVR AUC (multiclass), higher is better. Regression metric is RMSE in eV, lower is better. The p-value column is from the paired Wilcoxon signed-rank test on per-fold scores. Across all three tasks ROBUST roughly halves the feature count with no statistically significant change in performance, yielding score-per-feature efficiency gains of 1.97x to 2.66x.
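The reduction percentages in the table follow directly from the feature counts, as this short check shows. The counts are copied from the table above; the dictionary layout is just for the example.

```python
# (total features, features retained by ROBUST), copied from the benchmark table
benchmarks = {
    "SECOM Manufacturing": (590, 301),
    "Urban Land Cover": (147, 66),
    "Graphene Oxide Bulk": (309, 150),
}

reductions = {}
for name, (p_total, p_selected) in benchmarks.items():
    # Percentage of the original features that were removed.
    reductions[name] = 100 * (p_total - p_selected) / p_total
    print(f"{name}: {reductions[name]:.1f}% fewer features")
```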

An outcome of "preserved" is the primary success criterion: the stability-selected feature subset shows no statistically significant difference in performance from the full-feature baseline (paired Wilcoxon, p >= 0.05) while using a fraction of the features. The selected features are chosen for robustness across bootstrap resamples of the training data, not for optimality in any single model fit, so a small non-significant performance difference from the baseline is the expected and intended outcome.
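The paired comparison can be reproduced in outline with `scipy.stats.wilcoxon`. The per-fold scores below are hypothetical numbers invented for illustration, not results from the benchmarks, and the 0.05 cut-off mirrors the criterion stated above.

```python
from scipy.stats import wilcoxon

# Hypothetical per-fold AUCs from ten outer folds (illustrative values only).
baseline = [0.681, 0.702, 0.655, 0.690, 0.671, 0.664, 0.699, 0.688, 0.676, 0.660]
reduced = [0.685, 0.698, 0.661, 0.687, 0.675, 0.662, 0.704, 0.684, 0.679, 0.663]

# Two-sided signed-rank test on the fold-wise differences (paired by fold).
stat, p_value = wilcoxon(reduced, baseline)
outcome = "preserved" if p_value >= 0.05 else "significant difference"
print(f"W={stat:.1f}, p={p_value:.3f} -> {outcome}")
```

The test pairs scores fold by fold, so it only makes sense when both models were evaluated on identical splits.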

Note on split methodology: All benchmarks use BenchMake archetypal splits, which are adversarial by design. BenchMake selects maximally representative train/test partitions that keep the two sets apart in feature space, producing more conservative scores than conventional random splits. This is intentional: the benchmark is a worst-case assessment. Scores on your own data with default random splits will typically be higher. The ROBUST vs. full-feature baseline comparison within each benchmark is internally consistent because both models use the same split.

Citing this work

Barnard, A. S. (2026). RobustModelMaker: A reproducible stability-selection pipeline
for scientific machine learning (v0.3). GitHub: https://github.com/amaxiom/RobustModelMaker

Documentation

Full documentation is available in the GitHub repository:

User Guide: parameters, methods, prediction, validation, SHAP, saving
Implementation Guide: internal design, algorithm details, tuning for speed and rigor
Interpretation Guide: reading results correctly, statistical tests, what to report in a paper

Author

Prof Amanda S Barnard (GitHub: amaxiom)

RobustModelMaker is developed and maintained as a tool for rigorous, reproducible machine learning in scientific research.

Release history

0.3.2 (this version): May 20, 2026
0.3.1: May 18, 2026
0.3.0: May 12, 2026
