A reproducible stability-selection pipeline for scientific machine learning

RobustModelMaker

License: MIT

A reproducible model-building pipeline for small-to-medium scientific datasets.

RobustModelMaker (ROBUST) combines bootstrap stability selection with leakage-safe nested cross-validation to identify a stable, minimal feature subset and produce honest performance estimates. It is designed for scientific datasets where reproducibility, interpretability, and honest generalisation estimates matter as much as raw predictive performance.


Why RobustModelMaker?

Standard machine learning pipelines applied to scientific data suffer from two problems that ROBUST addresses directly:

Optimistic performance estimates. When feature selection, hyperparameter tuning, and model evaluation share the same data, the reported score is optimistically biased: it measures fit to the data used for model building rather than performance on unseen data. ROBUST uses strict nested cross-validation, in which feature selection and hyperparameter tuning are performed entirely on the training partition of each outer fold. The test partition is used only to evaluate the final fold model, never to inform any modelling decision.
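The nested structure described above can be sketched with plain scikit-learn. This is an illustration of the general technique, not ROBUST's internal code: the synthetic data, the L2 logistic regression, and the `C` grid are all assumptions chosen for the example, and a feature selection step would be one more stage of the same pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a small scientific dataset (assumption for this sketch)
X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           random_state=42)

# Preprocessing lives inside the pipeline, so it is refitted on every
# training partition and never sees the corresponding test partition.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Inner loop: hyperparameter tuning on the training partition only
search = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]},
                      cv=inner_cv, scoring="roc_auc")

# Outer loop: each test partition is used once, only to score the tuned model
scores = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")
print(f"Nested CV AUC: {scores.mean():.4f} +/- {scores.std():.4f}")
```

Passing the whole `GridSearchCV` object to `cross_val_score` is what makes the evaluation nested: tuning is repeated from scratch inside every outer training partition.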

Unstable feature selection. Single-run feature selection produces a feature set that can change substantially with small changes in the data. ROBUST runs bootstrap stability selection: features are ranked by how consistently they are selected across hundreds of random subsamples of the training data. Only features that exceed a stability threshold (70% of bootstrap runs by default) are retained.
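A minimal sketch of the idea, assuming a Lasso base selector and half-size subsamples drawn without replacement. Both choices, the function name `selection_frequencies`, and the synthetic data are assumptions made for illustration; ROBUST's own selector and resampling scheme may differ.

```python
import numpy as np
from sklearn.linear_model import Lasso

def selection_frequencies(X, y, n_bootstrap=100, subsample=0.5,
                          alpha=0.3, random_state=0):
    """Fraction of subsamples in which each feature gets a non-zero Lasso coefficient."""
    rng = np.random.default_rng(random_state)
    n_samples, n_features = X.shape
    n_draw = int(subsample * n_samples)
    counts = np.zeros(n_features)
    for _ in range(n_bootstrap):
        idx = rng.choice(n_samples, size=n_draw, replace=False)
        model = Lasso(alpha=alpha).fit(X[idx], y[idx])
        counts += np.abs(model.coef_) > 1e-8
    return counts / n_bootstrap

# Synthetic data: only the first three of twenty features carry signal
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(size=200)

freqs = selection_frequencies(X, y, n_bootstrap=100, random_state=42)
stable = np.flatnonzero(freqs >= 0.7)  # same 70% threshold as the default above
print("Stable features:", stable.tolist())
```

In a leakage-safe pipeline this function would be called on the training partition of each fold, never on the full dataset.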

The result is a model built on a smaller, more reproducible feature set whose estimated performance is trustworthy.


Installation

pip install robustmodelmaker

For XGBoost support:

pip install robustmodelmaker[xgb]

Requirements: Python >= 3.9, numpy, pandas, scikit-learn, scipy


Quick start

import pandas as pd
from robustmodelmaker import RobustModelMaker

X = pd.read_csv("features.csv")
y = pd.read_csv("labels.csv").squeeze()

maker = RobustModelMaker(
    alg="eln",           # elastic net: interpretable and fast
    task_type="binary",  # always set explicitly: "binary", "multiclass", or "regression"
    outer_cv=5,
    inner_cv=5,
    n_bootstrap=100,
    stability_threshold=0.7,
    random_state=42,
).fit(X, y)

result = maker.result_
print(f"Selected {len(result.selected_features)} of {len(result.feature_names)} features")
print(f"Nested CV AUC: {result.mean_score:.4f} +/- {result.std_score:.4f}")

# Predict on new data (preprocessing and feature selection applied automatically).
# X_new is a DataFrame of unseen samples with the same columns as X.
predictions = maker.predict(X_new)
probabilities = maker.predict_proba(X_new)

The functional API is also available:

from robustmodelmaker import run_pipeline

result = run_pipeline(X, y, alg="eln", task_type="binary",
                      outer_cv=5, inner_cv=5, random_state=42)

Algorithms

| Code | Model | Tasks | Notes |
|------|-------|-------|-------|
| eln | Elastic net | all | Fastest; coefficient-based importance; auto-scales |
| rdg | Ridge (L2) | all | Stable; good default for many scientific datasets |
| las | Lasso (L1) | all | Sparse coefficients; strong feature selector |
| log | L2 logistic regression | classification | Reliable baseline |
| svm | Linear SVM | all | Effective in high-dimensional spaces |
| rf | Random forest | all | Non-linear; no scaling needed; class_weight balanced |
| xgb | XGBoost | all | Highest raw performance; requires pip install robustmodelmaker[xgb] |
| mlp | Multi-layer perceptron | all | Neural baseline; slower on small datasets |
| lin | Linear regression (OLS) | regression only | Interpretable; no regularisation |

Key capabilities

| Capability | Detail |
|------------|--------|
| Task types | Binary classification, multiclass classification, regression |
| Feature selection | Bootstrap stability selection with configurable threshold and bootstrap count |
| Performance estimation | Nested CV (outer + inner), repeated nested CV, grouped CV |
| Preprocessing | Median imputation + optional standard scaling, fitted inside each fold |
| Missing data | NaN-tolerant by default; optional data-driven missingness filter |
| Cutoff determination | Bootstrap specificity-targeted threshold for binary classification |
| Probability calibration | Platt scaling (sigmoid) or isotonic regression |
| Post-hoc analysis | Permutation importance, SHAP-ready export, feature stability plots |
| External validation | One-call evaluation on a held-out set with full metric suite |
| Reproducibility | Fully deterministic given a fixed random seed, verified by test suite |
| Save/load | JSON metadata, CSV tables, and pickle of the fitted result |
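The cutoff determination entry can be made concrete with a small sketch. This is one plausible reading of a bootstrap specificity-targeted threshold, not necessarily the procedure ROBUST implements: the function name `specificity_cutoff`, the use of the median across resamples, and the synthetic scores are all assumptions.

```python
import numpy as np

def specificity_cutoff(y_true, scores, target_specificity=0.9,
                       n_bootstrap=200, random_state=0):
    """Score cutoff such that roughly `target_specificity` of negatives fall at or below it."""
    rng = np.random.default_rng(random_state)
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    negatives = scores[y_true == 0]
    cutoffs = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        # Resample the negative-class scores with replacement
        sample = rng.choice(negatives, size=negatives.size, replace=True)
        cutoffs[b] = np.quantile(sample, target_specificity)
    # The median over resamples damps the influence of any single draw
    return float(np.median(cutoffs))

# Synthetic predicted probabilities with overlapping classes
rng = np.random.default_rng(42)
y_true = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])
scores = np.concatenate([rng.uniform(0.0, 0.6, 500), rng.uniform(0.4, 1.0, 500)])

cutoff = specificity_cutoff(y_true, scores, target_specificity=0.9, random_state=42)
predicted_positive = scores > cutoff
specificity = float(np.mean(~predicted_positive[y_true == 0]))
print(f"cutoff={cutoff:.3f}, achieved specificity={specificity:.3f}")
```

To stay leakage-safe, the cutoff would be derived from training-partition predictions and then applied unchanged to the test partition.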

Saving results

# Save at fit time
maker = RobustModelMaker(
    alg="eln", task_type="binary",
    save_results=True,
    output_dir="results/",
    output_prefix="my_model",
    random_state=42,
).fit(X, y)

# Or save afterwards
maker.save_results(output_dir="results/", output_prefix="my_model")

Saves: JSON metadata, full pickle, per-fold score CSVs, stability selection table, and a formatted text summary.


External validation

maker = RobustModelMaker(alg="eln", task_type="binary", random_state=42)
maker.fit(X_train, y_train, X_validation=X_val, y_validation=y_val)

val = maker.result_.validation_result
print(val.metrics)   # auc, accuracy, sensitivity, specificity, ...

Permutation importance

pi = maker.permutation_importance(X_val, y_val, n_repeats=20, random_state=42)
print(pi.summary().head(10))

SHAP integration

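import shap

shap_data = maker.result_.export_shap_ready(X)

# LinearExplainer suits linear models such as alg="eln";
# tree-based models (rf, xgb) would use shap.TreeExplainer instead.
explainer = shap.LinearExplainer(shap_data["model"], shap_data["X"])
shap_values = explainer.shap_values(shap_data["X"])
shap.summary_plot(shap_values, shap_data["X"])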

Grouped cross-validation

maker = RobustModelMaker(alg="eln", task_type="regression", random_state=42)
maker.fit(X, y, groups=sample_ids)   # prevents leakage across experimental units
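To show what the `groups` argument protects against, here is a short scikit-learn sketch using `GroupKFold`. It illustrates the general technique on synthetic data and assumes nothing about how ROBUST splits groups internally.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Ten experimental units with four repeated measurements each (synthetic)
rng = np.random.default_rng(42)
groups = np.repeat(np.arange(10), 4)
X = rng.normal(size=(40, 5))
y = rng.normal(size=40)

cv = GroupKFold(n_splits=5)
overlaps = []
for fold, (train_idx, test_idx) in enumerate(cv.split(X, y, groups=groups)):
    # Units that appear on both sides of the split; should always be empty
    shared = set(groups[train_idx]) & set(groups[test_idx])
    overlaps.append(shared)
    print(f"fold {fold}: test units = {sorted(set(groups[test_idx]))}")
```

Because all measurements from a unit land on the same side of every split, the model is never scored on a unit it has already seen during training.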

Benchmark results

Three real scientific datasets evaluated against a full-feature nested-CV baseline using the same algorithm and fold structure:

| Dataset | Task | n x p | ROBUST feats | Reduction | Metric | Outcome |
|---------|------|-------|--------------|-----------|--------|---------|
| SECOM Manufacturing | binary | 1254 x 590 | ~47 | ~92% | AUC (higher=better) | preserved |
| Urban Land Cover | multiclass | 540 x 147 | ~31 | ~79% | AUC-OVR (higher=better) | preserved |
| Graphene Oxide Bulk | regression | 1294 x 412 | ~68 | ~83% | RMSE in eV (lower=better) | preserved |

"preserved" means that the performance difference between ROBUST and the full-feature baseline is not statistically significant (paired Wilcoxon signed-rank test, p >= 0.05). In all three benchmarks ROBUST achieves comparable predictive performance using roughly 8% to 21% of the available features.
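A paired comparison of this kind can be run with `scipy.stats.wilcoxon`. The scores below are randomly generated placeholders, not the benchmark results, and the variable names are illustrative only.

```python
import numpy as np
from scipy.stats import wilcoxon

# Placeholder paired scores, one pair per outer fold of a repeated nested CV.
# These are synthetic numbers for illustration, NOT the benchmark results.
rng = np.random.default_rng(42)
baseline_scores = rng.normal(loc=0.80, scale=0.03, size=10)
robust_scores = baseline_scores + rng.normal(loc=0.0, scale=0.01, size=10)

# Two-sided paired Wilcoxon signed-rank test on the per-fold differences
result = wilcoxon(robust_scores, baseline_scores)
p_value = float(result.pvalue)

outcome = "preserved" if p_value >= 0.05 else "different"
print(f"p = {p_value:.3f} -> {outcome}")
```

Both score vectors must come from the same folds so that the pairing is valid. Note that with only five paired scores the smallest attainable two-sided exact p-value is 0.0625, so more pairs (for example from repeated nested CV) are needed before the test can reject at the 0.05 level.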

Note on split methodology: All benchmarks use BenchMake archetypal splits, which are adversarial by design. BenchMake selects maximally representative train/test partitions that keep the two sets apart in feature space, producing more conservative scores than conventional random splits. This is intentional: the benchmark is a worst-case assessment. Scores on your own data with default random splits will typically be higher. The ROBUST vs. full-feature baseline comparison within each benchmark is internally consistent because both models use the same split.


Citing this work

Barnard, A. S. (2026). RobustModelMaker: A reproducible stability-selection pipeline
for scientific machine learning (v0.3). GitHub: https://github.com/amaxiom/RobustModelMaker

Documentation

Full documentation is available in the GitHub repository:

  • User Guide: parameters, methods, prediction, validation, SHAP, saving
  • Implementation Guide: internal design, algorithm details, tuning for speed and rigor
  • Interpretation Guide: reading results correctly, statistical tests, what to report in a paper

Author

Prof Amanda S Barnard (GitHub: amaxiom)

RobustModelMaker is developed and maintained as a tool for rigorous, reproducible machine learning in scientific research.
