Minerva-OPT
Hyperparameter optimization extensions for Minerva, powered by Ray Tune.
Description
Minerva-OPT provides a RayHyperParameterSearch pipeline that wraps Ray Tune and PyTorch Lightning to run distributed hyperparameter searches on top of any Minerva-compatible model. It supports random search, grid search, and Bayesian optimization (via HyperOpt), with early stopping through the ASHA scheduler.
Features
- Drop-in Minerva pipeline: inherits from `minerva.pipelines.base.Pipeline`, so it integrates with Minerva's logging, reproducibility, and run-status tracking out of the box.
- Flexible search algorithms: use Ray Tune's default random/grid search or pass any `ray.tune.search.Searcher` (e.g. `HyperOptSearch` for Bayesian optimization).
- ASHA early stopping: trials are stopped early based on intermediate results; `grace_period` and `max_t` are derived automatically from `max_epochs`.
- Distributed training: uses `RayDDPStrategy` and `RayLightningEnvironment` for multi-worker trials.
- Checkpointing: configurable number of checkpoints per trial, scored on the target metric.
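To make the ASHA bullet concrete, here is a minimal, library-free sketch of how successive-halving rungs follow from `grace_period` and `max_t`. It is not Minerva-OPT's or Ray Tune's code: the names `asha_rungs` and `survives`, the reduction factor of 4, and the cutoff rule are illustrative assumptions. This README does not say how the two values are derived from `max_epochs`, so the sketch takes them as plain inputs.

```python
def asha_rungs(grace_period, max_t, reduction_factor=4):
    """Epochs at which trials are compared (assumed geometric spacing)."""
    rungs = []
    t = grace_period
    while t < max_t:
        rungs.append(t)
        t *= reduction_factor
    return rungs


def survives(value, seen_at_rung, reduction_factor=4, mode="min"):
    """True if `value` ranks in the top 1/reduction_factor at this rung.

    `seen_at_rung` holds the metric values other trials reported at the
    same rung; the new value is ranked together with them.
    """
    scores = sorted(seen_at_rung + [value], reverse=(mode == "max"))
    keep = max(1, len(scores) // reduction_factor)
    cutoff = scores[keep - 1]
    return value <= cutoff if mode == "min" else value >= cutoff


rungs = asha_rungs(grace_period=1, max_t=100)
```

With these inputs the rungs fall at epochs 1, 4, 16, and 64, so a weak trial can be stopped after a single epoch instead of consuming the full budget.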
Installation
Requires Python 3.10+.
With uv (recommended)
uv pip install minerva-opt
With pip
pip install minerva-opt
To use Bayesian optimization via HyperOpt, install the optional extra:
pip install "minerva-opt[hyperopt]"
Usage
Random / grid search
from ray import tune
from minerva_opt.pipelines.hyperparameter_search import RayHyperParameterSearch
search_space = {
    "learning_rate": tune.loguniform(1e-4, 1e-1),
    "hidden_size": tune.choice([64, 128, 256]),
}
pipeline = RayHyperParameterSearch(
    model=MyLightningModel,  # class, not instance; instantiated per trial as MyLightningModel(**config)
    search_space=search_space,
    log_dir="logs/",
)
results = pipeline.run(data=my_data_module, num_samples=20, max_epochs=50)
best = results.get_best_result()
print(best.config)
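The following standard-library sketch illustrates what random search over the space above amounts to: each trial draws one config, and the model class is instantiated with that config as keyword arguments. `sample_loguniform`, `sample_config`, and `ToyModel` are hypothetical stand-ins written for this illustration; they are not part of Ray Tune or Minerva-OPT, and the real sampling is done by Ray Tune.

```python
import math
import random


def sample_loguniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_config(rng):
    """One draw from a space shaped like the search_space above."""
    return {
        "learning_rate": sample_loguniform(rng, 1e-4, 1e-1),
        "hidden_size": rng.choice([64, 128, 256]),
    }


class ToyModel:
    """Hypothetical model; the pipeline is given the class, not an instance."""

    def __init__(self, learning_rate, hidden_size):
        self.learning_rate = learning_rate
        self.hidden_size = hidden_size


rng = random.Random(0)
configs = [sample_config(rng) for _ in range(20)]    # mirrors num_samples=20
models = [ToyModel(**config) for config in configs]  # one fresh model per trial
```

Sampling the learning rate on a log scale spreads trials evenly across orders of magnitude, which is usually what one wants for a range like 1e-4 to 1e-1.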
Bayesian optimization with HyperOpt
from ray import tune
from ray.tune.search.hyperopt import HyperOptSearch
from minerva_opt.pipelines.hyperparameter_search import RayHyperParameterSearch

search_space = {
    "learning_rate": tune.loguniform(1e-4, 1e-1),
    "dropout": tune.uniform(0.1, 0.5),
}
pipeline = RayHyperParameterSearch(
    model=MyLightningModel,
    search_space=search_space,
)
results = pipeline.run(
    data=my_data_module,
    search_alg=HyperOptSearch(),
    num_samples=30,
    max_epochs=100,
    tuner_metric="val_loss",
    tuner_mode="min",
)
Testing the best model after search
# Run search first, then test using the best checkpoint automatically
results = pipeline.run(data=my_data_module, num_samples=20, max_epochs=50)
pipeline.run(data=my_data_module, task="test")
# Or test from an explicit checkpoint path
pipeline.run(data=my_data_module, task="test", ckpt_path="path/to/model.ckpt")
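How the pipeline locates the best checkpoint internally is not described here. As a rough mental model only, selecting the best trial by the target metric could look like the sketch below; `pick_best` and the trial dictionaries are hypothetical and do not reflect Minerva-OPT's or Ray Tune's data structures.

```python
def pick_best(trials, metric="val_loss", mode="min"):
    """Return the trial whose final metric is best under `mode`."""
    chooser = min if mode == "min" else max
    return chooser(trials, key=lambda trial: trial["metrics"][metric])


# Made-up trial records for illustration.
trials = [
    {"config": {"learning_rate": 1e-2}, "metrics": {"val_loss": 0.41}, "checkpoint": "trial_0/ckpt"},
    {"config": {"learning_rate": 1e-3}, "metrics": {"val_loss": 0.35}, "checkpoint": "trial_1/ckpt"},
    {"config": {"learning_rate": 1e-4}, "metrics": {"val_loss": 0.52}, "checkpoint": "trial_2/ckpt"},
]
best_trial = pick_best(trials, metric="val_loss", mode="min")
```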
Resuming an interrupted search
# Resume from the Ray experiment directory saved under log_dir
pipeline.run(
    data=my_data_module,
    restore_path="logs/TorchTrainer_2024-01-01_00-00-00",
)
Using a data factory (recommended for long searches)
Passing a factory callable avoids deepcopying the data module for every trial, which is safer for data modules with file handles or non-picklable state:
pipeline.run(
    data=my_data_module,  # still required for _test
    data_factory=lambda: MyDataModule(root="data/"),
)
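The difference between copying and rebuilding can be seen with a small standard-library sketch. `ToyDataModule`, `copy_per_trial`, and `build_per_trial` are hypothetical names used only for this illustration; the lock stands in for a file handle or other state that cannot be deep-copied.

```python
import copy
import threading


class ToyDataModule:
    """Hypothetical data module holding state that cannot be deep-copied."""

    def __init__(self, root):
        self.root = root
        self._lock = threading.Lock()  # stand-in for a file handle or similar


def copy_per_trial(module, num_trials):
    """Copy one shared instance per trial; fails on non-copyable state."""
    return [copy.deepcopy(module) for _ in range(num_trials)]


def build_per_trial(factory, num_trials):
    """Call the factory once per trial so each trial gets a fresh object."""
    return [factory() for _ in range(num_trials)]


try:
    copy_per_trial(ToyDataModule(root="data/"), 2)
    deepcopy_failed = False
except TypeError:
    # deepcopy cannot duplicate the lock held by the module
    deepcopy_failed = True

modules = build_per_trial(lambda: ToyDataModule(root="data/"), 2)
```

The factory path never touches the original instance, so each trial starts from cleanly constructed state.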
Key run() parameters
| Parameter | Default | Description |
|---|---|---|
| `data` | (required) | `LightningDataModule` for training and testing |
| `task` | `"search"` | `"search"` to run the sweep, `"test"` to evaluate the best checkpoint |
| `ckpt_path` | `None` | Warm-start all trials from a checkpoint (search) or eval path (test) |
| `data_factory` | `None` | Callable that returns a fresh `LightningDataModule` per trial |
| `num_samples` | `10` | Number of trials to run |
| `max_epochs` | `100` | Max epochs per trial |
| `tuner_metric` | `"val_loss"` | Metric to optimize |
| `tuner_mode` | `"min"` | `"min"` or `"max"` |
| `search_alg` | `None` | Any `ray.tune.search.Searcher`; `None` = random search |
| `max_concurrent` | `4` | Max concurrent trials (when using a `search_alg`) |
| `scheduler` | ASHA | Override the trial scheduler |
| `scaling_config` | Auto-detected (GPU or CPU) | Override Ray `ScalingConfig` |
| `resources_per_worker` | `{"GPU": 1}` when GPU detected | Custom resource dict, e.g. `{"GPU": 0.5}` for fractional GPU |
| `checkpoint_interval` | `1` | Save a checkpoint every N epochs |
| `num_checkpoints_to_keep` | `1` | Number of top checkpoints to retain per trial |
| `restore_path` | `None` | Path to a Ray experiment dir to resume an interrupted search |
| `debug_mode` | `False` | Disable checkpointing for fast iteration |
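As an illustration of how `checkpoint_interval` and `num_checkpoints_to_keep` interact, the sketch below simulates one trial: a checkpoint is written every N epochs and only the best-scoring ones are retained. `retained_checkpoints` is a hypothetical helper and the metric values are made up; the actual retention is handled by Ray's checkpointing, not by this code.

```python
def retained_checkpoints(metric_per_epoch, checkpoint_interval=1,
                         num_checkpoints_to_keep=1, mode="min"):
    """Return the (epoch, metric) pairs still on disk at the end, best first."""
    kept = []
    for epoch, metric in enumerate(metric_per_epoch, start=1):
        if epoch % checkpoint_interval != 0:
            continue  # no checkpoint is written at this epoch
        kept.append((epoch, metric))
        kept.sort(key=lambda item: item[1], reverse=(mode == "max"))
        del kept[num_checkpoints_to_keep:]  # drop everything below the top k
    return kept


val_loss = [0.90, 0.70, 0.75, 0.60, 0.65, 0.62]  # made-up values, one per epoch
top_two = retained_checkpoints(val_loss, checkpoint_interval=2,
                               num_checkpoints_to_keep=2, mode="min")
```

Here checkpoints are written at epochs 2, 4, and 6, and the two with the lowest `val_loss` (epochs 4 and 6) survive.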
GPU detection: when `scaling_config` is not provided, the pipeline auto-detects GPU availability. A `UserWarning` is emitted if falling back to CPU. Pass an explicit `scaling_config` to suppress it.
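The fallback behavior described above can be pictured with the following standard-library sketch. `resolve_scaling`, its `gpu_available` flag, and the returned dictionaries are hypothetical; the real pipeline builds a Ray `ScalingConfig`, and its warning text may differ.

```python
import warnings


def resolve_scaling(scaling_config=None, gpu_available=False):
    """Hypothetical resolution of worker resources for a search."""
    if scaling_config is not None:
        return scaling_config  # explicit config: no detection, no warning
    if gpu_available:
        return {"use_gpu": True, "resources_per_worker": {"GPU": 1}}
    warnings.warn("No GPU detected; falling back to CPU.", UserWarning)
    return {"use_gpu": False, "resources_per_worker": {"CPU": 1}}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    auto = resolve_scaling()                          # warns once: CPU fallback
    explicit = resolve_scaling({"use_gpu": False})    # silent: config was given
```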
Requirements
- `minerva >= 0.3.10b0`
- `ray[tune] >= 2.55`
- `hyperopt >= 0.2.7` (optional; only needed for `HyperOptSearch`)
License
MIT License. See LICENSE for details.
Contact
For questions or bug reports, open an issue on the GitHub issue tracker.