Hyperactive
A hyperparameter optimization and meta-learning toolbox for convenient and fast prototyping of machine-learning models.
NEWS:
Hyperactive is currently in a transition phase between version 2 and 3. The source code for the optimization algorithms is moving to Gradient-Free-Optimizers, which will serve both as an easy-to-use standalone package and as the optimization backend of Hyperactive in the future. Until Hyperactive version 3 is released, you can either switch to the 2.x.x branch on GitHub for the old version or use Gradient-Free-Optimizers to enjoy the new algorithms and improved performance.
Hyperactive is primarily a hyperparameter optimization toolkit that aims to simplify the model-selection and -tuning process. You can use any machine- or deep-learning package, and there is no need to learn new syntax. Hyperactive offers high versatility in model optimization because of two characteristics:
- You can define any kind of model in the objective function. It just has to return a score/metric that gets maximized.
- The search space accepts not just 'int', 'float' or 'str' as data types but even functions, classes or any other Python objects, as the sketch below illustrates.
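For example, a search space can mix integer ranges with plain Python functions; the optimizer hands the selected object to the objective function unchanged. A minimal sketch (the two preprocessing helpers are hypothetical, made up for illustration):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
from hyperactive import Hyperactive

# toy regression data, just to keep the sketch self-contained
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

def no_scaling(X_in):
    # leave the features unchanged
    return X_in

def standard_scaling(X_in):
    # standardize each feature to zero mean and unit variance
    return (X_in - X_in.mean(axis=0)) / X_in.std(axis=0)

def model(opt):
    # the selected search-space entry is an ordinary Python callable
    X_prep = opt["preprocessing"](X)
    gbr = GradientBoostingRegressor(n_estimators=opt["n_estimators"])
    return cross_val_score(gbr, X_prep, y, cv=3).mean()

search_space = {
    "n_estimators": list(range(10, 200, 5)),          # int values
    "preprocessing": [no_scaling, standard_scaling],  # functions as values
}

hyper = Hyperactive()
hyper.add_search(model, search_space, n_iter=20)
hyper.run()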
Main features • Installation • Roadmap • Citation • License
Hyperactive features a collection of optimization algorithms that can be used for a variety of optimization problems. The following table gives an overview of the capabilities of Hyperactive, where each item links to an example:
Optimization Techniques | Tested and Supported Packages | Optimization Applications
---|---|---
Local Search | Machine Learning | Feature Engineering
Global Search | Deep Learning |
Population Methods | Distribution |
Sequential Methods | |
Installation
The most recent version of Hyperactive is available on PyPI:
pip install hyperactive
Minimal example
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.datasets import load_boston
from hyperactive import Hyperactive

# note: load_boston was removed in scikit-learn 1.2; substitute another regression dataset on newer versions
data = load_boston()
X, y = data.data, data.target

""" define the model in a function """
def model(opt):
    """ pass the suggested parameter to the machine learning model """
    gbr = GradientBoostingRegressor(
        n_estimators=opt["n_estimators"]
    )
    scores = cross_val_score(gbr, X, y, cv=3)

    """ return a single numerical value, which gets maximized """
    return scores.mean()

"""
create the search space
determines the ranges of parameters you want the optimizer to search through
"""
search_space = {"n_estimators": list(range(10, 200, 5))}

""" start the optimization run """
hyper = Hyperactive()
hyper.add_search(model, search_space, n_iter=50)
hyper.run()
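Once run() returns, the collected search data can be inspected. A minimal sketch, assuming the best_para and results accessors of the Hyperactive class (check your installed version):

# a sketch: verify these accessor names against your installed Hyperactive version
best = hyper.best_para(model)       # best parameter set found for this objective
search_data = hyper.results(model)  # search data of all evaluated positions
print("best parameters:", best)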
Hyperactive API information
Hyperactive(...)
- verbosity = ["progress_bar", "print_results", "print_times"]
- distribution = {"multiprocessing": {"initializer": tqdm.set_lock, "initargs": (tqdm.get_lock(),),}}
.add_search(...)
- model
- search_space
- n_iter
- optimizer = RandomSearchOptimizer()
- n_jobs = 1
- initialize = {"grid": 4, "random": 2, "vertices": 4}
- max_score = None
- random_state = None
- memory = True
- memory_warm_start = None
.run(...)
- max_time = None
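Putting these parameters together, a fully spelled-out call looks like the sketch below, with every argument set to the default listed above (model and search_space are taken from the minimal example; the hyperactive.optimizers import path is an assumption):

from hyperactive import Hyperactive
from hyperactive.optimizers import RandomSearchOptimizer  # import path assumed

hyper = Hyperactive(verbosity=["progress_bar", "print_results", "print_times"])
hyper.add_search(
    model,
    search_space,
    n_iter=50,
    optimizer=RandomSearchOptimizer(),
    n_jobs=1,                                            # number of parallel jobs for this search
    initialize={"grid": 4, "random": 2, "vertices": 4},  # how the initial positions are chosen
    max_score=None,                                      # optional early-stopping score
    random_state=None,                                   # seed for reproducible runs
    memory=True,                                         # remember evaluated positions
    memory_warm_start=None,                              # search data from a previous run
)
hyper.run(max_time=None)  # optional time budget for the whole run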
Optimizers
HillClimbingOptimizer
- epsilon=0.05
- distribution="normal"
- n_neighbours=3
- rand_rest_p=0.03
RepulsingHillClimbingOptimizer
- epsilon=0.05
- distribution="normal"
- n_neighbours=3
- rand_rest_p=0.03
- repulsion_factor=3
SimulatedAnnealingOptimizer
- epsilon=0.05
- distribution="normal"
- n_neighbours=3
- rand_rest_p=0.03
- p_accept=0.1
- norm_factor="adaptive"
- annealing_rate=0.975
- start_temp=1
RandomSearchOptimizer
RandomRestartHillClimbingOptimizer
- epsilon=0.05
- distribution="normal"
- n_neighbours=3
- rand_rest_p=0.03
- n_iter_restart=10
RandomAnnealingOptimizer
- epsilon=0.05
- distribution="normal"
- n_neighbours=3
- rand_rest_p=0.03
- annealing_rate=0.975
- start_temp=1
ParallelTemperingOptimizer
- n_iter_swap=10
- rand_rest_p=0.03
ParticleSwarmOptimizer
- inertia=0.5
- cognitive_weight=0.5
- social_weight=0.5
- temp_weight=0.2
- rand_rest_p=0.03
EvolutionStrategyOptimizer
- mutation_rate=0.7
- crossover_rate=0.3
- rand_rest_p=0.03
BayesianOptimizer
- gpr=gaussian_process["gp_nonlinear"]
- xi=0.03
- warm_start_smbo=None
- rand_rest_p=0.03
TreeStructuredParzenEstimators
- gamma_tpe=0.5
- warm_start_smbo=None
- rand_rest_p=0.03
DecisionTreeOptimizer
- tree_regressor="extra_tree"
- xi=0.01
- warm_start_smbo=None
- rand_rest_p=0.03
EnsembleOptimizer
- estimators=[
GradientBoostingRegressor(n_estimators=5),
GaussianProcessRegressor(),
]
- xi=0.01
- warm_start_smbo=None
- rand_rest_p=0.03
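An optimizer is configured by passing the parameters above to its class and handing the instance to add_search via the optimizer argument. A minimal sketch using the SimulatedAnnealingOptimizer defaults listed above (the hyperactive.optimizers import path is an assumption):

from hyperactive import Hyperactive
from hyperactive.optimizers import SimulatedAnnealingOptimizer  # import path assumed

optimizer = SimulatedAnnealingOptimizer(
    epsilon=0.05,           # step-size parameter of the underlying hill climbing
    distribution="normal",  # distribution from which steps are drawn
    n_neighbours=3,
    rand_rest_p=0.03,       # probability of a random restart
    p_accept=0.1,
    norm_factor="adaptive",
    annealing_rate=0.975,   # temperature decay per iteration
    start_temp=1,
)

hyper = Hyperactive()
hyper.add_search(model, search_space, n_iter=100, optimizer=optimizer)
hyper.run()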
Roadmap
v2.0.0 ✔
- Change API
v2.1.0 ✔
- Save memory of evaluations for later runs (long term memory)
- Warm start sequence based optimizers with long term memory
- Gaussian process regressors from various packages (gpy, sklearn, GPflow, ...) via wrapper
v2.2.0 ✔
- Add basic dataset meta-features to long term memory
- Add helper-functions for memory
- connect two different model/dataset hashes
- split two different model/dataset hashes
- delete memory of model/dataset
- return best known model for dataset
- return search space for best model
- return best parameter for best model
v2.3.0 ✔
- Tree-structured Parzen Estimator
- Decision Tree Optimizer
- Add "max_sample_size" and "skip_retrain" parameters for SMBO to decrease optimization time
v3.0.0
- New API
- Expand usage of the objective function
- No passing of training data into Hyperactive
- Remove "long term memory" support (better done in a separate package)
- More intuitive selection of optimization strategies and parameters
- Separate optimization algorithms into another package
- Expand the API so that optimizer parameters can be changed at runtime
- Add an extensive testing procedure (similar to Gradient-Free-Optimizers)
Experimental algorithms
The following algorithms are of my own design and, to my knowledge, do not yet exist in the technical literature. If any of these algorithms already exists, please share it with me in an issue.
Random Annealing
A combination of simulated annealing and random search, sketched below.
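My hedged reading of the idea, matching the RandomAnnealingOptimizer parameters listed above (epsilon, annealing_rate, start_temp): sample steps at random as in random search, but scale the step size by a temperature that decays every iteration. The standalone implementation below is illustrative only, not the actual Hyperactive code:

import numpy as np

def random_annealing(objective, bounds, n_iter=100, epsilon=0.05,
                     annealing_rate=0.975, start_temp=1.0, seed=0):
    """Illustrative sketch: random steps whose size shrinks as the temperature anneals."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    span = high - low
    best_pos = rng.uniform(low, high)      # random starting position
    best_score = objective(best_pos)
    temp = start_temp
    for _ in range(n_iter):
        # draw a random step, scaled by the current temperature
        cand = np.clip(best_pos + rng.normal(scale=epsilon * span * temp), low, high)
        score = objective(cand)
        if score > best_score:             # keep the best position (score is maximized)
            best_pos, best_score = cand, score
        temp *= annealing_rate             # cool down: later steps get smaller
    return best_pos, best_score

# toy usage: maximize -(x - 2)^2 on [-5, 5]; the optimum is x = 2
pos, score = random_annealing(lambda x: -(x - 2.0) ** 2, (-5.0, 5.0), n_iter=200)
print(pos, score)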
References
[dto] Scikit-Optimize
Citing Hyperactive
@Misc{hyperactive2019,
  author       = {{Simon Blanke}},
  title        = {{Hyperactive}: A hyperparameter optimization and meta-learning toolbox for machine-/deep-learning models.},
  howpublished = {\url{https://github.com/SimonBlanke}},
  year         = {since 2019}
}
License