No project description provided
Project description
Simple and reliable optimization with local, global, populationbased and sequential techniques in numerical discrete search spaces.
Master status:  
Code quality:  
Latest versions: 
Introduction
GradientFreeOptimizers provides a collection of easy to use optimization techniques, whose objective function only requires an arbitrary score that gets maximized. This makes gradientfree methods capable of solving various optimization problems, including:
 Optimizing arbitrary mathematical functions.
 Fitting multiple gaussdistributions to data.
 Hyperparameteroptimization of machinelearning methods.
GradientFreeOptimizers is the optimization backend of Hyperactive (in v3.0.0 and higher) but it can also be used by itself as a leaner and simpler optimization toolkit.
Main features

Easy to use:
Simple APIdesign
You can optimize anything that can be defined in a python function. For example a simple parabola function:
def objective_function(para): score = para["x1"] * para["x1"] return score
Define where to search via numpy ranges:
search_space = { "x": np.arange(0, 5, 0.1), }
That`s all the information the algorithm needs to search for the maximum in the objective function:
from gradient_free_optimizers import RandomSearchOptimizer opt = RandomSearchOptimizer(search_space) opt.search(objective_function, n_iter=100000)
Receive prepared information about ongoing and finished optimization runs
During the optimization you will receive ongoing information in a progress bar:
 current best score
 the position in the search space of the current best score
 the iteration when the current best score was found
 other information about the progress native to tqdm

High performance:
Modern optimization techniques
GradientFreeOptimizers provides not just metaheuristic optimization methods but also sequential model based optimizers like bayesian optimization, which delivers good results for expensive objetive functions like deeplearning models.
Lightweight backend
Even for the very simple parabola function the optimization time is about 60% of the entire iteration time when optimizing with random search. This shows, that (despite all its features) GradientFreeOptimizers has an efficient optimization backend without any unnecessary slowdown.
Save time with memory dictionary
Per default GradientFreeOptimizers will look for the current position in a memory dictionary before evaluating the objective function.

If the position is not in the dictionary the objective function will be evaluated and the position and score is saved in the dictionary.

If a position is already saved in the dictionary GradientFreeOptimizers will just extract the score from it instead of evaluating the objective function. This avoids reevaluating computationally expensive objective functions (machine or deeplearning) and therefore saves time.


High reliability:
Extensive testing
GradientFreeOptimizers is extensivly tested with more than 400 tests in 2500 lines of test code. This includes the testing of:
 Each optimization algorithm
 Each optimization parameter
 All attributes that are part of the public api
Performance test for each optimizer
Each optimization algorithm must perform above a certain threshold to be included. Poorly performing algorithms are reworked or scraped.
Optimization strategies:
GradientFreeOptimizers supports a variety of optimization algorithms, which can make choosing the right algorithm a tedious endeavor. The gifs in this section give a visual representation how the different optimization algorithms explore the search space and exploit the collected information about the search space for a convex and nonconvex objective function.
Optimization Strategy  Convex Function  Nonconvex Function 

<ins>Hill Climbing</ins> Evaluates the score of n neighbours in an epsilon environment and moves to the best one. 

<ins>Repulsing Hill Climbing</ins> Hill climbing iteration + increases epsilon by a factor if no better neighbour was found. 

<ins>Simulated Annealing</ins> Hill climbing iteration + accepts moving to worse positions with decreasing probability over time (transition probability). 

<ins>Random Search</ins> Moves to random positions in each iteration. 

<ins>Random Restart Hill Climbing</ins> Hill climbing + moves to a random position after n iterations. 

<ins>Random Annealing</ins> Hill Climbing + large epsilon that decreases over time. 

<ins>Parallel Tempering</ins> Population of n simulated annealers, which occasionally swap transition probabilities. 

<ins>Particle Swarm Optimization</ins> Population of n particles attracting each other and moving towards the best particle. 

<ins>Evolution Strategy</ins> Population of n hill climbers occasionally mixing positional information. 

<ins>Bayesian Optimization</ins> Gaussian process fitting to explored positions and predicting promising new positions. 

<ins>Tree of Parzen Estimators</ins> Kernel density estimators fitting to good and bad explored positions and predicting promising new positions. 

<ins>Decision Tree Optimizer</ins> Ensemble of decision trees fitting to explored positions and predicting promising new positions. 
Installation
The most recent version of GradientFreeOptimizers is available on PyPi:
pip install gradientfreeoptimizers
Examples
Convex function
import numpy as np from gradient_free_optimizers import RandomSearchOptimizer def parabola_function(para): loss = para["x"] * para["x"] return loss search_space = {"x": np.arange(10, 10, 0.1)} opt = RandomSearchOptimizer(search_space) opt.search(parabola_function, n_iter=100000)
Nonconvex function
import numpy as np from gradient_free_optimizers import RandomSearchOptimizer def ackley_function(pos_new): x = pos_new["x1"] y = pos_new["x2"] a1 = 20 * np.exp(0.2 * np.sqrt(0.5 * (x * x + y * y))) a2 = np.exp(0.5 * (np.cos(2 * np.pi * x) + np.cos(2 * np.pi * y))) score = a1 + a2 + 20 return score search_space = { "x1": np.arange(100, 101, 0.1), "x2": np.arange(100, 101, 0.1), } opt = RandomSearchOptimizer(search_space) opt.search(ackley_function, n_iter=30000)
Machine learning example
import numpy as np from sklearn.model_selection import cross_val_score from sklearn.ensemble import GradientBoostingClassifier from sklearn.datasets import load_wine from gradient_free_optimizers import HillClimbingOptimizer data = load_wine() X, y = data.data, data.target def model(para): gbc = GradientBoostingClassifier( n_estimators=para["n_estimators"], max_depth=para["max_depth"], min_samples_split=para["min_samples_split"], min_samples_leaf=para["min_samples_leaf"], ) scores = cross_val_score(gbc, X, y, cv=3) return scores.mean() search_space = { "n_estimators": np.arange(20, 120, 1), "max_depth": np.arange(2, 12, 1), "min_samples_split": np.arange(2, 12, 1), "min_samples_leaf": np.arange(1, 12, 1), } opt = HillClimbingOptimizer(search_space) opt.search(model, n_iter=50)
Basic API reference
General optimization arguments
The following arguments can be passed to each optimization class:

search_space

Pass the search_space to the optimizer class to define the space were the optimization algorithm can search for the best parameters for the given objective function.
example:
... search_space = { "x1": numpy.arange(10, 31, 0.3), "x2": numpy.arange(10, 31, 0.3), } opt = HillClimbingOptimizer(search_space) ...


initialize={"grid": 8, "vertices": 8, "random": 4, "warm_start": []}

(dict, None)

The initialization dictionary automatically determines a number of parameters that will be evaluated in the first n iterations (n is the sum of the values in initialize). The initialize keywords are the following:

grid
 Initializes positions in a grid like pattern. Positions that cannot be put into a grid are randomly positioned.

vertices
 Initializes positions at the vertices of the search space. Positions that cannot be put into a vertices are randomly positioned.

random
 Number of random initialized positions

warm_start
 List of parameter dictionaries that marks additional start points for the optimization run.



random_state=None
 (int, None)
 Random state for random processes in the random, numpy and scipy module.
Optimizer Classes
Each optimization class needs the "search_space" as an input argument. Optionally "initialize" and optimizerspecific parameters can be passed as well. You can read more about each optimizationstrategy and its parameters in the Optimization Tutorial.
 HillClimbingOptimizer
 RepulsingHillClimbingOptimizer
 SimulatedAnnealingOptimizer
 DownhillSimplexOptimizer
 RandomSearchOptimizer
 GridSearchOptimizer
 RandomRestartHillClimbingOptimizer
 RandomAnnealingOptimizer
 PowellsMethod
 PatternSearch
 ParallelTemperingOptimizer
 ParticleSwarmOptimizer
 EvolutionStrategyOptimizer
 BayesianOptimizer
 TreeStructuredParzenEstimators
 ForestOptimizer
.search(...)

objective_function

(callable)

The objective function defines the optimization problem. The optimization algorithm will try to maximize the numerical value that is returned by the objective function by trying out different parameters from the search space.
example:
def objective_function(para): score = (para["x1"] * para["x1"] + para["x2"] * para["x2"]) return score


n_iter

(int)

The number of iterations that will be performed during the optimiation run. The entire iteration consists of the optimizationstep, which decides the next parameter that will be evaluated and the evaluationstep, which will run the objective function with the chosen parameter and return the score.


max_time=None
 (float, None)
 Maximum number of seconds until the optimization stops. The time will be checked after each completed iteration.

max_score=None
 (float, None)
 Maximum score until the optimization stops. The score will be checked after each completed iteration.

early_stopping=None
 (dict, None)
 Stops the optimization run early if it did not achive any scoreimprovement within the last iterations. The early_stoppingparameter enables to set three parameters:
n_iter_no_change
: Nonoptional intparameter. This marks the last n iterations to look for an improvement over the iterations that came before n. If the best score of the entire run is within those last n iterations the run will continue (until other stopping criteria are met), otherwise the run will stop.tol_abs
: Optional floatparamter. The score must have improved at least this absolute tolerance in the last n iterations over the best score in the iterations before n. This is an absolute value, so 0.1 means an imporvement of 0.8 > 0.9 is acceptable but 0.81 > 0.9 would stop the run.tol_rel
: Optional floatparamter. The score must have imporved at least this relative tolerance (in percentage) in the last n iterations over the best score in the iterations before n. This is a relative value, so 10 means an imporvement of 0.8 > 0.88 is acceptable but 0.8 > 0.87 would stop the run.

memory=True
 (bool)
 Whether or not to use the "memory"feature. The memory is a dictionary, which gets filled with parameters and scores during the optimization run. If the optimizer encounters a parameter that is already in the dictionary it just extracts the score instead of reevaluating the objective function (which can take a long time).

memory_warm_start=None

(pandas dataframe, None)

Pandas dataframe that contains score and paramter information that will be automatically loaded into the memorydictionary.
example:
score x1 x2 x... 0.756 0.1 0.2 ... 0.823 0.3 0.1 ... ... ... ... ... ... ... ... ...


verbosity=[ "progress_bar", "print_results", "print_times" ]
 (list, False)
 The verbosity list determines what part of the optimization information will be printed in the command line.
Results from attributes

.search_data

Dataframe, that contains information about the score, the value of each parameter and the evaluation and iteration time. Each row shows the information of one optimization iteration.
example:
score x1 x2 x... eval_time iter_time 0.756 0.1 0.2 ... 0.000016 0.0.000034 0.823 0.3 0.1 ... 0.000017 0.0.000032 ... ... ... ... ... ... ... ... ... ... ... ...


.best_score
 numerical value of the best score, that was found during the optimization run.

.best_para

parameter dictionary of the best score, that was found during the optimization run.
example:
{ 'x1': 0.2, 'x2': 0.3, }

Roadmap
v0.3.0 :heavy_check_mark:
 add sampling parameter to Bayesian optimizer
 add warnings parameter to Bayesian optimizer
 improve access to parameters of optimizers within populationbasedoptimizers (e.g. annealing rate of simulated annealing population in parallel tempering)
v0.4.0 :heavy_check_mark:
 add early stopping parameter
v0.5.0 :heavy_check_mark:
 add gridsearch to optimizers
 impoved performance testing for optimizers
v1.0.0
 Finalize API (1.0.0)
 add Downhillsimplex algorithm to optimizers
 add Pattern search to optimizers
 add Powell's method to optimizers
 add parallel random annealing to optimizers
 add ensembleoptimizer to optimizers
Upcoming Features
 add Bayesian 1D Bayesianoptimization to optimizers
 add other acquisition functions to smbo (Probability of improvement, Entropy search, ...)
 Improved sampling for smboptimizers
Gradient Free Optimizers <=> Hyperactive
GradientFreeOptimizers was created as the optimization backend of the Hyperactive package. Therefore the algorithms are exactly the same in both packages and deliver the same results. However you can still use GradientFreeOptimizers as a standalone package. The separation of GradientFreeOptimizers from Hyperactive enables multiple advantages:
 Even easier to use than Hyperactive
 Separate and more thorough testing
 Other developers can easily use GFOs as an optimizaton backend if desired
 Better isolation from the complex information flow in Hyperactive. GFOs only uses positions and scores in a Ndimensional searchspace. It returns only the new position after each iteration.
 a smaller and cleaner code base, if you want to explore my implementation of these optimization techniques.
While GradientFreeOptimizers is relatively simple, Hyperactive is a more complex project with additional features. The differences between GradientFreeOptimizers and Hyperactive are listed in the following table:
GradientFreeOptimizers  Hyperactive  

Search space composition  only numerical  numbers, strings and functions 
Parallel Computing  not supported  yes, via multiprocessing or joblib 
Distributed computing  not supported  yes, via data sharing at runtime 
Visualization  not supported  yes, via a streamlitdashboard 
Citation
@Misc{gfo2020,
author = {{Simon Blanke}},
title = {{GradientFreeOptimizers}: Simple and reliable optimization with local, global, populationbased and sequential techniques in numerical search spaces.},
howpublished = {\url{https://github.com/SimonBlanke}},
year = {since 2020}
}
License
GradientFreeOptimizers is licensed under the following License:
Project details
Release history Release notifications  RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Filename, size  File type  Python version  Upload date  Hashes 

Filename, size gradient_free_optimizers1.0.6py3noneany.whl (90.8 kB)  File type Wheel  Python version py3  Upload date  Hashes View 
Hashes for gradient_free_optimizers1.0.6py3noneany.whl
Algorithm  Hash digest  

SHA256  7901cf73605144524974abfddec622ddfd3432347cf71bdd42acb93d24adff26 

MD5  9546292762e1620d6f682c09b1fb4a4c 

BLAKE2256  1a23daf829d00e643672ee8e0371826b3c4598c240ddab82bdcfc78364770752 