Hyperactive


A hyperparameter optimization toolbox for convenient and fast prototyping



Overview | Performance | Installation | Examples | Advanced Features | Hyperactive API | License



Overview

  • Optimize hyperparameters of machine-learning or deep-learning models using a simple API.
  • Choose from a variety of different optimization techniques to improve your model.
  • Never lose the progress of previous optimizations: just pass one or more models as start points and continue optimizing.
  • Use transfer learning during the optimization process to build a more accurate model while saving training and optimization time.
  • Utilize multiprocessing for machine-learning models or your GPU for deep-learning models.
Optimization Techniques:

Local Search:
  • Hill Climbing
  • Stochastic Hill Climbing
  • Tabu Search
Random Methods:
  • Random Search
  • Random Restart Hill Climbing
  • Random Annealing
Markov Chain Monte Carlo:
  • Simulated Annealing
  • Stochastic Tunneling
  • Parallel Tempering
Population Methods:
  • Particle Swarm Optimizer
  • Evolution Strategy
Sequential Methods:
  • Bayesian Optimization

Supported Packages:

Machine Learning:
  • Scikit-learn
  • XGBoost
Deep Learning:
  • Keras
Distribution:
  • Multiprocessing

Performance

The bar chart below shows that the optimization process itself accounts for only a small fraction (<0.6%) of the computation time. The 'No Opt' bar shows the training time of a default gradient-boosting classifier, normalized to 1. The other bars show the computation time relative to 'No Opt'. Each optimizer performed 30 runs of 300 iterations to obtain good statistics.


Installation

Hyperactive is developed and tested in Python 3.

Hyperactive is available on PyPI:

pip install hyperactive

Examples

Basic sklearn example:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from hyperactive import SimulatedAnnealingOptimizer

iris_data = load_iris()
X = iris_data.data
y = iris_data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# this defines the model and hyperparameter search space
search_config = {
    "sklearn.ensemble.RandomForestClassifier": {
        "n_estimators": range(10, 100, 10),
        "max_depth": [3, 4, 5, 6],
        "criterion": ["gini", "entropy"],
        "min_samples_split": range(2, 21),
        "min_samples_leaf": range(2, 21),
    }
}

Optimizer = SimulatedAnnealingOptimizer(search_config, n_iter=100, n_jobs=4)

# search for the best hyperparameters on the given data
Optimizer.fit(X_train, y_train)

# predict from test data
prediction = Optimizer.predict(X_test)

# calculate accuracy score
score = Optimizer.score(X_test, y_test)

Example with a multi-layer perceptron in Keras:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from hyperactive import ParticleSwarmOptimizer

breast_cancer_data = load_breast_cancer()

X = breast_cancer_data.data
y = breast_cancer_data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# this defines the structure of the model and the search space in each layer
search_config = {
    "keras.compile.0": {"loss": ["binary_crossentropy"], "optimizer": ["adam"]},
    "keras.fit.0": {"epochs": [3], "batch_size": [100], "verbose": [0]},
    "keras.layers.Dense.1": {
        "units": range(5, 15),
        "activation": ["relu"],
        "kernel_initializer": ["uniform"],
    },
    "keras.layers.Dense.2": {
        "units": range(5, 15),
        "activation": ["relu"],
        "kernel_initializer": ["uniform"],
    },
    "keras.layers.Dense.3": {"units": [1], "activation": ["sigmoid"]},
}

Optimizer = ParticleSwarmOptimizer(
    search_config, n_iter=3, metric=["mean_absolute_error"], verbosity=0
)
# search for the best hyperparameters on the given data
Optimizer.fit(X_train, y_train)

Example with a convolutional neural network in Keras:

import numpy as np

from keras.datasets import mnist
from keras.utils import to_categorical

from hyperactive import RandomSearchOptimizer

(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)

y_train = to_categorical(y_train)
y_test = to_categorical(y_test)


# this defines the structure of the model and the search space in each layer
search_config = {
    "keras.compile.0": {"loss": ["categorical_crossentropy"], "optimizer": ["adam"]},
    "keras.fit.0": {"epochs": [20], "batch_size": [500], "verbose": [2]},
    "keras.layers.Conv2D.1": {
        "filters": [32, 64, 128],
        "kernel_size": range(3, 4),
        "activation": ["relu"],
        "input_shape": [(28, 28, 1)],
    },
    "keras.layers.MaxPooling2D.2": {"pool_size": [(2, 2)]},
    "keras.layers.Conv2D.3": {
        "filters": [16, 32, 64],
        "kernel_size": [3],
        "activation": ["relu"],
    },
    "keras.layers.MaxPooling2D.4": {"pool_size": [(2, 2)]},
    "keras.layers.Flatten.5": {},
    "keras.layers.Dense.6": {"units": range(30, 200, 10), "activation": ["softmax"]},
    "keras.layers.Dropout.7": {"rate": list(np.arange(0.4, 0.8, 0.1))},
    "keras.layers.Dense.8": {"units": [10], "activation": ["softmax"]},
}

Optimizer = RandomSearchOptimizer(search_config, n_iter=20)

# search for the best hyperparameters on the given data
Optimizer.fit(X_train, y_train)

# predict from test data
prediction = Optimizer.predict(X_test)

# calculate accuracy score
score = Optimizer.score(X_test, y_test)


Advanced Features

The features listed below can be activated during the instantiation of the optimizer (see the API) and work with every optimizer in the hyperactive package.

Memory

After the evaluation of a model, its position (in the hyperparameter search dictionary) and the cross-validation score are written to a dictionary. If the optimizer tries to evaluate this position again, it can quickly look up whether a score for this position is already present and use it instead of going through the expensive training and prediction process again.
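
A minimal sketch of this idea, with illustrative names (not Hyperactive's internal code):

memory = {}

def evaluate(position, train_and_score):
    # 'position' is a dict of scalar hyperparameter values; 'train_and_score' is
    # the expensive routine that fits the model and returns a cross-validation score
    key = frozenset(position.items())
    if key in memory:
        return memory[key]            # position already evaluated: reuse the score
    score = train_and_score(position)
    memory[key] = score
    return score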

Scatter-Initialization

This technique was inspired by Hyperband optimization and aims to find a good initial position for the optimization. It does so by evaluating n random positions with a training subset of 1/n the size of the original dataset. The position that achieves the best score is used as the starting position for the optimization.
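
A minimal sketch of the procedure, with illustrative names (not Hyperactive's internal code):

import numpy as np

def scatter_init(sample_position, score_on_subset, X_train, y_train, n=4):
    # score n random positions, each on a random subset of size len(X_train) / n,
    # and return the position with the best score as the start point
    subset_size = len(X_train) // n
    best_position, best_score = None, -np.inf
    for _ in range(n):
        position = sample_position()        # random point in the search space
        idx = np.random.choice(len(X_train), subset_size, replace=False)
        score = score_on_subset(position, X_train[idx], y_train[idx])
        if score > best_score:
            best_position, best_score = position, score
    return best_position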

Multiprocessing

Multiprocessing in hyperactive works by creating additional searches that run in parallel without any shared memory. This makes it possible to tune the hyperparameters of different models at the same time. If a single model should be tuned as fast as possible, set n_jobs of the optimizer to 1 and set n_jobs of the model (in the search_config) to -1.

Two searches with eight CPU cores:

import numpy as np

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from hyperactive import RandomSearchOptimizer

iris_data = load_iris()
X = iris_data.data
y = iris_data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# this defines the model and hyperparameter search space
search_config = {
    "sklearn.ensemble.RandomForestClassifier": {
        "n_estimators": range(10, 100, 10),
        "max_depth": [3, 4, 5, 6],
        "criterion": ["gini", "entropy"],
        "n_jobs": [4],
    },
    "sklearn.ensemble.GradientBoostingClassifier": {
        "n_estimators": range(10, 100, 10),
        "max_depth": range(1, 11),
        "min_samples_split": range(2, 21),
        "n_jobs": [4],
    },
}

Optimizer = RandomSearchOptimizer(search_config, n_iter=300, n_jobs=2, verbosity=0)

# search for the best hyperparameters on the given data
Optimizer.fit(X_train, y_train)

One search with all CPU cores:

import numpy as np

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from hyperactive import RandomSearchOptimizer

iris_data = load_iris()
X = iris_data.data
y = iris_data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# this defines the model and hyperparameter search space
search_config = {
    "sklearn.ensemble.RandomForestClassifier": {
        "n_estimators": range(10, 100, 10),
        "max_depth": [3, 4, 5, 6],
        "criterion": ["gini", "entropy"],
        "n_jobs": [-1],
    },
}

Optimizer = RandomSearchOptimizer(search_config, n_iter=300, n_jobs=1, verbosity=0)

# search for the best hyperparameters on the given data
Optimizer.fit(X_train, y_train)

Multiple searches with all CPU cores:

import numpy as np

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from hyperactive import RandomSearchOptimizer

iris_data = load_iris()
X = iris_data.data
y = iris_data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# this defines the model and hyperparameter search space
search_config = {
    "sklearn.ensemble.RandomForestClassifier": {
        "n_estimators": range(10, 100, 10),
        "max_depth": [3, 4, 5, 6],
        "criterion": ["gini", "entropy"],
        "min_samples_split": range(2, 21),
        "min_samples_leaf": range(2, 21),
    },
    "sklearn.neighbors.KNeighborsClassifier": {
        "n_neighbors": range(1, 10),
        "weights": ["uniform", "distance"],
        "p": [1, 2],
    },
    "sklearn.ensemble.GradientBoostingClassifier": {
        "n_estimators": range(10, 100, 10),
        "learning_rate": [1e-3, 1e-2, 1e-1, 0.5, 1.0],
        "max_depth": range(1, 11),
        "min_samples_split": range(2, 21),
        "min_samples_leaf": range(1, 21),
        "subsample": np.arange(0.05, 1.01, 0.05),
        "max_features": np.arange(0.05, 1.01, 0.05),
    },
    "sklearn.tree.DecisionTreeClassifier": {
        "criterion": ["gini", "entropy"],
        "max_depth": range(1, 11),
        "min_samples_split": range(2, 21),
        "min_samples_leaf": range(1, 21),
    },
}

Optimizer = RandomSearchOptimizer(search_config, n_iter=300, n_jobs=-1, verbosity=0)

# search for the best hyperparameters on the given data
Optimizer.fit(X_train, y_train)

Transfer-Learning

In the current implementation, transfer learning works by using a predefined model (with optional pretrained weights) provided by the Keras package. The import path can be inserted as a layer (with its parameters in a sub-dictionary), just like in a regular search dictionary. The following snippet provides an example:

Transfer-learning example:

from keras.datasets import cifar10
from keras.utils import to_categorical

from hyperactive import SimulatedAnnealingOptimizer

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

y_train = to_categorical(y_train)
y_test = to_categorical(y_test)


# this defines the structure of the model and the search space in each layer
search_config = {
    "keras.compile.0": {"loss": ["binary_crossentropy"], "optimizer": ["adam"]},
    "keras.fit.0": {"epochs": [1], "batch_size": [300], "verbose": [0]},
    # just add the pretrained model as a layer like this:
    "keras.applications.MobileNet.1": {
        "weights": ["imagenet"],
        "input_shape": [(32, 32, 3)],
        "include_top": [False],
    },
    "keras.layers.Flatten.2": {},
    "keras.layers.Dense.3": {
        "units": range(5, 15),
        "activation": ["relu"],
        "kernel_initializer": ["uniform"],
    },
    "keras.layers.Dense.4": {"units": [10], "activation": ["sigmoid"]},
}


Optimizer = SimulatedAnnealingOptimizer(
    search_config, n_iter=3, warm_start=False, verbosity=0
)

Warm-Start

When a search is finished, the warm-start dictionary for the best position in the hyperparameter search space (and its metric) is printed to the command line (at verbosity=1). If multiple searches ran in parallel, the warm-start dictionaries are sorted by their best metric in decreasing order. If the start position in the warm-start dictionary is not within the search space defined in the search_config, an error will occur.

Warm-start example for an sklearn model:

start_point = {
    "sklearn.ensemble.RandomForestClassifier.0": {
        "n_estimators": [30],
        "max_depth": [6],
        "criterion": ["entropy"],
        "min_samples_split": [12],
        "min_samples_leaf": [16],
    },
    "sklearn.ensemble.RandomForestClassifier.1": {
        "n_estimators": [50],
        "max_depth": [3],
        "criterion": ["entropy"],
    },
}

Warm-start example for a Keras model (CNN):

start_point = {
    "keras.compile.0": {"loss": ["categorical_crossentropy"], "optimizer": ["adam"]},
    "keras.fit.0": {"epochs": [3], "batch_size": [500], "verbose": [0]},
    "keras.layers.Conv2D.1": {
        "filters": [64],
        "kernel_size": [3],
        "activation": ["relu"],
        "input_shape": [(28, 28, 1)],
    },
    "keras.layers.MaxPooling2D.2": {"pool_size": [(2, 2)]},
    "keras.layers.Conv2D.3": {
        "filters": [32],
        "kernel_size": [3],
        "activation": ["relu"],
        "input_shape": [(28, 28, 1)],
    },
    "keras.layers.MaxPooling2D.4": {"pool_size": [(2, 2)]},
    "keras.layers.Conv2D.5": {
        "filters": [32],
        "kernel_size": [3],
        "activation": ["relu"],
        "input_shape": [(28, 28, 1)],
    },
    "keras.layers.MaxPooling2D.6": {"pool_size": [(2, 2)]},
    "keras.layers.Flatten.7": {},
    "keras.layers.Dense.8": {"units": [50], "activation": ["softmax"]},
    "keras.layers.Dropout.9": {"rate": [0.4]},
"keras.layers.Dense.10": {"units": [10], "activation": ["softmax"]},
}
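
Assuming the warm_start keyword accepts such a dictionary (see the general keyword arguments below), the start point might be passed at instantiation like this (reusing the search_config and data from the examples above):

from hyperactive import RandomSearchOptimizer

# the search begins at the given start position instead of a random one
Optimizer = RandomSearchOptimizer(search_config, n_iter=100, warm_start=start_point)
Optimizer.fit(X_train, y_train)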


Hyperactive API

Classes:

HillClimbingOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, eps=1, r=1e-6)
StochasticHillClimbingOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False)
TabuOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, eps=1, tabu_memory=[3, 6, 9])

RandomSearchOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False)
RandomRestartHillClimbingOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, n_restarts=10)
RandomAnnealingOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, eps=100, t_rate=0.98)

SimulatedAnnealingOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, eps=1, t_rate=0.98)
StochasticTunnelingOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, eps=1, t_rate=0.98, n_neighbours=1, gamma=1)
ParallelTemperingOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, eps=1, t_rate=0.98, n_neighbours=1, system_temps=[0.1, 0.2, 0.01], n_swaps=10)

ParticleSwarmOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, n_part=4, w=0.5, c_k=0.5, c_s=0.9)
EvolutionStrategyOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False, individuals=10, mutation_rate=0.7, crossover_rate=0.3)

BayesianOptimizer(search_config, n_iter, metric="accuracy", n_jobs=1, cv=5, verbosity=1, random_state=None, warm_start=False, memory=True, scatter_init=False)

General positional arguments:

  • search_config (dict): hyperparameter search space to be explored by the optimizer
  • n_iter (int): number of iterations to perform

General keyword arguments:

  • metric (str, default "accuracy"): metric for model evaluation
  • n_jobs (int, default 1): number of jobs to run in parallel (-1 for maximum)
  • cv (int, default 5): number of cross-validation folds
  • verbosity (int, default 1): shows model and metric information
  • random_state (int, default None): seed for the random number generator
  • warm_start (dict, default None): hyperparameter configuration to start from
  • memory (bool, default True): stores explored evaluations in a dictionary to save computing time
  • scatter_init (int, default False): chooses a better initial position by training on multiple random positions with a smaller training dataset (split into int subsets)
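
For illustration, these keyword arguments might be combined like this (values are examples only; search_config is a search dictionary as in the examples above):

from hyperactive import RandomSearchOptimizer

Optimizer = RandomSearchOptimizer(
    search_config,
    n_iter=100,
    metric="f1_score",    # any metric from the 'Available Metrics' section below
    cv=3,
    n_jobs=2,
    random_state=1,
    memory=True,
    verbosity=0,
)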

Specific keyword arguments:

Hill Climbing

  • eps (int, default 1): epsilon

Stochastic Hill Climbing

  • eps (int, default 1): epsilon
  • r (float, default 1e-6): acceptance factor

Tabu Search

  • eps (int, default 1): epsilon
  • tabu_memory (list, default [3, 6, 9]): length of short/mid/long-term memory

Random Restart Hill Climbing

  • eps (int, default 1): epsilon
  • n_restarts (int, default 10): number of restarts

Random Annealing

  • eps (int, default 100): epsilon
  • t_rate (float, default 0.98): cooling rate

Simulated Annealing

  • eps (int, default 1): epsilon
  • t_rate (float, default 0.98): cooling rate

Stochastic Tunneling

  • eps (int, default 1): epsilon
  • t_rate (float, default 0.98): cooling rate
  • gamma (float, default 1): tunneling factor

Parallel Tempering

  • eps (int, default 1): epsilon
  • t_rate (float, default 0.98): cooling rate
  • system_temps (list, default [0.1, 0.2, 0.01]): initial temperatures (number of elements defines number of systems)
  • n_swaps (int, default 10): number of swaps

Particle Swarm Optimization

  • n_part (int, default 1): number of particles
  • w (float, default 0.5): inertia factor
  • c_k (float, default 0.8): cognitive factor
  • c_s (float, default 0.9): social factor

Evolution Strategy

  • individuals (int, default 10): number of individuals
  • mutation_rate (float, default 0.7): mutation rate
  • crossover_rate (float, default 0.3): crossover rate

Bayesian Optimization

  • kernel (class, default Matern): kernel used for the gaussian process
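
Assuming the kernel argument accepts a scikit-learn Gaussian-process kernel class (as the default Matern suggests), a different kernel might be passed like this (illustrative only):

from sklearn.gaussian_process.kernels import RBF

from hyperactive import BayesianOptimizer

# illustrative: hand a different kernel class to the surrogate gaussian process
Optimizer = BayesianOptimizer(search_config, n_iter=30, kernel=RBF)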

General methods:

fit(self, X_train, y_train)

  • X_train (array-like): training input features
  • y_train (array-like): training target

predict(self, X_test)

  • X_test (array-like): testing input features

score(self, X_test, y_test)

  • X_test (array-like): testing input features
  • y_test (array-like): true values

export(self, filename)

  • filename (str): file name and path for the model export
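
A typical call sequence (the export filename is illustrative):

Optimizer.fit(X_train, y_train)           # run the hyperparameter search
prediction = Optimizer.predict(X_test)    # predict with the best model found
score = Optimizer.score(X_test, y_test)   # evaluate the best model found
Optimizer.export("my_model")              # illustrative filename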

Available Metrics:

Scikit-learn

Machine Learning Scores:
  • accuracy_score
  • balanced_accuracy_score
  • average_precision_score
  • f1_score
  • recall_score
  • jaccard_score
  • roc_auc_score
  • explained_variance_score

Machine Learning Losses:
  • brier_score_loss
  • log_loss
  • max_error
  • mean_absolute_error
  • mean_squared_error
  • mean_squared_log_error
  • median_absolute_error

Keras

Deep Learning Scores:
  • accuracy
  • binary_accuracy
  • categorical_accuracy
  • sparse_categorical_accuracy
  • top_k_categorical_accuracy
  • sparse_top_k_categorical_accuracy

Deep Learning Losses:
  • mean_squared_error
  • mean_absolute_error
  • mean_absolute_percentage_error
  • mean_squared_logarithmic_error
  • squared_hinge
  • hinge
  • categorical_hinge
  • logcosh
  • categorical_crossentropy
  • sparse_categorical_crossentropy
  • binary_crossentropy
  • kullback_leibler_divergence
  • poisson
  • cosine_proximity

License

https://github.com/SimonBlanke/Hyperactive/blob/master/LICENSE
