
Scikit-learn model hyperparameter tuning using evolutionary algorithms


Sklearn-genetic-opt

Hyperparameter tuning for scikit-learn models using evolutionary algorithms.

This is meant to be an alternative to popular methods inside scikit-learn such as Grid Search and Randomized Search.
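To see why an alternative to exhaustive search can be attractive, the sketch below counts how many candidates a full grid would need for a search space similar to the one in the example further down. The grid values here are hypothetical and chosen only for illustration; they are not taken from the library.

```python
import math

# Hypothetical discrete grid, loosely mirroring the ranges used in the
# example in this README (values are illustrative assumptions).
grid = {
    "bootstrap": [True, False],
    "max_depth": list(range(2, 31)),
    "max_leaf_nodes": list(range(2, 36)),
    "n_estimators": list(range(100, 301, 50)),
}


def count_grid_candidates(grid):
    """Number of combinations an exhaustive grid search would evaluate."""
    return math.prod(len(values) for values in grid.values())


n_candidates = count_grid_candidates(grid)
cv_folds = 3
print("candidates:", n_candidates)
print("model fits with 3-fold cv:", n_candidates * cv_folds)
```

The count grows multiplicatively with every added hyperparameter, which is the cost that sampling-based or evolutionary searches try to avoid by evaluating only a subset of candidates.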

Sklearn-genetic-opt uses evolutionary algorithms from the deap package to find the set of hyperparameters that optimizes (maximizes or minimizes) the cross-validation score. It can be used for both regression and classification problems.
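The general idea of an evolutionary search can be sketched in plain Python. This is a simplified generational loop with tournament selection, uniform crossover, mutation and elitism, written for illustration only; it is not the deap algorithm the library uses, and `evolve`, `tournament_select` and `toy_score` are hypothetical names. The toy score stands in for a cross-validation score.

```python
import random


def tournament_select(population, fitnesses, tournament_size, rng):
    """Return the fittest of `tournament_size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), tournament_size)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]


def evolve(fitness, bounds, population_size=10, generations=25,
           tournament_size=3, crossover_probability=0.8,
           mutation_probability=0.1, seed=0):
    """Maximize `fitness` over real-valued genes constrained by `bounds`."""
    rng = random.Random(seed)
    # Each individual is a list of floats, one per "hyperparameter".
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(population_size)]
    best_history = []
    for _ in range(generations):
        fitnesses = [fitness(ind) for ind in population]
        # Elitism: carry the current best individual over unchanged.
        elite_index = max(range(population_size), key=lambda i: fitnesses[i])
        elite = population[elite_index]
        best_history.append(fitnesses[elite_index])
        offspring = [list(elite)]
        while len(offspring) < population_size:
            parent_a = tournament_select(population, fitnesses, tournament_size, rng)
            parent_b = tournament_select(population, fitnesses, tournament_size, rng)
            child = list(parent_a)
            if rng.random() < crossover_probability:
                # Uniform crossover: take each gene from either parent.
                child = [a if rng.random() < 0.5 else b
                         for a, b in zip(parent_a, parent_b)]
            # Mutation: occasionally resample a gene within its bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_probability else gene
                     for gene, (lo, hi) in zip(child, bounds)]
            offspring.append(child)
        population = offspring
    fitnesses = [fitness(ind) for ind in population]
    best_index = max(range(population_size), key=lambda i: fitnesses[i])
    return population[best_index], best_history


def toy_score(individual):
    """Toy objective with its maximum (0) at x=1, y=-2."""
    x, y = individual
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best, history = evolve(toy_score, bounds)
print("best individual:", best)
print("best score per generation:", history)
```

Because the best individual is copied into the next generation unchanged, the best score recorded per generation never decreases in this sketch.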

Usage:

Install sklearn-genetic-opt

It's advised to install sklearn-genetic-opt inside a virtual environment. With the environment activated, run:

pip install sklearn-genetic-opt
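As a quick sanity check after installing, the following sketch tests whether the `sklearn_genetic` module (the import name used in the example in this README) can be found in the active environment. The helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("sklearn_genetic"):
    print("sklearn_genetic is available in this environment")
else:
    print("sklearn_genetic was not found; check that the virtual env is active")
```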

Example

from sklearn_genetic import GASearchCV
from sklearn_genetic.utils import plot_fitness_evolution
from sklearn_genetic.space import Continuous, Categorical, Integer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt


data = load_digits() 
n_samples = len(data.images)
X = data.images.reshape((n_samples, -1))
y = data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

clf = RandomForestClassifier()

param_grid = {'min_weight_fraction_leaf': Continuous(0.01, 0.5, distribution='log-uniform'),
              'bootstrap': Categorical([True, False]),
              'max_depth': Integer(2, 30), 
              'max_leaf_nodes': Integer(2, 35), 
              'n_estimators': Integer(100, 300)}

evolved_estimator = GASearchCV(estimator=clf,
                               cv=3,
                               scoring='accuracy',
                               population_size=10,
                               generations=25,
                               tournament_size=3,
                               elitism=True,
                               crossover_probability=0.8,
                               mutation_probability=0.1,
                               param_grid=param_grid,
                               criteria='max',
                               algorithm='eaMuPlusLambda',
                               n_jobs=-1,
                               verbose=True,
                               keep_top_k=4)

# Train and optimize the estimator
evolved_estimator.fit(X_train, y_train)
# Best parameters found
print(evolved_estimator.best_params)
# Use the model fitted with the best parameters
y_predict_ga = evolved_estimator.predict(X_test)
print(accuracy_score(y_test, y_predict_ga))

# See the evolution of the optimization per generation
plot_fitness_evolution(evolved_estimator)
plt.show()

# Saved metadata for further analysis
print("Stats achieved in each generation: ", evolved_estimator.history)
print("Parameters and cv scores in each iteration: ", evolved_estimator.logbook)
print("Best k solutions: ", evolved_estimator.hof)
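The example above samples `min_weight_fraction_leaf` with `distribution='log-uniform'`. The sketch below shows the usual definition of a log-uniform draw (uniform in the logarithm of the value), using only the standard library. It is an illustration under that assumption; the library's own sampler may be implemented differently, and `sample_log_uniform` is a hypothetical name.

```python
import math
import random


def sample_log_uniform(lower, upper, rng):
    """Draw a value whose logarithm is uniform on [log(lower), log(upper)]."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


rng = random.Random(42)
samples = [sample_log_uniform(0.01, 0.5, rng) for _ in range(10_000)]

# Under a log-uniform distribution, about half of the draws fall below
# the geometric mean of the bounds, so small values are explored as
# thoroughly as large ones.
geometric_mean = math.sqrt(0.01 * 0.5)
share_below = sum(s < geometric_mean for s in samples) / len(samples)
print("min:", min(samples), "max:", max(samples))
print("share below geometric mean:", share_below)
```

A plain uniform draw on the same interval would place only a small share of samples near the lower bound, which is why a log scale is a common choice for parameters spanning more than one order of magnitude.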
