

Sklearn-genetic-opt

Hyperparameter tuning for scikit-learn models using evolutionary algorithms.

This is meant to be an alternative to popular scikit-learn methods such as grid search (GridSearchCV) and randomized search (RandomizedSearchCV).

Sklearn-genetic-opt uses evolutionary algorithms from the deap package to find the "best" set of hyperparameters, meaning the set that maximizes or minimizes the cross-validation score. It can be used for both regression and classification problems.
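To make the idea concrete, here is a minimal sketch of an evolutionary search written with the standard library only. It is not how sklearn-genetic-opt or deap implement the search. The `fitness` function is a toy stand-in for a mean cross-validation score, and every helper name (`random_individual`, `tournament`, `crossover`, `mutate`, `evolve`) is hypothetical. The default arguments simply reuse the values from the example further down.

```python
import random


def fitness(params):
    """Toy stand-in for a mean cross-validation score (higher is better)."""
    # The peak at max_depth=8, min_weight_fraction_leaf=0.1 is an arbitrary choice.
    depth_term = (params["max_depth"] - 8) ** 2 / 100
    leaf_term = (params["min_weight_fraction_leaf"] - 0.1) ** 2
    return -(depth_term + leaf_term)


def random_individual(rng):
    """Draw one candidate set of hyperparameters within fixed bounds."""
    return {
        "max_depth": rng.randint(2, 20),
        "min_weight_fraction_leaf": rng.uniform(0.0, 0.5),
    }


def tournament(population, scores, k, rng):
    """Pick the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one of the two parents."""
    return {key: (a[key] if rng.random() < 0.5 else b[key]) for key in a}


def mutate(individual, rng):
    """Resample one randomly chosen gene within its bounds."""
    child = dict(individual)
    key = rng.choice(sorted(child))
    child[key] = random_individual(rng)[key]
    return child


def evolve(population_size=16, generations=30, tournament_size=3,
           crossover_probability=0.8, mutation_probability=0.1, seed=42):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    best_per_generation = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        best = population[max(range(population_size), key=lambda i: scores[i])]
        best_per_generation.append(fitness(best))
        # Elitism: the best individual survives unchanged.
        next_population = [dict(best)]
        while len(next_population) < population_size:
            parent_a = tournament(population, scores, tournament_size, rng)
            parent_b = tournament(population, scores, tournament_size, rng)
            child = dict(parent_a)
            if rng.random() < crossover_probability:
                child = crossover(parent_a, parent_b, rng)
            if rng.random() < mutation_probability:
                child = mutate(child, rng)
            next_population.append(child)
        population = next_population
    return max(population, key=fitness), best_per_generation


best_params, history = evolve()
print(best_params)
```

Because the best individual is always copied into the next generation, the best score recorded per generation never decreases in this sketch. In a real search, `fitness` would be replaced by the cross-validated score of an estimator built from `params`.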

Usage:

Install sklearn-genetic-opt

It's advised to install sklearn-genetic-opt in a virtual environment. Inside the environment, run:

pip install sklearn-genetic-opt

Example

from sklearn_genetic import GASearchCV
from sklearn_genetic.utils import plot_fitness_evolution
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt


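# Load the digits dataset: X holds the pixel features, y the digit labels
data = load_digits()
y = data['target']
X = data['data']

# Hold out a third of the samples for the final evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)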

clf = DecisionTreeClassifier()

evolved_estimator = GASearchCV(estimator=clf,
                               cv=3,
                               scoring='accuracy',
                               population_size=16,
                               generations=30,
                               tournament_size=3,
                               elitism=True,
                               crossover_probability=0.8,
                               mutation_probability=0.1,
                               continuous_parameters={'min_weight_fraction_leaf': (0, 0.5)},
                               categorical_parameters={'criterion': ['gini', 'entropy']},
                               integer_parameters={'max_depth': (2, 20), 'max_leaf_nodes': (2, 30)},
                               criteria='max',
                               algorithm='eaMuPlusLambda',
                               n_jobs=-1,
                               verbose=True,
                               keep_top_k=4)

# Train and optimize the estimator
evolved_estimator.fit(X_train, y_train)
# Best parameters found
print(evolved_estimator.best_params)
# Use the model fitted with the best parameters
y_predict_ga = evolved_estimator.predict(X_test)
print(accuracy_score(y_test, y_predict_ga))

# See the evolution of the optimization per generation
plot_fitness_evolution(evolved_estimator)
plt.show()

# Saved metadata for further analysis
print("Stats achieved in each generation: ", evolved_estimator.history)
print("Parameters and cv scores in each iteration: ", evolved_estimator.logbook)
print("Best k solutions: ", evolved_estimator.hof)
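The example above describes the search space with three dictionaries: continuous, categorical and integer parameters. As a rough illustration of what such a mixed space means, the sketch below draws one candidate from the same dictionaries using only the standard library. `sample_candidate` is a hypothetical helper, not part of sklearn-genetic-opt, and treating both bounds as inclusive is an assumption of this sketch.

```python
import random

# Same search space as in the example above.
continuous_parameters = {"min_weight_fraction_leaf": (0, 0.5)}
categorical_parameters = {"criterion": ["gini", "entropy"]}
integer_parameters = {"max_depth": (2, 20), "max_leaf_nodes": (2, 30)}


def sample_candidate(rng):
    """Draw one hyperparameter set from the three dictionaries (hypothetical helper)."""
    candidate = {}
    for name, (low, high) in continuous_parameters.items():
        candidate[name] = rng.uniform(low, high)
    for name, choices in categorical_parameters.items():
        candidate[name] = rng.choice(choices)
    for name, (low, high) in integer_parameters.items():
        # Assumption: both bounds are inclusive in this sketch.
        candidate[name] = rng.randint(low, high)
    return candidate


rng = random.Random(0)
candidate = sample_candidate(rng)
print(candidate)
```

A dictionary like `candidate` could be passed to `DecisionTreeClassifier(**candidate)` to build one estimator for evaluation.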
