Skperopt

A hyperopt wrapper - simplifying hyperparameter tuning with Scikit-learn style estimators.

Supports the classification evaluation metrics "f1", "auc" and "accuracy", and the regression metrics "rmse" and "mse".
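
For reference, here is a minimal pure-Python sketch of three of these metrics, assuming they follow their standard textbook definitions. How Skperopt computes them internally is not shown here, and the function names are illustrative only.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def f1(y_true, y_pred, positive=1):
    """Binary F1 score: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Note that "f1", "auc" and "accuracy" are better when higher, while "rmse" and "mse" are better when lower.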

Installation:

pip install skperopt

Usage:

Just pass in an estimator and a parameter grid, and Hyperopt will do the rest. There is no need to define objectives or write hyperopt-specific parameter grids.

Recipe (vanilla flavour):

  • Import skperopt
  • Initialize skperopt
  • Run skperopt.HyperSearch.search
  • Collect the results

Code example below.

import numpy as np
import pandas as pd
import skperopt as sk

from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# generate classification data
data = make_classification(n_samples=1000, n_features=10, n_classes=2)
X = pd.DataFrame(data[0])
y = pd.DataFrame(data[1])

# init the classifier and define the parameter grid to search
kn = KNeighborsClassifier()
param = {"n_neighbors": [int(x) for x in np.linspace(1, 60, 30)],
         "leaf_size": [int(x) for x in np.linspace(1, 60, 30)],
         "p": [1, 2, 3, 4, 5, 10, 20],
         "algorithm": ['auto', 'ball_tree', 'kd_tree', 'brute'],
         "weights": ["uniform", "distance"]}

# search parameters
search = sk.HyperSearch(kn, X, y, params=param)
search.search()

# gather and apply the best parameters
kn.set_params(**search.best_params)

# view run results
print(search.stats)

HyperSearch parameters

  • est ([sklearn estimator] required)

any sklearn style estimator

  • X ([pandas DataFrame] required)

your training data

  • y ([pandas DataFrame] required)

your training targets (labels)

  • params ([dictionary] required)

a parameter search grid

  • iters (default 500 [int])

number of iterations to try before early stopping

  • time_to_search (default None [int])

time in seconds to run for before early stopping (None = no time limit)

  • cv (default 5 [int])

number of folds to use in cross-validation tests

  • cv_times (default 1 [int])

number of times to perform cross-validation, each time on a new random sample of the data. Higher values decrease variance but increase run time

  • randomState (default 10 [int])

random state for the data shuffling

  • scorer (default "f1" [str])

type of evaluation metric to use - accepts classification "f1","auc","accuracy" or regression "rmse" and "mse"

  • verbose (default 1 [int])

amount of verbosity

     0 = none

     1 = some

     2 = debug

  • random (default False [bool])

whether the data should be randomized during cross-validation

  • foldtype (default "Kfold" [str])

type of folds to use - accepts "KFold", "Stratified"
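
To illustrate how cv and cv_times could interact, here is a minimal pure-Python sketch of repeated K-fold splitting. It assumes that each repeat reshuffles the data before splitting it into folds; this is an illustration of the idea, not Skperopt's actual code, and the function name repeated_kfold_indices is hypothetical.

```python
import random


def repeated_kfold_indices(n_samples, cv=5, cv_times=1, seed=10):
    """Yield (train_idx, test_idx) pairs: cv folds, repeated cv_times times.

    Each repeat reshuffles the sample indices, so folds differ between repeats.
    """
    rng = random.Random(seed)
    for _ in range(cv_times):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Spread any remainder over the first folds so sizes differ by at most 1.
        fold_sizes = [n_samples // cv + (1 if i < n_samples % cv else 0)
                      for i in range(cv)]
        start = 0
        for size in fold_sizes:
            test_idx = indices[start:start + size]
            train_idx = indices[:start] + indices[start + size:]
            yield train_idx, test_idx
            start += size


splits = list(repeated_kfold_indices(n_samples=20, cv=5, cv_times=2))
print(len(splits))  # 10 train/test splits: 5 folds x 2 repeats
```

Averaging a score over all cv x cv_times splits is what reduces its variance, at the cost of cv_times as many model fits per iteration. scikit-learn's RepeatedKFold offers the same kind of splitting for everyday use.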

HyperSearch methods

  • HyperSearch.search() (None)

Used to search the parameter grid using hyperopt. No parameters need to be passed to the function. All parameters are set during initialization.
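
Because the search budget is fixed at initialization, the stopping rule implied by iters and time_to_search can be pictured as a simple loop. The sketch below is an assumption about that behaviour (stop at whichever limit is reached first), not the library's implementation; run_with_budget and step are hypothetical names.

```python
import time


def run_with_budget(step, iters=500, time_to_search=None):
    """Call step(i) until iters calls are made or time_to_search seconds pass.

    Returns the number of calls that completed.
    """
    start = time.monotonic()
    done = 0
    for i in range(iters):
        elapsed = time.monotonic() - start
        if time_to_search is not None and elapsed >= time_to_search:
            break  # time budget exhausted before the iteration budget
        step(i)
        done += 1
    return done


calls = []
completed = run_with_budget(calls.append, iters=5, time_to_search=None)
print(completed)  # 5: with no time limit, every iteration runs
```

With time_to_search=None only the iteration count applies, which matches the "None = no time limit" default described above.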

Testing

Both RandomSearch and Skperopt were run for 100 tests of 150 search iterations each.

Skperopt (hyperopt) performs better than a RandomSearch, producing a higher average f1 score with a smaller standard deviation.
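
For context, a random-search baseline over the same kind of dictionary-of-lists grid can be sketched in a few lines. This is an illustrative stand-in, not the code used for the benchmark: random_search and toy_score are hypothetical names, and toy_score replaces what would normally be a cross-validated f1 score.

```python
import random


def random_search(score_fn, param_grid, n_iter=150, seed=0):
    """Sample n_iter random combinations from the grid and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one value per parameter, independently and uniformly.
        candidate = {name: rng.choice(values)
                     for name, values in param_grid.items()}
        score = score_fn(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical score that peaks at n_neighbors=7 with distance weights."""
    penalty = 0.5 if params["weights"] == "uniform" else 0.0
    return -abs(params["n_neighbors"] - 7) - penalty


grid = {"n_neighbors": list(range(1, 31)), "weights": ["uniform", "distance"]}
best_params, best_score = random_search(toy_score, grid, n_iter=150, seed=0)
print(best_params, best_score)
```

Random search ignores earlier results when drawing the next candidate, whereas hyperopt uses the scores seen so far to decide where to sample next.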

Skperopt Search Results

f1 score over 100 test runs:

Mean 0.9340930

Standard deviation 0.0062275

Random Search Results

f1 score over 100 test runs:

Mean 0.927461652

Standard deviation 0.0063314
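
To put the two sets of figures side by side, the gap between the means can be compared with its standard error. The sketch below computes a Welch-style t-statistic from the numbers reported above, assuming the 100 runs of each method are independent and that the reported values are sample standard deviations; neither assumption is stated in the results.

```python
import math

n = 100  # test runs per method, as reported above

skperopt_mean, skperopt_std = 0.9340930, 0.0062275
random_mean, random_std = 0.927461652, 0.0063314

# Difference in mean f1 and the standard error of that difference.
diff = skperopt_mean - random_mean
standard_error = math.sqrt(skperopt_std ** 2 / n + random_std ** 2 / n)
t_stat = diff / standard_error

print(f"difference in mean f1: {diff:.6f}")
print(f"Welch t-statistic:     {t_stat:.2f}")
```

Under those assumptions the difference in means is several standard errors wide, although the absolute gain in f1 is small (well under 0.01).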


Updates

V0.0.73

  • Added cv_times attr - runs the cross-validation n times (i.e. cv (5x5)) each iteration, each time on a new randomly sampled data set. This should reduce overfitting

V0.0.7

  • Added FIXED RMSE eval metric

  • Added MSE eval metric
