A wrapper for easy plots of learning and validation curves

sk-modelcurves

A Python wrapper built for software engineers and researchers to facilitate easy creation of learning and validation curve plots from scikit-learn.

The module is meant to complement your workflow in scikit-learn and ease the process of evaluating your models.

The module includes many quality-of-life features that should save you time whenever you plot a learning curve to check for bias/variance, or a validation curve to see the effect of tuning a hyperparameter.

Background

If you are not familiar with learning curves, see Andrew Ng’s excellent discussion of their use at http://cs229.stanford.edu/materials/ML-advice.pdf

In the course of writing many research papers and building many models, I found myself copying and pasting the same boilerplate code into almost every project whenever I wanted to plot a learning curve or validation curve to evaluate a model.

Hopefully, this module will save you a few minutes each time you need to plot a learning or validation curve so you can focus on other things.

Install

Python’s pip is the recommended method of installation. From the terminal:

$ pip install sk_modelcurves

Example Usage

Generate a learning curve using accuracy as a metric and 5-fold cross validation.

Assumes a sklearn estimator called knn, training data matrix called X and training labels called y:

>>> import matplotlib.pyplot as plt
>>> from sk_modelcurves.learning_curve import draw_learning_curve
>>> draw_learning_curve(knn, X, y, scoring='accuracy', cv=5)
>>> plt.show()
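
The example above assumes that knn, X and y already exist. As a minimal sketch, here is one way to create them, together with the raw scores that a learning curve is built from. This calls scikit-learn directly and does not use sk_modelcurves; it assumes a recent scikit-learn (0.18 or later, where learning_curve lives in sklearn.model_selection), and the iris data set and n_neighbors=5 are illustrative choices only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import learning_curve
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils import shuffle

# Hypothetical stand-ins for the knn, X and y assumed in the usage example.
X, y = load_iris(return_X_y=True)
# iris is stored sorted by class; shuffle so small training subsets mix classes.
X, y = shuffle(X, y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5)

# Score the estimator on growing training subsets with 5-fold cross validation.
train_sizes, train_scores, valid_scores = learning_curve(
    knn, X, y, scoring="accuracy", cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5),
)

# One row per training size, one column per fold; average over the folds.
for size, train, valid in zip(train_sizes,
                              train_scores.mean(axis=1),
                              valid_scores.mean(axis=1)):
    print("n=%d  train=%.3f  validation=%.3f" % (size, train, valid))
```

A large, persistent gap between the training and validation scores points to high variance, while two low scores that sit close together point to high bias.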

Generate multiple learning curves for several estimators using F1 score as a metric, 5-fold cross validation, and names for each of the estimators.

Assumes 3 sklearn estimators called knn2, knn20, knn40, training data matrix called X and training labels called y:

>>> import matplotlib.pyplot as plt
>>> from sk_modelcurves.learning_curve import draw_learning_curve
>>> draw_learning_curve([knn2, knn20, knn40], X, y, scoring='f1', cv=5,
...     estimator_titles=['2 Neighbors', '20 Neighbors', '40 Neighbors'])
>>> plt.show()
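
For reference, a sketch of how the three estimators assumed above might be built, and of the per-estimator scores that such a comparison rests on. It uses scikit-learn only (0.18 or later), not sk_modelcurves, and the synthetic data set from make_classification is an assumption made purely for illustration. Note that scoring='f1' expects binary labels by default.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical binary data set standing in for X and y.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# The three estimators from the example, keyed by the titles used there.
estimators = {
    "2 Neighbors": KNeighborsClassifier(n_neighbors=2),
    "20 Neighbors": KNeighborsClassifier(n_neighbors=20),
    "40 Neighbors": KNeighborsClassifier(n_neighbors=40),
}

final_scores = {}
for title, estimator in estimators.items():
    sizes, train_scores, valid_scores = learning_curve(
        estimator, X, y, scoring="f1", cv=5, train_sizes=[0.25, 0.5, 1.0]
    )
    # Mean cross-validated F1 at the largest training size.
    final_scores[title] = valid_scores[-1].mean()

for title, score in final_scores.items():
    print("%s: F1 = %.3f" % (title, score))
```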

Many other options are available. Check out the source code docstrings or the upcoming documentation.
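
The examples above cover learning curves only. For the validation-curve case, here is a hedged sketch of the underlying computation using scikit-learn's own validation_curve (scikit-learn 0.18 or later); it does not show the sk_modelcurves API, whose validation-curve function is not documented here. The data set and the range of n_neighbors values are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data set standing in for X and y.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hyperparameter values to sweep (illustrative choice).
param_range = [1, 5, 10, 20, 40]

# Cross-validated scores for each value of n_neighbors.
train_scores, valid_scores = validation_curve(
    KNeighborsClassifier(), X, y,
    param_name="n_neighbors", param_range=param_range,
    scoring="accuracy", cv=5,
)

# One row per parameter value, one column per fold.
mean_valid = valid_scores.mean(axis=1)
best = param_range[int(np.argmax(mean_valid))]
print("best n_neighbors by mean validation accuracy:", best)
```

Plotting the fold-averaged training and validation scores against param_range gives the validation curve itself.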

Dependencies

sk-modelcurves is tested on Python 2.6 and Python 2.7. It has not been tested on Python 3.3+ and should be assumed not to work there until it has been.

The required dependencies include scikit-learn (of course!), numpy >= 1.6.1, and matplotlib >= 1.1.1.

To run tests, you will need nose >= 1.1.2.

Contributing

Anyone is welcome!

If you find a bug or would like to discuss a potential feature, please file an issue first.

Testing

After installation, you can launch the test suite from outside the source directory (you will need to have the nose package installed):

$ nosetests -v sk_modelcurves
