A wrapper for easy plots of learning and validation curves

sk-modelcurves

A Python wrapper built for software engineers and researchers to facilitate easy creation of learning and validation curve plots from scikit-learn.

The module is meant to complement your workflow in scikit-learn and ease the process of evaluating your models.

The module includes many quality-of-life features that should save you time whenever you plot a learning curve to check for bias/variance, or a validation curve to see the effect of tuning a hyperparameter.

Background

For those not familiar with learning curves, check out Andrew Ng’s excellent discussion of their use at http://cs229.stanford.edu/materials/ML-advice.pdf

In the course of writing many research papers and building many models, I found myself copying and pasting the same boilerplate code into almost every project whenever I wanted to plot a learning curve or validation curve to evaluate models.

Hopefully, this module will save you a few minutes each time you need to plot a learning or validation curve so you can focus on other things.
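For context, the kind of boilerplate described above might look like the following sketch. It uses plain scikit-learn and matplotlib, not this package, and is not taken from the sk-modelcurves source. It assumes a recent scikit-learn in which learning_curve lives in sklearn.model_selection (older releases exposed it elsewhere); the synthetic data and the output file name are placeholders chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data, standing in for a real X and y.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5)

# Score the estimator on 5 increasing training-set sizes with 5-fold CV.
train_sizes, train_scores, valid_scores = learning_curve(
    knn, X, y, train_sizes=np.linspace(0.2, 1.0, 5), cv=5, scoring="accuracy"
)

# Average over the CV folds (axis 1) to get one point per training size.
train_mean = train_scores.mean(axis=1)
valid_mean = valid_scores.mean(axis=1)

fig, ax = plt.subplots()
ax.plot(train_sizes, train_mean, "o-", label="Training score")
ax.plot(train_sizes, valid_mean, "o-", label="Cross-validation score")
ax.set_xlabel("Training examples")
ax.set_ylabel("Accuracy")
ax.legend(loc="best")
fig.savefig("learning_curve.png")  # placeholder file name
```

A persistent gap between the two curves suggests high variance, while two low curves that converge suggest high bias.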

Install

Python’s pip is the recommended method of installation. From the terminal:

$ pip install sk_modelcurves

Example Usage

Generate a learning curve using accuracy as a metric and 5-fold cross validation.

Assumes a sklearn estimator called knn, training data matrix called X and training labels called y:

>>> import matplotlib.pyplot as plt
>>> from sk_modelcurves.learning_curve import draw_learning_curve
>>> draw_learning_curve(knn, X, y, scoring='accuracy', cv=5)
>>> plt.show()

Generate multiple learning curves for several estimators using F1 score as a metric, 5-fold cross validation, and names for each of the estimators.

Assumes 3 sklearn estimators called knn2, knn20, knn40, training data matrix called X and training labels called y:

>>> import matplotlib.pyplot as plt
>>> from sk_modelcurves.learning_curve import draw_learning_curve
>>> draw_learning_curve([knn2, knn20, knn40], X, y, scoring='f1', cv=5,
...     estimator_titles=['2 Neighbors', '20 Neighbors', '40 Neighbors'])
>>> plt.show()
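Both examples above assume that the estimators and data already exist. One way to set them up, as a minimal sketch using scikit-learn only, is shown below. The synthetic data set and the value of n_neighbors for knn are assumptions made for illustration; the names knn, knn2, knn20, knn40, X and y match the ones the examples expect.

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class data; binary labels keep scoring='f1' valid,
# since scikit-learn's plain 'f1' scorer targets binary classification.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Single estimator for the first example (5 neighbors is an arbitrary choice).
knn = KNeighborsClassifier(n_neighbors=5)

# Three estimators for the second example, differing only in neighbor count.
knn2 = KNeighborsClassifier(n_neighbors=2)
knn20 = KNeighborsClassifier(n_neighbors=20)
knn40 = KNeighborsClassifier(n_neighbors=40)
```

The data set is deliberately large relative to the biggest neighbor count: a k-nearest-neighbors model cannot be evaluated on a training subset with fewer than k samples, and learning curves start from small subsets.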

Many other options are available. Check out the source code docstrings or the upcoming documentation.
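The examples above cover learning curves only. For validation curves, the sketch below shows the underlying scikit-learn call that such a plot is built on. It does not use this package's own validation-curve interface, which is not shown here. It assumes a recent scikit-learn where validation_curve lives in sklearn.model_selection, and the data and parameter range are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data, standing in for a real X and y.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hyperparameter values to sweep; each one is cross-validated separately.
param_range = [1, 5, 10, 20, 40]
train_scores, valid_scores = validation_curve(
    KNeighborsClassifier(), X, y,
    param_name="n_neighbors", param_range=param_range,
    cv=5, scoring="accuracy",
)

# One row per parameter value, one column per CV fold; average over folds.
for k, tr, va in zip(param_range,
                     train_scores.mean(axis=1),
                     valid_scores.mean(axis=1)):
    print("n_neighbors=%d  train=%.3f  validation=%.3f" % (k, tr, va))
```

Plotting the two averaged score arrays against param_range gives the validation curve.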

Dependencies

sk-modelcurves is tested on Python 2.6 and Python 2.7. Python 3.3+ has not been tested and should be assumed not to work.

The required dependencies include scikit-learn (of course!), numpy >= 1.6.1, and matplotlib >= 1.1.1.

To run tests, you will need nose >= 1.1.2.

Contributing

Anyone is welcome!

If you find a bug or would like to discuss a potential feature, please file an issue first.

Testing

After installation, you can launch the test suite from outside the source directory (you will need to have the nose package installed):

$ nosetests -v sk_modelcurves
