A wrapper for easy plots of learning and validation curves
sk-modelcurves
A Python wrapper built for software engineers and researchers to facilitate easy creation of learning and validation curve plots from scikit-learn.
The module is meant to complement your workflow in scikit-learn and ease the process of evaluating your models.
The module includes many quality-of-life features that should save you time whenever you plot a learning curve to check for bias or variance, or plot a validation curve to see the effect of tuning a hyperparameter.
Background
If you are not familiar with learning curves, see Andrew Ng’s excellent discussion of their use at http://cs229.stanford.edu/materials/ML-advice.pdf
While writing many research papers and building many models, I found myself copying and pasting the same boilerplate code into almost every project whenever I wanted to plot a learning curve or validation curve to evaluate a model.
Hopefully, this module will save you a few minutes each time you need to plot a learning or validation curve so you can focus on other things.
Install
Python’s pip is the recommended method of installation. From the terminal:
$ pip install sk_modelcurves
Example Usage
Generate a learning curve using accuracy as a metric and 5-fold cross validation.
Assumes a sklearn estimator called knn, training data matrix called X and training labels called y:
>>> import matplotlib.pyplot as plt
>>> from sk_modelcurves.learning_curve import draw_learning_curve
>>> draw_learning_curve(knn, X, y, scoring='accuracy', cv=5)
>>> plt.show()
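The names knn, X, and y assumed above are not defined by sk-modelcurves. As a minimal sketch, assuming a recent scikit-learn release and its bundled iris dataset (neither is specified by this project), they could be created as shown here. The snippet also calls scikit-learn's own learning_curve function to print the scores that a learning curve plot visualizes; it is plain scikit-learn and says nothing about how sk-modelcurves works internally.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import learning_curve
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data and estimator (assumptions, not part of sk-modelcurves).
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5)

# Score the estimator on growing subsets of the training folds.
# shuffle=True matters here because the iris rows are sorted by class.
train_sizes, train_scores, valid_scores = learning_curve(
    knn, X, y,
    cv=5,
    scoring="accuracy",
    train_sizes=np.linspace(0.2, 1.0, 5),
    shuffle=True,
    random_state=0,
)

# Each row holds one training-set size; each column holds one CV fold.
for n, train, valid in zip(train_sizes,
                           train_scores.mean(axis=1),
                           valid_scores.mean(axis=1)):
    print("n=%d  train=%.3f  validation=%.3f" % (n, train, valid))
```

A large, persistent gap between the training and validation scores suggests high variance, while two low scores that converge suggest high bias.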
Generate multiple learning curves for several estimators using F1 score as a metric, 5-fold cross validation, and names for each of the estimators.
Assumes 3 sklearn estimators called knn2, knn20, knn40, training data matrix called X and training labels called y:
>>> import matplotlib.pyplot as plt
>>> from sk_modelcurves.learning_curve import draw_learning_curve
>>> draw_learning_curve([knn2, knn20, knn40], X, y, scoring='f1', cv=5, estimator_titles=['2 Neighbors', '20 Neighbors', '40 Neighbors'])
>>> plt.show()
Many other options are available. Check out the source code docstrings or the upcoming documentation.
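The examples above cover learning curves only. To illustrate what a validation curve measures, here is a hedged sketch that uses scikit-learn's validation_curve function directly rather than any sk-modelcurves function, since this description does not show the wrapper's validation curve API. The iris dataset and the range of n_neighbors values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import validation_curve
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data and hyperparameter grid (assumptions for this sketch).
X, y = load_iris(return_X_y=True)
param_range = [1, 5, 10, 20, 40]

# Refit the estimator once per hyperparameter value and CV fold.
train_scores, valid_scores = validation_curve(
    KNeighborsClassifier(), X, y,
    param_name="n_neighbors",
    param_range=param_range,
    cv=5,
    scoring="accuracy",
)

# Average over folds, then pick the value with the best validation score.
mean_valid = valid_scores.mean(axis=1)
best_n_neighbors = param_range[int(np.argmax(mean_valid))]

for k, train, valid in zip(param_range, train_scores.mean(axis=1), mean_valid):
    print("n_neighbors=%d  train=%.3f  validation=%.3f" % (k, train, valid))
print("best n_neighbors: %d" % best_n_neighbors)
```

Plotting the two mean score arrays against param_range gives the validation curve itself.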
Important Links
Official source code repo: https://github.com/MasonGallo/sk-modelcurve
HTML documentation: coming soon!
Issue tracker: https://github.com/MasonGallo/sk-modelcurve/issues
Dependencies
sk-modelcurves is tested on Python 2.6 and Python 2.7. It has not been tested on Python 3.3+ and should be assumed not to work there until it has been.
The required dependencies include scikit-learn (of course!), numpy >= 1.6.1, and matplotlib >= 1.1.1.
To run tests, you will need nose >= 1.1.2.
Contributing
Anyone is welcome!
If you find a bug or would like to discuss a potential feature, please file an issue first.
Testing
After installation, you can launch the test suite from outside the source directory (you will need to have the nose package installed):
$ nosetests -v sk_modelcurves
File details
Details for the file sk_modelcurves-0.4.tar.gz.
File metadata
- Download URL: sk_modelcurves-0.4.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 13d1bb6322e6b626fd22ca1ec96701ee786ae9bdcd0bd4b2a426fc57bad4ab1e
MD5 | c9788efa2168e75c1162e7ddeb825fc8
BLAKE2b-256 | b7bc9b1b2ce082ee62263f057599bba6a368b1c2379040c3bf7cf4cc1934a043
File details
Details for the file sk_modelcurves-0.4-py2-none-any.whl.
File metadata
- Download URL: sk_modelcurves-0.4-py2-none-any.whl
- Upload date:
- Size: 7.9 kB
- Tags: Python 2
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 72826de4b6b9ef969cc5040f4e54270dc9b5db43ddb14ee628b2dcc087d66c23
MD5 | 23a4c5a6e8daf91b3d24e80675b7e574
BLAKE2b-256 | 5e932853444dcc4cb7bb52c08c45ff743e3dab8ae9b5f01d02bcf65b8c9d552e