Tools for the DataCamp Creating Robust Python Projects course
Project description
The datacamprojects python package
Skip the boilerplate of scikit-learn machine learning examples.
Installation
pip install datacamprojects
Usage
In a shell, run datacamprojects with no arguments
to fit a logistic regression on the digits dataset.
This produces a 10 x 10 confusion matrix with the accuracy score at the top.
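To make the output concrete, here is a minimal pure-Python sketch of what a confusion matrix and an accuracy score are. It is not the package's implementation; the helper names `toy_confusion_matrix` and `toy_accuracy` are made up for illustration, and the labels are toy data. Rows index the true label and columns the predicted label, which is the convention scikit-learn's `confusion_matrix` follows.

```python
from collections import Counter


def toy_confusion_matrix(y_true, y_pred, n_classes):
    """Count (true, predicted) label pairs into an n_classes x n_classes grid.

    Rows are true labels, columns are predicted labels.
    """
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in range(n_classes)]
            for t in range(n_classes)]


def toy_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Toy labels with 3 classes; the digits dataset has 10, hence a 10 x 10 grid.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]
cm = toy_confusion_matrix(y_true, y_pred, n_classes=3)
acc = toy_accuracy(cm)
```

Here one sample of class 1 is predicted as class 2, so it lands off the diagonal in row 1, column 2, and accuracy is five correct out of six.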
You can also pass arguments to datacamprojects at the command line.
For example,
datacamprojects -dataset diabetes -model linear_model.Lasso
# Or
datacamprojects -d diabetes -m linear_model.Lasso
will run a linear regression with lasso regularization (L1)
on the diabetes dataset.
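As a rough illustration of how a command line like this could be parsed, the sketch below uses the standard library's `argparse`. It is an assumption, not the package's actual parser: the `build_parser` helper and the default model string `linear_model.LogisticRegression` are hypothetical, chosen to match the no-argument behaviour described earlier, and the flag spellings mirror the commands shown above.

```python
import argparse


def build_parser():
    """Build a parser that accepts the short and long flags shown above."""
    parser = argparse.ArgumentParser(prog="datacamprojects")
    # dest is given explicitly because both spellings use a single dash.
    parser.add_argument("-d", "-dataset", dest="dataset", default="digits",
                        help="name of a built-in scikit-learn dataset")
    parser.add_argument("-m", "-model", dest="model",
                        default="linear_model.LogisticRegression",
                        help="model given as submodule.ClassName")
    return parser


# Parse the example command instead of reading sys.argv.
args = build_parser().parse_args(["-d", "diabetes", "-m", "linear_model.Lasso"])
```

Calling `parse_args([])` on the same parser falls back to the digits dataset and the assumed logistic regression default.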
The dataset argument can be any of
the following built-in scikit-learn datasets:
- Regression
  - boston
  - diabetes
- Classification
  - digits
  - iris
  - wine
  - breast_cancer
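Each of these names corresponds to a `load_<name>` function in `sklearn.datasets`. The sketch below shows one way a name could be mapped to its loader; the `DATASET_TASKS` table and both helper functions are hypothetical and are not taken from the package. Note that `load_boston` was removed in scikit-learn 1.2, so the boston option only works with older scikit-learn versions.

```python
# Task type for each dataset name listed above.
DATASET_TASKS = {
    "boston": "regression",
    "diabetes": "regression",
    "digits": "classification",
    "iris": "classification",
    "wine": "classification",
    "breast_cancer": "classification",
}


def loader_name(dataset):
    """Return the sklearn.datasets loader name, e.g. 'load_digits'."""
    if dataset not in DATASET_TASKS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return f"load_{dataset}"


def load_dataset(dataset):
    """Call the matching scikit-learn loader (requires scikit-learn)."""
    # Imported lazily so the helpers above work without scikit-learn.
    from sklearn import datasets
    return getattr(datasets, loader_name(dataset))()
```

With scikit-learn installed, `load_dataset("digits")` would call `sklearn.datasets.load_digits()`.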
The model argument refers to the model type and name from scikit-learn.
The first part is the submodule, e.g.
- linear_model
- naive_bayes
- ensemble
- svm
while the second is what is actually imported, e.g.
- LinearRegression
- GaussianNB
- RandomForestRegressor
- SVC
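One plausible way to turn a `submodule.ClassName` string into a class is a dynamic import. This is a sketch under that assumption, not the package's code: `resolve_model` and its `package` parameter are hypothetical names. The demonstration line resolves a standard-library class so the block runs without scikit-learn installed.

```python
import importlib


def resolve_model(spec, package="sklearn"):
    """Turn 'submodule.ClassName' into the class object it names."""
    submodule, _, class_name = spec.rpartition(".")
    if not submodule:
        raise ValueError(f"expected 'submodule.ClassName', got {spec!r}")
    # Import e.g. sklearn.linear_model, then look the class up on it.
    module = importlib.import_module(f"{package}.{submodule}")
    return getattr(module, class_name)


# Standard-library stand-in: resolves json.decoder.JSONDecoder.
decoder_cls = resolve_model("decoder.JSONDecoder", package="json")
```

With scikit-learn installed, `resolve_model("linear_model.Lasso")` would return the `Lasso` class, which can then be instantiated and fitted.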
Simplify code to a single function call per step:
from sklearn.metrics import confusion_matrix, accuracy_score
import datacamprojects as dcp
dataset = dcp.get_data('digits')
x_train, x_test, y_train, y_test = dcp.split_data(dataset)
model = dcp.get_model(model_type='ensemble',
                      model_name='RandomForestClassifier')
fit = model.fit(x_train, y_train)
dcp.pickle_model(filename='digits_rf.pickle', model=fit)
predictions = fit.predict(x_test)
confmat = confusion_matrix(y_true=y_test, y_pred=predictions)
accuracy = accuracy_score(y_true=y_test, y_pred=predictions)
dcp.confusion_matrix_plot(cm=confmat,
                          acc=accuracy,
                          filename='digits_rf.png')
Or run a whole pipeline with one function:
import datacamprojects as dcp
dcp.classification(dataset='digits',
                   model_type='ensemble',
                   model_name='RandomForestClassifier',
                   pickle_name='digits_rf.pickle',
                   plot_name='digits_rf.png')
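To reuse a saved model later, something like the following sketch may work. It assumes the file named by pickle_name is written with Python's standard `pickle` module, which this description does not confirm; `load_pickled_model` is a hypothetical helper, and a plain dictionary stands in for the fitted model so the round trip runs without scikit-learn.

```python
import os
import pickle
import tempfile


def load_pickled_model(filename):
    """Read back an object saved with pickle. Only unpickle trusted files."""
    with open(filename, "rb") as handle:
        return pickle.load(handle)


# Round trip with a stand-in object; a fitted estimator would be
# written and read the same way under the assumption above.
stand_in = {"model_name": "RandomForestClassifier"}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "digits_rf.pickle")
    with open(path, "wb") as handle:
        pickle.dump(stand_in, handle)
    restored = load_pickled_model(path)
```

A restored estimator could then call `predict` on new data, provided the same scikit-learn version is installed as when the model was saved.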
For inspiration, look at the example pipeline in the pipeline folder of the datacamprojects repo.
Download files
File details
Details for the file datacamprojects-0.0.1.tar.gz.
File metadata
- Download URL: datacamprojects-0.0.1.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.7.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 93777a733766b35dde7f8752ef3ed0d4397326920deaa3e21b036e2525043a41 |
| MD5 | d90bab66d434020942dd9414cc6b3a98 |
| BLAKE2b-256 | 1c5c0c5b2c742445816c15f8df04f4af5ffed1a260875afd79736c66c83f222a |
File details
Details for the file datacamprojects-0.0.1-py3-none-any.whl.
File metadata
- Download URL: datacamprojects-0.0.1-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.7.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ca96954f9dcc32567294a848bd8f199f83c1ed702fd39617efed1672bcdbdba3 |
| MD5 | b563cf96ea8a76a725fddd13ac59674e |
| BLAKE2b-256 | 5b1a4d723cd837a214a13396f5fcb545b07bf3b3532ac3828d1e6ecdadef52c7 |