Tools for the DataCamp Creating Robust Python Projects course
The datacamprojects Python package
Skip the boilerplate of scikit-learn machine learning examples.
Installation
pip install datacamprojects
Usage
In a shell environment, you can run datacamprojects with no arguments to perform a logistic regression on the digits dataset.
This will produce a 10 x 10 confusion matrix with the Accuracy Score at the top.
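As a rough point of reference, the default run described above corresponds to something like the following in plain scikit-learn. This is a sketch, not the package's implementation: run_default is a hypothetical name, and the split proportions, random_state, and max_iter are assumptions chosen for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split


def run_default(random_state=0):
    """Fit logistic regression on digits; return (confusion matrix, accuracy)."""
    x, y = load_digits(return_X_y=True)
    # Assumption: a stratified 75/25 split; the package may split differently.
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.25, stratify=y, random_state=random_state
    )
    # Assumption: a generous max_iter to reduce convergence warnings.
    model = LogisticRegression(max_iter=5000)
    model.fit(x_train, y_train)
    predictions = model.predict(x_test)
    confmat = confusion_matrix(y_true=y_test, y_pred=predictions)
    accuracy = accuracy_score(y_true=y_test, y_pred=predictions)
    return confmat, accuracy


confmat, accuracy = run_default()
print(confmat.shape, round(accuracy, 3))
```

The matrix is 10 x 10 because the digits dataset has ten classes, one per digit.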
You can also pass arguments to datacamprojects at the command line.
For example,
datacamprojects -dataset diabetes -model linear_model.Lasso
# Or
datacamprojects -d diabetes -m linear_model.Lasso
will run a linear regression with lasso (L1) regularization on the diabetes dataset.
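A parser that accepts the flags shown above might look like the following sketch. It is not taken from the package: build_parser is a hypothetical helper, and the default values are assumptions inferred from the no-argument behaviour described earlier.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the examples above."""
    parser = argparse.ArgumentParser(prog="datacamprojects")
    # Assumption: defaults match the no-argument run
    # (logistic regression on the digits dataset).
    parser.add_argument("-d", "-dataset", dest="dataset", default="digits")
    parser.add_argument(
        "-m", "-model", dest="model", default="linear_model.LogisticRegression"
    )
    return parser


args = build_parser().parse_args(["-d", "diabetes", "-m", "linear_model.Lasso"])
print(args.dataset, args.model)
```

Setting dest explicitly keeps the attribute names readable, since argparse would otherwise derive them from the first option string when no double-dash form is given.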
The dataset argument can be any of the following built-in scikit-learn datasets:
- Regression
  - boston
  - diabetes
- Classification
  - digits
  - iris
  - wine
  - breast_cancer
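Each name above matches a load_<name> function in sklearn.datasets, so a lookup could be sketched as follows. The helper names task_for and load_dataset are hypothetical and are not part of datacamprojects. Note that load_boston was removed in scikit-learn 1.2, so the boston entry only works with older releases.

```python
import importlib

# Dataset names from the list above, grouped by task.
DATASETS = {
    "regression": ("boston", "diabetes"),
    "classification": ("digits", "iris", "wine", "breast_cancer"),
}


def task_for(dataset):
    """Return 'regression' or 'classification' for a supported dataset name."""
    for task, names in DATASETS.items():
        if dataset in names:
            return task
    raise ValueError(f"unsupported dataset: {dataset!r}")


def load_dataset(dataset):
    """Call sklearn.datasets.load_<dataset>() for a supported name."""
    task_for(dataset)  # validates the name before importing scikit-learn
    datasets = importlib.import_module("sklearn.datasets")
    return getattr(datasets, f"load_{dataset}")()


print(task_for("wine"))
```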
The model argument gives the model type and name from scikit-learn, joined by a dot (for example, linear_model.Lasso).
The first part is the submodule, e.g.
- linear_model
- naive_bayes
- ensemble
- svm
while the second is what is actually imported, e.g.
- LinearRegression
- GaussianNB
- RandomForestRegressor
- SVC
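One way to turn a submodule.ClassName string into a class is a dynamic import, sketched below. This is an assumption about how such a lookup could work, not the package's actual code; resolve_model and its package parameter are hypothetical.

```python
import importlib


def resolve_model(spec, package="sklearn"):
    """Import ClassName from package.submodule given 'submodule.ClassName'."""
    submodule, _, name = spec.rpartition(".")
    if not submodule or not name:
        raise ValueError(f"expected 'submodule.ClassName', got {spec!r}")
    module = importlib.import_module(f"{package}.{submodule}")
    return getattr(module, name)


# With scikit-learn installed, resolve_model("linear_model.Lasso") returns
# the sklearn.linear_model.Lasso class, ready to be instantiated.
```

The package parameter exists only so the helper can be exercised without scikit-learn, for example resolve_model("message.Message", package="email").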
Simplify code to a single function call per step:
from sklearn.metrics import confusion_matrix, accuracy_score
import datacamprojects as dcp

# Load the data and split it into training and test sets
dataset = dcp.get_data('digits')
x_train, x_test, y_train, y_test = dcp.split_data(dataset)

# Build, fit, and save the model
model = dcp.get_model(model_type='ensemble',
                      model_name='RandomForestClassifier')
fit = model.fit(x_train, y_train)
dcp.pickle_model(filename='digits_rf.pickle', model=fit)

# Evaluate on the test set and plot the confusion matrix
predictions = fit.predict(x_test)
confmat = confusion_matrix(y_true=y_test, y_pred=predictions)
accuracy = accuracy_score(y_true=y_test, y_pred=predictions)
dcp.confusion_matrix_plot(cm=confmat,
                          acc=accuracy,
                          filename='digits_rf.png')
Or run a whole pipeline with one function:
import datacamprojects as dcp

dcp.classification(dataset='digits',
                   model_type='ensemble',
                   model_name='RandomForestClassifier',
                   pickle_name='digits_rf.pickle',
                   plot_name='digits_rf.png')
For inspiration, look at the example pipeline in the pipeline folder of the datacamprojects repo.
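The pickle file written in the examples above can presumably be read back with the standard library. The sketch below assumes that the file is an ordinary pickle of the fitted model, which the source does not confirm; load_model is a hypothetical helper, not part of datacamprojects.

```python
import pickle


def load_model(filename):
    """Load an object from a pickle file. Only unpickle files you trust."""
    with open(filename, "rb") as handle:
        return pickle.load(handle)


# Example, assuming digits_rf.pickle was produced as shown above:
# model = load_model("digits_rf.pickle")
# predictions = model.predict(x_test)
```

Unpickling a scikit-learn model generally expects the same scikit-learn version that was used to save it.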