Package for easy cross-validation
CrossPredict
- Makes cross-validation and report generation easy
- Easy to extend to other models
- Supports LightGBM, XGBoost, CatBoost
- Supports different cross-validation strategies:
  - Cross-validation by users (RepeatedKFold)
  - Stratified cross-validation by target column (RepeatedStratifiedKFold)
  - Simple cross-validation (RepeatedKFold)
- Easy use of target encoding with double cross-validation
  - Supports the category_encoders target encoding library
- ML pipeline building blocks
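The three cross-validation strategies listed above can be illustrated with scikit-learn alone. This is a minimal sketch, not CrossPredict's own implementation: the `split_by_users` helper, the synthetic `user_ids` and `target` arrays, and the fold counts are all assumptions made for the example. Splitting "by users" is interpreted here as running RepeatedKFold over the unique user IDs, so that all rows of one user land on the same side of a split.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold, RepeatedStratifiedKFold

rng = np.random.default_rng(0)
n_rows = 200
user_ids = rng.integers(0, 40, size=n_rows)  # hypothetical user id column
target = rng.integers(0, 2, size=n_rows)     # hypothetical binary target


def split_by_users(user_ids, n_splits=5, n_repeats=2, random_state=0):
    """Yield (train_idx, valid_idx) so that no user appears on both sides."""
    unique_users = np.unique(user_ids)
    rkf = RepeatedKFold(n_splits=n_splits, n_repeats=n_repeats,
                        random_state=random_state)
    # Folds are built over users, then mapped back to row positions.
    for _, valid_users in rkf.split(unique_users):
        valid_mask = np.isin(user_ids, unique_users[valid_users])
        yield np.flatnonzero(~valid_mask), np.flatnonzero(valid_mask)


# 1. Cross-validation by users
user_folds = list(split_by_users(user_ids))

# 2. Stratified cross-validation by target column
rskf = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)
strat_folds = list(rskf.split(np.zeros(n_rows), target))

# 3. Simple cross-validation over rows
rkf = RepeatedKFold(n_splits=5, n_repeats=2, random_state=0)
simple_folds = list(rkf.split(np.zeros(n_rows)))
```

Each list holds `n_splits * n_repeats` pairs of index arrays, which can be used to slice a DataFrame with `iloc`.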
Installation
python -m pip install crosspredict
Reports Preview
# Create the report object (ReportBinary is provided by crosspredict).
a = ReportBinary()

# First report: a 5x2 grid; each entry places a plot at grid position 'loc'.
# df is assumed to be a DataFrame holding the score, target and generation columns.
a.plot_report(
    df,
    report_shape=(5, 2),
    report={'Roc-Auc': {'loc': (0, 0)},
            'Precision-Recall': [{'loc': (0, 1)}],
            'MeanTarget-by-Probability': [{'loc': (1, 0)}, {'loc': (1, 1)}],
            'Gini-by-Generations': {'loc': (2, 0), 'colspan': 2},
            'MeanTarget-by-Generations': {'loc': (3, 0), 'colspan': 2},
            'Probability-Distribution': [{'loc': (4, 0)}, {'loc': (4, 1)}]},
    cols_score=['result_egr_to_one', 'probability'],
    cols_target=['target', 'target2'],
    col_generation_deals='first_dt_no_comm_mon'
)
a.fig.savefig('report1.png')

# Second report: a 3x2 grid with a single score column.
a.plot_report(report_shape=(3, 2),
              report={'roc-auc': {'loc': (0, 0)},
                      'precision-recall': {'loc': (0, 1)},
                      'mean-prob': {'loc': (1, 0)},
                      'gen-gini': [{'loc': (2, 0), 'colspan': 2}],
                      'distribution': {'loc': (1, 1)}
                      },
              cols_score=['probability'])
a.fig.savefig('report2.png')
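To make the 'Gini-by-Generations' panel more concrete, the quantity it plots can be approximated by hand. The sketch below assumes the usual definition Gini = 2 * ROC AUC - 1, computed separately for each value of the generation column. The `gini_by_generation` helper and the synthetic `df_demo` frame are hypothetical and are not part of CrossPredict.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 600
# Synthetic data: three "generations", a binary target and a score
# that is deliberately correlated with the target.
df_demo = pd.DataFrame({
    "generation": rng.choice(["2020-01", "2020-02", "2020-03"], size=n),
    "target": rng.integers(0, 2, size=n),
})
df_demo["probability"] = 0.3 * df_demo["target"] + rng.uniform(0, 0.7, size=n)


def gini_by_generation(df, col_score, col_target, col_generation):
    """Return Gini (2 * AUC - 1) for each generation as a Series."""
    rows = {}
    for gen, part in df.groupby(col_generation):
        if part[col_target].nunique() < 2:
            # AUC is undefined when only one class is present.
            rows[gen] = float("nan")
        else:
            auc = roc_auc_score(part[col_target], part[col_score])
            rows[gen] = 2 * auc - 1
    return pd.Series(rows, name="gini").sort_index()


gini = gini_by_generation(df_demo, "probability", "target", "generation")
print(gini)
```

A Gini that drifts downward across generations is a common sign that the score is losing discriminating power on newer data.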
Target Encoding with Double Cross-Validation
# Iterator, CrossTargetEncoder and CrossLightgbmModel are provided by crosspredict;
# train, feature_name and params are assumed to be defined earlier.
from category_encoders import WOEEncoder

# Create the iterator that defines the cross-validation folds.
iter_df = Iterator(n_repeats=3,
                   n_splits=10,
                   random_state=0,
                   col_client='userid',
                   cv_byclient=True)

# Fit the target encoder (creates mappings for each fold).
cross_encoder = CrossTargetEncoder(iterator=iter_df,
                                   encoder_class=WOEEncoder,
                                   n_splits=5,
                                   n_repeats=3,
                                   random_state=0,
                                   col_client='userid',
                                   cv_byclient=True,
                                   col_encoded='goal1',
                                   cols=['field3', 'field2', 'field11', 'field23', 'field18', 'field20']
                                   )
cross_encoder.fit(train)

# Train the cross-validation models.
model_class = CrossLightgbmModel(iterator=iter_df,
                                 feature_name=feature_name,
                                 params=params,
                                 cols_cat=['field3', 'field2', 'field11', 'field23', 'field18', 'field20'],
                                 num_boost_round=9999,
                                 early_stopping_rounds=50,
                                 valid=True,
                                 random_state=0,
                                 col_target='goal1',
                                 cross_target_encoder=cross_encoder)
result = model_class.fit(train)
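The idea behind double cross-validation for target encoding can be sketched without the library. The example below is an assumption-laden illustration, not CrossTargetEncoder's actual logic: it uses plain mean target encoding instead of WOEEncoder, ordinary KFold instead of the user-based iterator, and the helper names (`fit_mapping`, `apply_mapping`, `double_cv_encode`) are hypothetical. For every outer fold, validation rows are encoded with a mapping learned on the outer training part, while training rows are encoded out-of-fold through an inner split, so a row's own target never feeds its own encoding.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold


def fit_mapping(cat, y):
    """Mean target per category, plus the global mean as a fallback."""
    return y.groupby(cat).mean(), y.mean()


def apply_mapping(cat, mapping, prior):
    """Encode categories; unseen categories fall back to the prior."""
    return cat.map(mapping).fillna(prior).astype(float)


def double_cv_encode(df, col_cat, col_target,
                     outer_splits=5, inner_splits=3, seed=0):
    """Yield (train_encoded, valid_encoded) for each outer fold."""
    outer = KFold(n_splits=outer_splits, shuffle=True, random_state=seed)
    for tr_idx, va_idx in outer.split(df):
        train, valid = df.iloc[tr_idx], df.iloc[va_idx]

        # Validation rows: mapping learned on the whole outer-train part.
        mapping, prior = fit_mapping(train[col_cat], train[col_target])
        valid_enc = apply_mapping(valid[col_cat], mapping, prior)

        # Training rows: out-of-fold encoding from the inner split.
        train_enc = pd.Series(np.nan, index=train.index)
        inner = KFold(n_splits=inner_splits, shuffle=True, random_state=seed)
        for in_tr, in_va in inner.split(train):
            m, p = fit_mapping(train[col_cat].iloc[in_tr],
                               train[col_target].iloc[in_tr])
            encoded = apply_mapping(train[col_cat].iloc[in_va], m, p)
            train_enc.iloc[in_va] = encoded.to_numpy()
        yield train_enc, valid_enc


rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "field3": rng.choice(list("abcd"), size=300),  # synthetic categorical column
    "goal1": rng.integers(0, 2, size=300),         # synthetic binary target
})
folds = list(double_cv_encode(demo, "field3", "goal1"))
```

Each element of `folds` pairs the leakage-free encoding of the training rows with the encoding of the held-out rows for that outer fold, which is the shape of input a per-fold model would be trained and validated on.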
How to use
ML pipeline calculation: ML_Pipeline.ipynb
Configuration file for the pipeline: params.yaml
Plotting reports for a binary classification problem: Plot_Reports_for_Binary_Classification_problem_example.ipynb
Simple example in one notebook: Simple_example_in_one_Notebook.ipynb
Iterator class: Iterator_class.ipynb
CrossModelFabric class: CrossModelFabric_class.ipynb
CrossTargetEncoder class: CrossTargetEncoder_class.ipynb
Authors