Package for easy cross-validation
Project description
CrossPredict
- Makes cross-validation and report generation easy
- Easy to extend to other models
- Supports LightGBM, XGBoost, CatBoost
- Supports different cross-validation strategies:
  - Cross-validation by user (RepeatedKFold)
  - Stratified cross-validation by target column (RepeatedStratifiedKFold)
  - Simple cross-validation (RepeatedKFold)
- Easy use of target encoding with double cross-validation
- Supports the target encoding library category_encoders
- ML pipeline building blocks
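To make "cross-validation by user" concrete, here is a minimal standard-library sketch of one way to do it: the unique user IDs are shuffled and split into folds, and every row follows its user. The function name `repeated_kfold_by_user` and its parameters are hypothetical and are not part of the crosspredict API; the library's own splitting may differ in detail.

```python
import random
from typing import Generator, List, Sequence, Tuple


def repeated_kfold_by_user(
    user_ids: Sequence[str],
    n_splits: int = 5,
    n_repeats: int = 2,
    random_state: int = 0,
) -> Generator[Tuple[List[int], List[int]], None, None]:
    """Yield (train_rows, valid_rows) index lists, splitting on unique users."""
    rng = random.Random(random_state)
    unique_users = sorted(set(user_ids))
    if n_splits < 2 or n_splits > len(unique_users):
        raise ValueError("n_splits must be between 2 and the number of unique users")
    for _ in range(n_repeats):
        shuffled = unique_users[:]
        rng.shuffle(shuffled)  # a fresh user order for every repeat
        for fold in range(n_splits):
            # Every n_splits-th user, starting at `fold`, goes to validation.
            valid_users = set(shuffled[fold::n_splits])
            train_rows = [i for i, u in enumerate(user_ids) if u not in valid_users]
            valid_rows = [i for i, u in enumerate(user_ids) if u in valid_users]
            yield train_rows, valid_rows


# Toy data: ten rows belonging to six users (illustrative only).
user_ids = ["u1", "u1", "u2", "u3", "u3", "u3", "u4", "u5", "u6", "u6"]
splits = list(repeated_kfold_by_user(user_ids, n_splits=3, n_repeats=2))
for train_rows, valid_rows in splits:
    print(len(train_rows), len(valid_rows))
```

Because the split is made on users rather than rows, no user contributes rows to both the training and the validation part of a fold, which avoids leaking user-specific patterns into validation.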
Installation
python -m pip install crosspredict
Reports Preview
# import path assumed from the package layout; check the example notebooks
from crosspredict.report_binary import ReportBinary

# create the report object
a = ReportBinary()
# df must contain the score, target and generation columns named below
a.plot_report(
    df,
    report_shape=(5, 2),
    report={
        'Roc-Auc': {'loc': (0, 0)},
        'Precision-Recall': [{'loc': (0, 1)}],
        'MeanTarget-by-Probability': [{'loc': (1, 0)}, {'loc': (1, 1)}],
        'Gini-by-Generations': {'loc': (2, 0), 'colspan': 2},
        'MeanTarget-by-Generations': {'loc': (3, 0), 'colspan': 2},
        'Probability-Distribution': [{'loc': (4, 0)}, {'loc': (4, 1)}],
    },
    cols_score=['result_egr_to_one', 'probability'],
    cols_target=['target', 'target2'],
    col_generation_deals='first_dt_no_comm_mon',
)
# save the rendered figure
a.fig.savefig('report1.png')
a.plot_report(
    report_shape=(3, 2),
    report={
        'roc-auc': {'loc': (0, 0)},
        'precision-recall': {'loc': (0, 1)},
        'mean-prob': {'loc': (1, 0)},
        'gen-gini': [{'loc': (2, 0), 'colspan': 2}],
        'distribution': {'loc': (1, 1)},
    },
    cols_score=['probability'],
)
a.fig.savefig('report2.png')
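As a rough illustration of what a Gini-by-generations plot summarizes, the sketch below computes ROC AUC per generation and converts it with the standard relation Gini = 2 * AUC - 1. It assumes a "generation" is a cohort label such as a month; the helper names and toy data are hypothetical, and the library may compute its plots differently.

```python
from collections import defaultdict
from typing import Dict, Sequence


def roc_auc(targets: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(targets, scores) if t == 1]
    neg = [s for t, s in zip(targets, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative row")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini_by_generation(
    generations: Sequence[str],
    targets: Sequence[int],
    scores: Sequence[float],
) -> Dict[str, float]:
    """Group rows by generation and return Gini = 2 * AUC - 1 for each group."""
    groups = defaultdict(lambda: ([], []))
    for g, t, s in zip(generations, targets, scores):
        groups[g][0].append(t)
        groups[g][1].append(s)
    return {g: 2.0 * roc_auc(t, s) - 1.0 for g, (t, s) in sorted(groups.items())}


# Toy data: two monthly cohorts with four scored rows each (illustrative only).
generations = ["2020-01"] * 4 + ["2020-02"] * 4
targets = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.8, 0.9, 0.3, 0.4, 0.6, 0.7]
gini = gini_by_generation(generations, targets, scores)
print(gini)  # {'2020-01': 1.0, '2020-02': 0.5}
```

A Gini that drops from one generation to the next is a common signal that the score separates the classes less well on newer data.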
Target Encoding with Double Cross-Validation
from category_encoders import WOEEncoder
# crosspredict import paths below are assumed; check the example notebooks
from crosspredict.iterator import Iterator
from crosspredict.target_encoder import CrossTargetEncoder
from crosspredict.crossval import CrossLightgbmModel

# train, feature_name and params must be defined beforehand

# create the iterator that produces the cross-validation folds
iter_df = Iterator(
    n_repeats=3,
    n_splits=10,
    random_state=0,
    col_client='userid',
    cv_byclient=True,
)

# fit the target encoder (creates mappings for each fold)
cross_encoder = CrossTargetEncoder(
    iterator=iter_df,
    encoder_class=WOEEncoder,
    n_splits=5,
    n_repeats=3,
    random_state=0,
    col_client='userid',
    cv_byclient=True,
    col_encoded='goal1',
    cols=['field3', 'field2', 'field11', 'field23', 'field18', 'field20'],
)
cross_encoder.fit(train)

# train the cross-validation models
model_class = CrossLightgbmModel(
    iterator=iter_df,
    feature_name=feature_name,
    params=params,
    cols_cat=['field3', 'field2', 'field11', 'field23', 'field18', 'field20'],
    num_boost_round=9999,
    early_stopping_rounds=50,
    valid=True,
    random_state=0,
    col_target='goal1',
    cross_target_encoder=cross_encoder,
)
result = model_class.fit(train)
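The idea behind double cross-validation for target encoding can be sketched with the standard library alone. This is one common scheme, not code taken from crosspredict: for each outer fold, validation rows are encoded with a mapping learned from the outer-training rows, while the outer-training rows themselves receive out-of-fold values from an inner split. For brevity the sketch uses plain mean target encoding instead of WOEEncoder, and every function name here is hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]


def kfold(rows: Sequence[int], n_splits: int, seed: int) -> List[Split]:
    """Shuffle `rows` and cut them into n_splits (train, valid) pairs."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_splits] for i in range(n_splits)]
    return [
        ([r for j, f in enumerate(folds) if j != i for r in f], folds[i])
        for i in range(n_splits)
    ]


def fit_mean_encoder(cats, targets, rows) -> Tuple[Dict[str, float], float]:
    """Return (category -> mean target, global mean) learned from `rows` only."""
    sums: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for r in rows:
        sums[cats[r]] = sums.get(cats[r], 0.0) + targets[r]
        counts[cats[r]] = counts.get(cats[r], 0) + 1
    prior = sum(targets[r] for r in rows) / len(rows)
    return {c: sums[c] / counts[c] for c in sums}, prior


def double_cv_encode(cats, targets, n_outer=3, n_inner=2, seed=0):
    """For each outer fold, return (train_rows, valid_rows, encoded value per row)."""
    n = len(cats)
    per_fold = []
    for fold_no, (train_rows, valid_rows) in enumerate(kfold(range(n), n_outer, seed)):
        encoded = [0.0] * n
        # Validation rows: mapping learned from the whole outer-training part.
        mapping, prior = fit_mean_encoder(cats, targets, train_rows)
        for r in valid_rows:
            encoded[r] = mapping.get(cats[r], prior)
        # Training rows: out-of-fold values from an inner split, so no row
        # is ever encoded with a mapping that saw its own target.
        for inner_train, inner_valid in kfold(train_rows, n_inner, seed + fold_no + 1):
            mapping, prior = fit_mean_encoder(cats, targets, inner_train)
            for r in inner_valid:
                encoded[r] = mapping.get(cats[r], prior)
        per_fold.append((train_rows, valid_rows, encoded))
    return per_fold


# Toy data: one categorical column and a binary target (illustrative only).
cats = ["a", "b", "c"] * 4
targets = [1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1]
result = double_cv_encode(cats, targets)
for train_rows, valid_rows, encoded in result:
    print(sorted(valid_rows), [round(encoded[r], 2) for r in sorted(valid_rows)])
```

Categories unseen in the fitting rows fall back to the global mean of those rows. A model trained on each outer fold then sees encoded features that never used the target of the row being encoded.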
How to use
- ML pipeline calculation: ML_Pipeline.ipynb
- Configuration file for the pipeline: params.yaml
- Report plots for a binary classification problem: Plot_Reports_for_Binary_Classification_problem_example.ipynb
- Simple example in one notebook: Simple_example_in_one_Notebook.ipynb
- Iterator class: Iterator_class.ipynb
- CrossModelFabric class: CrossModelFabric_class.ipynb
- CrossTargetEncoder class: CrossTargetEncoder_class.ipynb
Authors