
Package for easy cross-validation


CrossPredict


  • Makes cross-validation and report generation easy
  • Easy to extend to other models
  • Supports LightGBM, XGBoost, and CatBoost
  • Supports several cross-validation strategies:
    • Cross-validation by users (RepeatedKFold)
    • Stratified cross-validation by target column (RepeatedStratifiedKFold)
    • Simple cross-validation (RepeatedKFold)
  • Easy target encoding with double cross-validation
  • Supports the category_encoders target encoding library
  • ML pipeline building blocks
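
To make "cross-validation by users" concrete, here is a minimal standard-library sketch of the idea: the folds are built from unique user IDs, so every row of a given user ends up on the same side of each split. It is not CrossPredict's implementation; the function name `repeated_user_folds` and the sample data are hypothetical.

```python
import random
from collections import defaultdict


def repeated_user_folds(user_ids, n_splits=3, n_repeats=2, random_state=0):
    """Yield (train_idx, valid_idx) pairs, splitting on unique users.

    Hypothetical helper for illustration only. All rows of one user fall
    on the same side of every split, so a validation score is not
    inflated by the model having already seen that user.
    """
    rows_by_user = defaultdict(list)
    for row, user in enumerate(user_ids):
        rows_by_user[user].append(row)

    rng = random.Random(random_state)
    users = sorted(rows_by_user)
    for _ in range(n_repeats):
        shuffled = users[:]
        rng.shuffle(shuffled)  # a fresh user order for every repeat
        for fold in range(n_splits):
            valid_users = set(shuffled[fold::n_splits])
            train_idx = [r for u in users if u not in valid_users
                         for r in rows_by_user[u]]
            valid_idx = [r for u in users if u in valid_users
                         for r in rows_by_user[u]]
            yield train_idx, valid_idx


# Made-up data: ten rows belonging to six users.
user_ids = ["a", "a", "b", "c", "c", "c", "d", "e", "e", "f"]
splits = list(repeated_user_folds(user_ids))
print(len(splits))  # n_splits * n_repeats pairs
```

Splitting row by row instead would let the same user appear in both the training and validation parts, which tends to make validation metrics look better than they are.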


Installation

python -m pip install crosspredict
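
One way to confirm the install worked is to ask Python for the installed version. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if crosspredict is installed, otherwise None.
print(installed_version("crosspredict"))
```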

Reports Preview

# NOTE: the import path is an assumption; check the package documentation.
from crosspredict.report_binary import ReportBinary

# df is expected to contain the score, target, and generation columns named below

# create the report object
a = ReportBinary()
a.plot_report(
    df,
    report_shape=(5, 2),
    report={'Roc-Auc': {'loc': (0, 0)},
            'Precision-Recall': [{'loc': (0, 1)}],
            'MeanTarget-by-Probability': [{'loc': (1, 0)}, {'loc': (1, 1)}],
            'Gini-by-Generations': {'loc': (2, 0), 'colspan': 2},
            'MeanTarget-by-Generations': {'loc': (3, 0), 'colspan': 2},
            'Probability-Distribution': [{'loc': (4, 0)}, {'loc': (4, 1)}]},
    cols_score=['result_egr_to_one', 'probability'],
    cols_target=['target', 'target2'],
    col_generation_deals='first_dt_no_comm_mon'
)
# save the rendered figure
a.fig.savefig('report1.png')

Report1
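
As a rough illustration of what a mean-target-by-probability panel summarizes, the sketch below groups rows into equal-width probability bins and compares the mean predicted probability with the observed mean target in each bin. The binning scheme, the function name `mean_target_by_probability`, and the data are assumptions for illustration; the library may bin differently.

```python
def mean_target_by_probability(probabilities, targets, n_bins=5):
    """Summarize observed targets per equal-width probability bin.

    Hypothetical helper. Returns one tuple per non-empty bin:
    (bin_lower, bin_upper, mean_probability, mean_target, count).
    """
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probabilities, targets):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        sums[b][0] += p
        sums[b][1] += y
        sums[b][2] += 1
    rows = []
    for b, (p_sum, y_sum, n) in enumerate(sums):
        if n:  # skip bins that received no rows
            rows.append((b / n_bins, (b + 1) / n_bins, p_sum / n, y_sum / n, n))
    return rows


# Made-up scores and binary targets.
probs = [0.05, 0.10, 0.30, 0.35, 0.55, 0.90, 0.95, 1.00]
target = [0, 0, 0, 1, 1, 1, 1, 1]
table = mean_target_by_probability(probs, target, n_bins=5)
for row in table:
    print(row)
```

For a well-calibrated score, the mean target in each bin stays close to the mean probability of that bin.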

# a second, smaller layout built from a single score column
a.plot_report(report_shape=(3, 2),
              report={'roc-auc': {'loc': (0, 0)},
                      'precision-recall': {'loc': (0, 1)},
                      'mean-prob': {'loc': (1, 0)},
                      'gen-gini': [{'loc': (2, 0), 'colspan': 2}],
                      'distribution': {'loc': (1, 1)}
                      },
              cols_score=['probability'])
a.fig.savefig('report2.png')

Report2
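
The Gini panels can be read through the usual relation Gini = 2 * ROC-AUC - 1. The sketch below computes ROC-AUC by pair counting and then one Gini value per generation (for example, per month of the deal). It assumes this is the definition behind the plots; `roc_auc`, `gini_by_generation`, and the data are hypothetical.

```python
from collections import defaultdict


def roc_auc(targets, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when only one class
    is present, because the metric is undefined in that case.
    """
    pos = [s for y, s in zip(targets, scores) if y == 1]
    neg = [s for y, s in zip(targets, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini_by_generation(generations, targets, scores):
    """Return {generation: Gini}, with Gini = 2 * AUC - 1 inside each group."""
    groups = defaultdict(lambda: ([], []))
    for g, y, s in zip(generations, targets, scores):
        groups[g][0].append(y)
        groups[g][1].append(s)
    result = {}
    for g in sorted(groups):
        auc = roc_auc(*groups[g])
        result[g] = None if auc is None else 2 * auc - 1
    return result


# Made-up data: two generations with four rows each.
generations = ["2020-01"] * 4 + ["2020-02"] * 4
targets = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.6]
print(gini_by_generation(generations, targets, scores))
# -> {'2020-01': 0.5, '2020-02': 1.0}
```

Pair counting is quadratic in the number of rows, which is fine for a sketch but too slow for large datasets.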

Target Encoding with Double Cross-Validation

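# NOTE: the crosspredict import paths are assumptions; check the package documentation.
from category_encoders import WOEEncoder
from crosspredict.iterator import Iterator
from crosspredict.target_encoder import CrossTargetEncoder
from crossval_placeholder import CrossLightgbmModel  # hypothetical path, replace with the real module

# train, params, and feature_name must be defined by the user beforehand

# create the outer cross-validation iterator (folds are split by user)
iter_df = Iterator(n_repeats=3,
                   n_splits=10,
                   random_state=0,
                   col_client='userid',
                   cv_byclient=True)

# fit the target encoder (creates mappings for each fold)
cross_encoder = CrossTargetEncoder(iterator=iter_df,
                                   encoder_class=WOEEncoder,
                                   n_splits=5,
                                   n_repeats=3,
                                   random_state=0,
                                   col_client='userid',
                                   cv_byclient=True,
                                   col_encoded='goal1',
                                   cols=['field3', 'field2', 'field11', 'field23', 'field18', 'field20']
                                   )
cross_encoder.fit(train)

# train the cross-validation models
model_class = CrossLightgbmModel(iterator=iter_df,
                                 feature_name=feature_name,
                                 params=params,
                                 cols_cat=['field3', 'field2', 'field11', 'field23', 'field18', 'field20'],
                                 num_boost_round=9999,
                                 early_stopping_rounds=50,
                                 valid=True,
                                 random_state=0,
                                 col_target='goal1',
                                 cross_target_encoder=cross_encoder)
result = model_class.fit(train)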
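
The point of fitting a separate mapping per fold is that a row should never be encoded with statistics computed from its own target. The sketch below shows that idea with out-of-fold encoding in plain Python. It uses a smoothed mean target rather than the WOE encoder from the example above, and it does not reproduce CrossTargetEncoder; `category_means`, `out_of_fold_encode`, and the data are hypothetical.

```python
import random
from collections import defaultdict


def category_means(categories, targets, prior, smoothing=1.0):
    """Smoothed mean target per category, shrunk toward the prior."""
    total = defaultdict(float)
    count = defaultdict(int)
    for c, y in zip(categories, targets):
        total[c] += y
        count[c] += 1
    return {c: (total[c] + prior * smoothing) / (count[c] + smoothing)
            for c in count}


def out_of_fold_encode(categories, targets, n_splits=3, random_state=0):
    """Encode each row with a mapping fitted on the other folds only."""
    rows = list(range(len(categories)))
    random.Random(random_state).shuffle(rows)
    encoded = [None] * len(rows)
    for fold in range(n_splits):
        valid = set(rows[fold::n_splits])
        train = [i for i in rows if i not in valid]
        prior = sum(targets[i] for i in train) / len(train)
        mapping = category_means([categories[i] for i in train],
                                 [targets[i] for i in train], prior)
        for i in valid:
            # categories unseen in the training folds fall back to the prior
            encoded[i] = mapping.get(categories[i], prior)
    return encoded


# Made-up categorical feature and binary target.
categories = ["x", "x", "y", "y", "x", "y", "z", "x", "y"]
targets = [1, 0, 0, 0, 1, 1, 1, 1, 0]
encoded = out_of_fold_encode(categories, targets)
print(encoded)
```

In a double cross-validation setup, this inner procedure would be repeated inside the training part of every outer fold, so the validation rows of the outer fold never influence the encoding used to train the model.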

How to use

  • ML pipeline calculation: ML_Pipeline.ipynb (configuration file for the pipeline: params.yaml)
  • Report plots for a binary classification problem: Plot_Reports_for_Binary_Classification_problem_example.ipynb
  • Simple example in one notebook: Simple_example_in_one_Notebook.ipynb
  • Iterator class: Iterator_class.ipynb
  • CrossModelFabric class: CrossModelFabric_class.ipynb
  • CrossTargetEncoder class: CrossTargetEncoder_class.ipynb

Authors

Vladislav Boyadzhi, Alexander Tarelkin

