
Functions to work with BVS enriched data


bvslusa

Install

pip install bvslusa

How to use

import pandas as pd
from bvslusa.validate import remove_restritivos, target_mapping
from bvslusa.evaluate import evaluate_bvs_scores
from bvslusa.ratings import get_ratings
df = pd.read_csv('../data/AVZA_FB727003.csv', sep=';')
df
       DOC_NUMBER   SAFRA  QTD_SCPC  VL_SCPC  QTD_CCF  QTD_PROTESTO  VL_PROTESTO  FLAG_RESTRITIVO  SCRCRDMERPJ3  SCRCRDMERPJ4  SCRCRDMERPJ5  SCRCRDATACAD  SCRCRDMERMEI  PERF_MERC_60D6M_EVER
    0      891026  202203         0        0        1             0            0                1           267           256            58           185             5                   MAU
    1      982383  202203         0        0        0             0            0                0           283           256           460           619            30                   BOM
    2      176129  202203         0        0        0             0            0                0           283           256           460           712            34                   BOM
    3      566081  202203         0        0        0             0            0                0           283           256           460           781            30                   BOM
    4      760613  202203         0        0        0             0            0                0           283           256           460           712            34                   BOM
  ...         ...     ...       ...      ...      ...           ...          ...              ...           ...           ...           ...           ...           ...                   ...
35734      776859  202203         0        0        0             4          911                1           484           767           513           969           902                   MAU
35735       94325  202203         0        0        0             4          911                1           484           767           513           969           902                   MAU
35736      315930  202203         0        0        0             4          911                1           484           767           513           969           902                   MAU
35737      668323  202203         0        0        0             4          911                1           484           767           513           969           902                   MAU
35738      140483  202203         0        0        0             2         5285                1           487           767           531           956           131                   BOM

35739 rows × 14 columns

df = remove_restritivos(df)
df = target_mapping(df, target='PERF_MERC_60D6M_EVER', map_dict={'BOM': 0, 'MAU': 1})
df.pipe(evaluate_bvs_scores)
          score       auc     ks
0  SCRCRDMERPJ5  0.694770  30.86
1  SCRCRDATACAD  0.666147  28.28
2  SCRCRDMERMEI  0.610453  19.22
3  SCRCRDMERPJ3  0.598126  16.93
4  SCRCRDMERPJ4  0.574876  13.56
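
For context, the AUC and KS of a single score against a 0/1 target can be computed as in the sketch below. This is not bvslusa's implementation: `auc_and_ks` is a hypothetical helper, and it assumes two conventions, namely that KS is reported in percent (as in the table above) and that AUC is made direction-free by taking `max(auc, 1 - auc)`, since credit scores typically rise as risk falls.

```python
import numpy as np


def auc_and_ks(scores, target):
    """Hypothetical helper: return (AUC, KS in percent) for one score column.

    `target` is expected to hold 0 (good) and 1 (bad), as after target_mapping.
    """
    scores = np.asarray(scores, dtype=float)
    target = np.asarray(target)
    good = np.sort(scores[target == 0])
    bad = np.sort(scores[target == 1])

    # KS: largest gap between the two empirical CDFs over all observed scores.
    grid = np.unique(scores)
    cdf_good = np.searchsorted(good, grid, side="right") / good.size
    cdf_bad = np.searchsorted(bad, grid, side="right") / bad.size
    ks = 100.0 * np.max(np.abs(cdf_good - cdf_bad))

    # AUC: probability that a random good outscores a random bad (ties count half).
    below = np.searchsorted(bad, good, side="left")
    ties = np.searchsorted(bad, good, side="right") - below
    auc = (below.sum() + 0.5 * ties.sum()) / (good.size * bad.size)

    # Assumption: report AUC regardless of whether high scores mean good or bad.
    return max(auc, 1.0 - auc), ks
```

With the DataFrame above, a call would look like `auc_and_ks(df['SCRCRDMERPJ5'], df['PERF_MERC_60D6M_EVER'])`, once the target has been mapped to 0/1.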

SCRCRDMERPJ5 has the highest AUC (0.695) and KS (30.86) of the five BVS scores, so let's build the ratings with it:

df_ratings = get_ratings(df, target='PERF_MERC_60D6M_EVER', score='SCRCRDMERPJ5')
                     Bin  Count  Count (%)  Non-event  Event  Event rate        WoE        IV            JS
     0    (-inf, 294.50)   3341   0.142900       2709    632    0.189165  -0.892188  0.163145  1.974255e-02
     1  [294.50, 357.50)   1336   0.057143       1107    229    0.171407  -0.771947  0.046611  5.685828e-03
     2  [357.50, 405.50)   1380   0.059025       1184    196    0.142029  -0.549094  0.022291  2.751878e-03
     3  [405.50, 457.50)   1469   0.062831       1299    170    0.115725  -0.314082  0.007055  8.782459e-04
     4  [457.50, 491.50)   1406   0.060137       1275    131    0.093172  -0.072129  0.000322  4.028209e-05
     5  [491.50, 535.50)   1547   0.066168       1413    134    0.086619   0.007997  0.000004  5.272191e-07
     6  [535.50, 570.50)   1189   0.050855       1095     94    0.079058   0.107581  0.000563  7.034663e-05
     7  [570.50, 597.50)   1276   0.054577       1200     76    0.059561    0.41171  0.007813  9.698149e-04
     8  [597.50, 629.50)   1345   0.057528       1269     76    0.056506   0.467618  0.010386  1.286575e-03
     9  [629.50, 682.50)   2340   0.100086       2223    117    0.050000   0.596806  0.027941  3.441700e-03
    10  [682.50, 748.50)   3183   0.136142       3068    115    0.036129   0.936216  0.081821  9.869692e-03
    11     [748.50, inf)   3568   0.152609       3498     70    0.019619   1.563818  0.202677  2.303251e-02
    12           Special      0   0.000000          0      0    0.000000        0.0  0.000000  0.000000e+00
    13           Missing      0   0.000000          0      0    0.000000        0.0  0.000000  0.000000e+00
Totals                    23380   1.000000      21340   2040    0.087254             0.570628  6.776995e-02
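
The WoE and IV columns can be reproduced from the Non-event and Event counts. The sketch below assumes the common convention WoE = ln(share of non-events / share of events) and IV = (share of non-events - share of events) × WoE, which agrees with the values shown in the table. Whether `get_ratings` computes them exactly this way internally is an assumption, and `woe_iv` is a hypothetical helper, not part of bvslusa.

```python
import math

# (non-event, event) counts for bins 0 to 11, copied from the table above.
# The empty Special and Missing bins are left out to avoid dividing by zero.
counts = [
    (2709, 632), (1107, 229), (1184, 196), (1299, 170),
    (1275, 131), (1413, 134), (1095, 94), (1200, 76),
    (1269, 76), (2223, 117), (3068, 115), (3498, 70),
]


def woe_iv(counts):
    """Hypothetical helper: return per-bin (WoE, IV) pairs and the total IV."""
    total_non_event = sum(non_event for non_event, _ in counts)
    total_event = sum(event for _, event in counts)
    rows = []
    for non_event, event in counts:
        p_non_event = non_event / total_non_event  # share of all non-events in this bin
        p_event = event / total_event              # share of all events in this bin
        woe = math.log(p_non_event / p_event)
        iv = (p_non_event - p_event) * woe
        rows.append((woe, iv))
    return rows, sum(iv for _, iv in rows)


rows, total_iv = woe_iv(counts)
print(f"bin 0: WoE={rows[0][0]:.6f}, IV={rows[0][1]:.6f}")
print(f"total IV={total_iv:.6f}")
```

Under this convention a positive WoE marks a bin with a lower event rate than the portfolio average, which is why WoE rises with the score in the table.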
