A machine learning package for tree-based models only
Project description
mltree
Install
pip install mltree
How to use
First, load the analytical base table:
from mltree.train import train_tree_models
import pandas as pd
from pathlib import Path
path = Path('..')
datasets_path = path/'datasets'
df = pd.read_csv(datasets_path/'churn_abt.csv')
df.head()
| | data_ref_safra | seller_id | uf | tot_orders_12m | tot_items_12m | tot_items_dist_12m | receita_12m | recencia | nao_revendeu_next_6m |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2018-01-01 | 0015a82c2db000af6aaaf3ae2ecb0532 | SP | 3 | 3 | 1 | 2685.00 | 74 | 1 |
| 1 | 2018-01-01 | 001cca7ae9ae17fb1caed9dfb1094831 | ES | 171 | 207 | 9 | 21275.23 | 2 | 0 |
| 2 | 2018-01-01 | 002100f778ceb8431b7a1020ff7ab48f | SP | 38 | 42 | 15 | 781.80 | 2 | 0 |
| 3 | 2018-01-01 | 003554e2dce176b5555353e4f3555ac8 | GO | 1 | 1 | 1 | 120.00 | 16 | 1 |
| 4 | 2018-01-01 | 004c9cd9d87a3c30c522c48c4fc07416 | SP | 130 | 141 | 75 | 16228.88 | 8 | 0 |
Split the data into a training set and an out-of-time (OOT) test set:
df_train = df.query('data_ref_safra < "2018-03-01"')
df_oot = df.query('data_ref_safra == "2018-03-01"')
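The split above compares `data_ref_safra` as a string. That is safe as long as the column holds ISO-formatted dates (`YYYY-MM-DD`), because such strings sort in chronological order. A minimal sketch on a toy frame (the rows below are invented for illustration, not taken from `churn_abt.csv`) shows the two partitions and checks that they share no reference month:

```python
import pandas as pd

# Toy data standing in for the analytical base table (values are made up).
toy = pd.DataFrame({
    "data_ref_safra": ["2018-01-01", "2018-02-01", "2018-03-01", "2018-03-01"],
    "seller_id": ["a", "b", "a", "c"],
})

# Same query expressions as in the README, applied to the toy frame.
toy_train = toy.query('data_ref_safra < "2018-03-01"')
toy_oot = toy.query('data_ref_safra == "2018-03-01"')

# Reference months that appear in both partitions (should be empty).
overlap = set(toy_train["data_ref_safra"]) & set(toy_oot["data_ref_safra"])
print(len(toy_train), len(toy_oot), overlap)
```

Note that a seller may still appear in both partitions; only the reference months are disjoint.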
Define the key columns and the target, then derive the numerical and categorical feature lists:
key_vars = ['data_ref_safra', 'seller_id']
target = 'nao_revendeu_next_6m'
num_vars = [var for var in df.select_dtypes(include='number').columns.tolist() if var != target]
cat_vars = [var for var in df.select_dtypes(exclude='number').columns.tolist() if var not in key_vars]
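To see what these two list comprehensions produce, here is a small sketch on a toy frame with a subset of the columns shown earlier (the toy names `toy`, `toy_num` and `toy_cat` are introduced only for this illustration). It relies on pandas' `select_dtypes`, which treats integer and float columns as `'number'`:

```python
import pandas as pd

# Toy frame with a few of the ABT columns.
toy = pd.DataFrame({
    "data_ref_safra": ["2018-01-01", "2018-01-01"],
    "seller_id": ["a", "b"],
    "uf": ["SP", "ES"],
    "tot_orders_12m": [3, 171],
    "receita_12m": [2685.00, 21275.23],
    "nao_revendeu_next_6m": [1, 0],
})
key_vars = ["data_ref_safra", "seller_id"]
target = "nao_revendeu_next_6m"

# Numeric columns, minus the target (which is numeric itself).
toy_num = [c for c in toy.select_dtypes(include="number").columns if c != target]
# Non-numeric columns, minus the identifier/date keys.
toy_cat = [c for c in toy.select_dtypes(exclude="number").columns if c not in key_vars]

print(toy_num)
print(toy_cat)
```

The target has to be excluded from the numeric list explicitly because it is stored as an integer, while the key columns only need excluding from the categorical list because they are stored as strings.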
Train the tree-based models:
train_tree_models(df_train, df_oot, target=target, folds=5, cat_features=cat_vars, num_features=num_vars, seed=42)
{'dt': {'auc': {'train': 0.9139680595991275, 'test': 0.8968114296299949}},
'rf': {'auc': {'train': 0.9072972070544887, 'test': 0.8964968670043654}}}
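The returned value is a nested dictionary of AUC scores per model. The sketch below uses only plain Python on the output shown above to compute the train/test gap for each model and pick the one with the highest test AUC. It assumes that `'dt'` and `'rf'` stand for a decision tree and a random forest, and that `'test'` refers to the second dataset passed in (`df_oot`); neither is stated explicitly here.

```python
# Output copied from the call above.
results = {
    "dt": {"auc": {"train": 0.9139680595991275, "test": 0.8968114296299949}},
    "rf": {"auc": {"train": 0.9072972070544887, "test": 0.8964968670043654}},
}

# Train-minus-test AUC: a larger gap suggests more overfitting.
gaps = {name: m["auc"]["train"] - m["auc"]["test"] for name, m in results.items()}

# Model key with the highest test AUC.
best = max(results, key=lambda name: results[name]["auc"]["test"])

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.4f}")
print("best on test:", best)
```

On these numbers both models score almost the same on the test set (about 0.897), and the `'rf'` entry shows the smaller train/test gap.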
Download files
Source Distribution
mltree-0.0.2.tar.gz (9.0 kB)
Built Distribution
mltree-0.0.2-py3-none-any.whl (8.6 kB)