A machine learning package for tree-based models only

Project description

mltree

Install

pip install mltree

How to use

First, load the analytical base table (ABT):

from mltree.train import train_tree_models
import pandas as pd
from pathlib import Path

path = Path('..')
datasets_path = path/'datasets'

df = pd.read_csv(datasets_path/'churn_abt.csv')
df.head()
   data_ref_safra                         seller_id  uf  tot_orders_12m  tot_items_12m  tot_items_dist_12m  receita_12m  recencia  nao_revendeu_next_6m
0      2018-01-01  0015a82c2db000af6aaaf3ae2ecb0532  SP               3              3                   1      2685.00        74                     1
1      2018-01-01  001cca7ae9ae17fb1caed9dfb1094831  ES             171            207                   9     21275.23         2                     0
2      2018-01-01  002100f778ceb8431b7a1020ff7ab48f  SP              38             42                  15       781.80         2                     0
3      2018-01-01  003554e2dce176b5555353e4f3555ac8  GO               1              1                   1       120.00        16                     1
4      2018-01-01  004c9cd9d87a3c30c522c48c4fc07416  SP             130            141                  75     16228.88         8                     0

Split the data into a training set and an out-of-time (OOT) test set:

df_train = df.query('data_ref_safra < "2018-03-01"')
df_oot = df.query('data_ref_safra == "2018-03-01"')
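
The split above compares dates as strings. A minimal sketch of why that works, assuming `data_ref_safra` is read from the CSV as ISO-formatted text (`YYYY-MM-DD`); the `toy` frame and its values are made up for illustration:

```python
import pandas as pd

# Toy stand-in for the ABT: one row per seller per monthly reference date.
toy = pd.DataFrame({
    "data_ref_safra": ["2018-01-01", "2018-02-01", "2018-03-01", "2018-03-01"],
    "seller_id": ["a", "a", "a", "b"],
})

# ISO dates stored as strings sort in chronological order, so a plain
# string comparison inside query() is enough to split by time.
toy_train = toy.query('data_ref_safra < "2018-03-01"')
toy_oot = toy.query('data_ref_safra == "2018-03-01"')

# Sanity checks: every training row is older than every OOT row,
# and together the two sets cover all rows.
assert toy_train["data_ref_safra"].max() < toy_oot["data_ref_safra"].min()
assert len(toy_train) + len(toy_oot) == len(toy)
```

If the column were stored in another date format (for example `DD/MM/YYYY`), string comparison would no longer match chronological order and the column would need to be parsed with `pd.to_datetime` first.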

Define the key columns, the target, and the numeric and categorical feature lists:

key_vars = ['data_ref_safra', 'seller_id']
target = 'nao_revendeu_next_6m'
num_vars = [var for var in df.select_dtypes(include='number').columns.tolist() if var != target]
cat_vars = [var for var in df.select_dtypes(exclude='number').columns.tolist() if var not in key_vars]
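
To see what these lists are likely to contain, here is a small sketch that applies the same selection to the first two rows shown by `df.head()`. It assumes pandas infers the same dtypes for the full CSV (text for `data_ref_safra`, `seller_id` and `uf`, numbers for the rest); `sample` is a hypothetical name used only here:

```python
import pandas as pd

# First two rows of the table above, typed the way read_csv would infer them.
sample = pd.DataFrame({
    "data_ref_safra": ["2018-01-01", "2018-01-01"],
    "seller_id": ["0015a82c2db000af6aaaf3ae2ecb0532",
                  "001cca7ae9ae17fb1caed9dfb1094831"],
    "uf": ["SP", "ES"],
    "tot_orders_12m": [3, 171],
    "tot_items_12m": [3, 207],
    "tot_items_dist_12m": [1, 9],
    "receita_12m": [2685.00, 21275.23],
    "recencia": [74, 2],
    "nao_revendeu_next_6m": [1, 0],
})

key_vars = ["data_ref_safra", "seller_id"]
target = "nao_revendeu_next_6m"

# Numeric columns minus the target become numeric features.
num_vars = [c for c in sample.select_dtypes(include="number").columns if c != target]
# Non-numeric columns minus the key columns become categorical features.
cat_vars = [c for c in sample.select_dtypes(exclude="number").columns if c not in key_vars]

print(num_vars)
print(cat_vars)
```

Under those assumptions, `uf` is the only categorical feature and the five count, revenue and recency columns are the numeric features.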

Train the tree-based models:

train_tree_models(df_train, df_oot, target=target, folds=5, cat_features=cat_vars, num_features=num_vars, seed=42)
{'dt': {'auc': {'train': 0.9139680595991275, 'test': 0.8968114296299949}},
 'rf': {'auc': {'train': 0.9072972070544887, 'test': 0.8964968670043654}}}
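
One way to read this output is to compare the train and test AUC for each model. The sketch below does that on the dictionary printed above; it only assumes the nested `model -> 'auc' -> 'train'/'test'` layout shown there, and the reading of `dt` and `rf` as decision tree and random forest is a guess from the key names:

```python
# Result dictionary copied from the output above.
results = {
    "dt": {"auc": {"train": 0.9139680595991275, "test": 0.8968114296299949}},
    "rf": {"auc": {"train": 0.9072972070544887, "test": 0.8964968670043654}},
}

# Train-minus-test AUC per model; a large positive gap suggests overfitting.
gaps = {name: m["auc"]["train"] - m["auc"]["test"] for name, m in results.items()}

for name, gap in gaps.items():
    print(f"{name}: train-test AUC gap = {gap:.4f}")

# Model with the highest AUC on the out-of-time set.
best = max(results, key=lambda name: results[name]["auc"]["test"])
print("best on test:", best)
```

Here both gaps are below 0.02 and the two test AUCs differ only in the fourth decimal place, so the two models perform about equally well on the out-of-time set.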

Download files

Source Distribution

mltree-0.0.2.tar.gz (9.0 kB)

Built Distribution

mltree-0.0.2-py3-none-any.whl (8.6 kB, Python 3)
