
A machine learning package for tree-based models only


mltree

Install

pip install mltree

How to use

First, load the analytical base table:

from mltree.train import train_tree_models
import pandas as pd
from pathlib import Path

path = Path('..')
datasets_path = path/'datasets'

df = pd.read_csv(datasets_path/'churn_abt.csv')
df.head()
   data_ref_safra  seller_id                         uf  tot_orders_12m  tot_items_12m  tot_items_dist_12m  receita_12m  recencia  nao_revendeu_next_6m
0  2018-01-01      0015a82c2db000af6aaaf3ae2ecb0532  SP               3              3                   1      2685.00        74                     1
1  2018-01-01      001cca7ae9ae17fb1caed9dfb1094831  ES             171            207                   9     21275.23         2                     0
2  2018-01-01      002100f778ceb8431b7a1020ff7ab48f  SP              38             42                  15       781.80         2                     0
3  2018-01-01      003554e2dce176b5555353e4f3555ac8  GO               1              1                   1       120.00        16                     1
4  2018-01-01      004c9cd9d87a3c30c522c48c4fc07416  SP             130            141                  75     16228.88         8                     0

Split the data into a training set and an out-of-time (OOT) test set:

df_train = df.query('data_ref_safra < "2018-03-01"')
df_oot = df.query('data_ref_safra == "2018-03-01"')
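
The split above compares dates as strings. A minimal sketch on a toy frame shows why that works: ISO-formatted dates (`YYYY-MM-DD`) sort lexicographically in chronological order, so the comparison is safe as long as the column keeps that format. The rows and the `toy` names are invented for illustration; only the column names come from the table above.

```python
import pandas as pd

# Toy data, invented for illustration; only the column names match the ABT.
toy = pd.DataFrame({
    "data_ref_safra": ["2018-01-01", "2018-02-01", "2018-03-01", "2018-03-01"],
    "recencia": [74, 2, 16, 8],
})

# Same filters as above: rows before the cutoff are training data,
# the cutoff month itself is held out as the out-of-time set.
toy_train = toy.query('data_ref_safra < "2018-03-01"')
toy_oot = toy.query('data_ref_safra == "2018-03-01"')

# The two sets must not share any reference month.
overlap = set(toy_train["data_ref_safra"]) & set(toy_oot["data_ref_safra"])
print(len(toy_train), len(toy_oot), overlap)
```

Checking that the overlap is empty is a cheap guard against leaking the held-out month into training.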

Define the key columns, the target, and the numeric and categorical feature lists:

key_vars = ['data_ref_safra', 'seller_id']
target = 'nao_revendeu_next_6m'
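# Numeric features: every numeric column except the target
num_vars = [var for var in df.select_dtypes(include='number').columns.tolist() if var != target]
# Categorical features: every non-numeric column except the key columns
cat_vars = [var for var in df.select_dtypes(exclude='number').columns.tolist() if var not in key_vars]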
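
As a rough illustration of what these two lists end up containing, the sketch below applies the same `select_dtypes` logic to a two-row toy frame. The values and the shortened `seller_id` strings are made up, and only a subset of the ABT columns is included.

```python
import pandas as pd

# Two invented rows with a subset of the ABT columns.
toy = pd.DataFrame({
    "data_ref_safra": ["2018-01-01", "2018-01-01"],
    "seller_id": ["a1", "b2"],
    "uf": ["SP", "ES"],
    "tot_orders_12m": [3, 171],
    "receita_12m": [2685.00, 21275.23],
    "nao_revendeu_next_6m": [1, 0],
})

key_vars = ["data_ref_safra", "seller_id"]
target = "nao_revendeu_next_6m"

# Numeric columns minus the target; non-numeric columns minus the keys.
num_vars = [c for c in toy.select_dtypes(include="number").columns if c != target]
cat_vars = [c for c in toy.select_dtypes(exclude="number").columns if c not in key_vars]

print(num_vars)  # ['tot_orders_12m', 'receita_12m']
print(cat_vars)  # ['uf']
```

Because `data_ref_safra` is read from CSV as text, it counts as non-numeric and has to be excluded through `key_vars`; otherwise it would be treated as a categorical feature.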

Train the tree-based models:

train_tree_models(df_train, df_oot, target=target, folds=5, cat_features=cat_vars, num_features=num_vars, seed=42)
{'dt': {'auc': {'train': 0.9139680595991275, 'test': 0.8968114296299949}},
 'rf': {'auc': {'train': 0.9072972070544887, 'test': 0.8964968670043654}}}
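
One way to read this result is sketched below, assuming the dictionary shown above is what `train_tree_models` returns (the README displays it but does not say so explicitly). The `results`, `gaps`, and `best` names are introduced here for illustration.

```python
# Result copied from the output above; assumed to be the function's return value.
results = {
    "dt": {"auc": {"train": 0.9139680595991275, "test": 0.8968114296299949}},
    "rf": {"auc": {"train": 0.9072972070544887, "test": 0.8964968670043654}},
}

# Gap between train and test AUC per model; a large gap hints at overfitting.
gaps = {name: m["auc"]["train"] - m["auc"]["test"] for name, m in results.items()}

# Model with the highest AUC on the out-of-time set.
best = max(results, key=lambda name: results[name]["auc"]["test"])

for name, gap in gaps.items():
    print(f"{name}: train-test AUC gap = {gap:.4f}")
print("best on test:", best)
```

Here the two models score almost the same on the out-of-time set, while `rf` shows the smaller train-test gap, so the choice between them is not settled by test AUC alone.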

