A real tree based ML package
Project description
mltree
This package evovled from the attempt to make right kind of Decision Tress which was ideated by many people like Hastie, Tibshirani, Friedman, Quilan, Loh, Chaudhari.
Install
pip install mltreelib
How to use
Create a sample data
import numpy as np
import pandas as pd
from mltreelib.data import Dataset
from mltreelib.tree import Tree
n_size = 1000
rnd = np.random.RandomState(1234)
dummy_data = pd.DataFrame({'numericfull':rnd.randint(1,500,size=n_size),
'unitint':rnd.randint(1,25,size=n_size),
'floatfull':rnd.random_sample(size=n_size),
'floatsmall':np.round(rnd.random_sample(size=n_size)+rnd.randint(1,25,size=n_size),2),
'categoryobj':rnd.choice(['a','b','c','d'],size=n_size),
'stringobj':rnd.choice(["{:c}".format(k) for k in range(97, 123)],size=n_size)})
dummy_data.head()
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
numericfull | unitint | floatfull | floatsmall | categoryobj | stringobj | |
---|---|---|---|---|---|---|
0 | 304 | 18 | 0.908959 | 8.56 | a | c |
1 | 212 | 24 | 0.348582 | 14.35 | a | g |
2 | 295 | 15 | 0.392977 | 21.98 | a | y |
3 | 54 | 20 | 0.720856 | 5.33 | a | q |
4 | 205 | 21 | 0.897588 | 23.03 | c | k |
Create a Dataset
dataset = Dataset(df=dummy_data)
print(dataset)
print('Pandas Data Frame : ',np.round(dummy_data.memory_usage(deep=True).sum()*1e-6,2),'MB')
print('Dataset Structured Array : ',np.round(dataset.data.nbytes*1e-6/ 1024 * 1024,2),'MB')
dataset.data[:5]
Dataset(df=Shape((1000, 6), reduce_datatype=True, encode_category=None, add_intercept=False, na_treatment=allow, copy=False, digits=None, n_category=None, split_ratio=None)
Pandas Data Frame : 0.15 MB
Dataset Structured Array : 0.03 MB
array([(304, 18, 0.9089594 , 8.56, 'a', 'c'),
(212, 24, 0.34858167, 14.35, 'a', 'g'),
(295, 15, 0.39297667, 21.98, 'a', 'y'),
( 54, 20, 0.7208556 , 5.33, 'a', 'q'),
(205, 21, 0.89758754, 23.03, 'c', 'k')],
dtype=[('numericfull', '<u2'), ('unitint', 'u1'), ('floatfull', '<f4'), ('floatsmall', '<f4'), ('categoryobj', 'O'), ('stringobj', 'O')])
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
mltreelib-0.0.2.tar.gz
(17.1 kB
view hashes)
Built Distribution
mltreelib-0.0.2-py3-none-any.whl
(16.6 kB
view hashes)
Close
Hashes for mltreelib-0.0.2-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 742c63f5f3ddcad02af85d37ba70261a2cd47f4af6c0bfebdcefc8cdfd941ef1 |
|
MD5 | f78e291806f4305429f0f5d8ee20c889 |
|
BLAKE2b-256 | 9fc7eeb2918a76a99085713af8952bdbe58711c3a6afa29dcecaef5f6a0aa99f |