Visual Decision Tree Based on Categorical Attributes Package
Project description
Visual Decision Tree Based on Categorical Attributes
As you may know, the scikit-learn library in Python cannot build a decision tree directly from categorical data: you have to convert categorical attributes to numerical values before passing them to the classifier. Also, the resulting decision tree is always a binary tree, even though a decision tree does not need to be binary.
Here, we provide a library that builds a visual decision tree directly from categorical data. You can read more about decision trees here.
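To make the first limitation concrete, scikit-learn needs categorical columns encoded as numbers before it can train a tree. A minimal sketch using pandas one-hot encoding (the column names and values here are hypothetical, not from this package):

```python
import pandas as pd

# A tiny, hypothetical table of purely categorical attributes
df = pd.DataFrame({
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Windy": ["False", "True", "False"],
})

# scikit-learn's tree classifiers require numeric input, so each
# categorical column must be expanded into indicator columns first:
encoded = pd.get_dummies(df)
print(list(encoded.columns))
# → ['Outlook_Overcast', 'Outlook_Rain', 'Outlook_Sunny', 'Windy_False', 'Windy_True']
```

This encoding step (and the resulting binary-only splits) is exactly what this package avoids by consuming categorical values directly.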
Features
The main algorithm used is ID3, with the following features:
- Information gain based on entropy
- Information gain based on the Gini index
- Some pruning capabilities, such as:
  - Minimum number of samples
  - Minimum information gain
- The resulting tree is not necessarily binary
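The two impurity measures named above can be sketched in a few lines of plain Python. These helper functions are illustrative only and are not this package's internal API:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a list of class labels: -sum(p * log2(p))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p^2)."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

# The classic PlayTennis label column: 9 "Yes" and 5 "No"
labels = ["Yes"] * 9 + ["No"] * 5
print(f"entropy: {entropy(labels):.3f}")  # entropy: 0.940
print(f"gini:    {gini(labels):.3f}")     # gini:    0.459
```

Both measures are 0 for a pure node and maximal when the classes are evenly mixed; the criterion passed to the constructor decides which one drives the information-gain computation.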
Requirements
All requirements are listed in the "requirements.txt" file and can be installed with the following command:
- pip install -r requirements.txt
Also, to see the visual tree, you need to install the Graphviz package. Here you can find the right package for your operating system.
Usage
from p_decision_tree.DecisionTree import DecisionTree
import pandas as pd
#Reading CSV file as data set by Pandas
data = pd.read_csv('playtennis.csv')
columns = data.columns
#All columns except the last one are descriptive by default
descriptive_features = columns[:-1]
#The last column is considered as label
label = columns[-1]
#Converting all the columns to string
for column in columns:
    data[column] = data[column].astype(str)
data_descriptive = data[descriptive_features].values
data_label = data[label].values
#Calling DecisionTree constructor (the last parameter is criterion which can also be "gini")
decisionTree = DecisionTree(data_descriptive.tolist(), descriptive_features.tolist(), data_label.tolist(), "entropy")
#Here you can pass pruning features (gain_threshold and minimum_samples)
decisionTree.id3(0, 0)
#Visualizing decision tree by Graphviz
dot = decisionTree.print_visualTree(render=True)
# When using Jupyter
#display( dot )
print("System entropy:", decisionTree.entropy)
print("System gini:", decisionTree.gini)
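For intuition about how ID3 chooses where to split, the sketch below computes the information gain of the Outlook attribute on the classic PlayTennis data set. The helper functions are illustrative and not part of this package's API:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a list of class labels: -sum(p * log2(p))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(attribute_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting."""
    n = len(labels)
    gain = entropy(labels)
    for value in set(attribute_values):
        subset = [l for v, l in zip(attribute_values, labels) if v == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

# PlayTennis: the Outlook column paired with the Yes/No label
outlook = ["Sunny"] * 5 + ["Overcast"] * 4 + ["Rain"] * 5
label = (["No", "No", "No", "Yes", "Yes"]      # Sunny: 2 Yes / 3 No
         + ["Yes"] * 4                          # Overcast: 4 Yes / 0 No
         + ["Yes", "Yes", "Yes", "No", "No"])   # Rain: 3 Yes / 2 No
print(f"{information_gain(outlook, label):.3f}")  # 0.247
```

ID3 splits on the attribute with the highest information gain; at roughly 0.247 bits, Outlook beats the other PlayTennis attributes and becomes the root of the tree.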
File details
Details for the file p_decision_tree-0.0.3.tar.gz.
File metadata
- Download URL: p_decision_tree-0.0.3.tar.gz
- Upload date:
- Size: 4.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.4.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.6.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9e22596de4417b46ab6385d53a2f436821c7c85e0ea2d1fe036a96c45b7c2a7b
MD5 | e5d85eb052c81faf5d1b00e3f9a570d5
BLAKE2b-256 | 13e080db41242bf0b5dc18622a84fdc9f15fd3fe86e4c78853fc5741d670e864
File details
Details for the file p_decision_tree-0.0.3-py3-none-any.whl.
File metadata
- Download URL: p_decision_tree-0.0.3-py3-none-any.whl
- Upload date:
- Size: 5.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.4.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.6.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 13b3cfa105a1fa42e7aff50f13f934408733ac680f8c31603c3fd2b2b04e390b
MD5 | 81b83f8ed8b7df29d12c14121ffe79ff
BLAKE2b-256 | 95ef12d90f3484192b9202c69946eb1de03cc4e75e43be4ceaec33280b9baf5b