
Visual Decision Tree Based on Categorical Attributes Package

As you may know, the "scikit-learn" library in Python cannot build a decision tree directly from categorical data: you have to convert categorical features to numerical ones before passing them to the classifier. In addition, the resulting decision tree is always binary, although a decision tree does not need to be binary.
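To make the conversion step concrete, here is a minimal pure-Python sketch of one-hot encoding, one common way to turn categorical columns into numerical ones. The helper name `one_hot_encode` and the sample rows are hypothetical and are not part of this package or of scikit-learn.

```python
def one_hot_encode(rows, feature_names):
    """Turn rows of categorical strings into rows of 0/1 integers."""
    # Distinct values of each column, in sorted order.
    categories = [sorted({row[i] for row in rows}) for i in range(len(feature_names))]
    # One output column per (feature, value) pair.
    encoded_names = [
        f"{name}={value}"
        for name, values in zip(feature_names, categories)
        for value in values
    ]
    encoded_rows = [
        [int(row[i] == value) for i, values in enumerate(categories) for value in values]
        for row in rows
    ]
    return encoded_names, encoded_rows


# Hypothetical sample data with two categorical features.
rows = [["Sunny", "Weak"], ["Rain", "Strong"], ["Overcast", "Weak"]]
names, encoded = one_hot_encode(rows, ["Outlook", "Wind"])
print(names)
print(encoded[0])
```

Each categorical column becomes several 0/1 columns, which is the extra step this package is meant to make unnecessary.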

Here, we provide a library that builds a visual decision tree directly from categorical data. You can read more about decision trees here.

Features


The main algorithm used is ID3, with the following features:

  • Information gain based on entropy
  • Information gain based on Gini impurity
  • Some pruning capabilities:
    • Minimum number of samples
    • Minimum information gain
  • The resulting tree is not restricted to binary splits
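The two criteria listed above can be sketched as follows. This is a minimal illustration using the textbook definitions of entropy, Gini impurity, and information gain; the function names are hypothetical and this is not the package's internal code.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(values, labels, impurity=entropy):
    """Impurity of the labels minus the weighted impurity after a multiway
    split on one categorical feature (one branch per distinct value)."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / n * impurity(group) for group in groups.values())
    return impurity(labels) - remainder


# Hypothetical toy data: the feature separates the two classes perfectly.
feature = ["a", "a", "b", "b"]
labels = ["yes", "yes", "no", "no"]
print(entropy(labels), gini(labels))
print(information_gain(feature, labels), information_gain(feature, labels, impurity=gini))
```

Because the split creates one branch per distinct feature value, the same gain computation works for features with any number of categories.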

Requirements


You can find all the requirements in the "requirements.txt" file, and they can be installed with the following command:

  • pip install -r requirements.txt

To view the visual tree, you also need to install the Graphviz package. Here you can find the right package for your operating system.
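A quick way to check the Graphviz installation before rendering is sketched below. It assumes Graphviz exposes its usual `dot` executable on the PATH; the helper name `graphviz_available` is hypothetical.

```python
import shutil


def graphviz_available():
    """Return True if the Graphviz `dot` executable can be found on PATH."""
    return shutil.which("dot") is not None


if not graphviz_available():
    print("Graphviz 'dot' was not found on PATH; install it before rendering the tree.")
```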

Usage


from p_decision_tree.DecisionTree import DecisionTree
import pandas as pd

#Reading CSV file as data set by Pandas
data = pd.read_csv('playtennis.csv')
columns = data.columns

#All columns except the last one are descriptive by default
descriptive_features = columns[:-1]
#The last column is considered as label
label = columns[-1]

#Converting all the columns to string so that every value is treated as a category
for column in columns:
    data[column] = data[column].astype(str)

data_descriptive = data[descriptive_features].values
data_label = data[label].values

#Calling DecisionTree constructor (the last parameter is criterion which can also be "gini")
decisionTree = DecisionTree(data_descriptive.tolist(), descriptive_features.tolist(), data_label.tolist(), "entropy")

#Building the tree; the two arguments are the pruning parameters (gain_threshold and minimum_samples)
decisionTree.id3(0, 0)

#Visualizing decision tree by Graphviz
dot = decisionTree.print_visualTree( render=True )

# When using Jupyter
#display( dot )

print("System entropy:", decisionTree.entropy)
print("System gini:", decisionTree.gini)
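To illustrate what the two pruning parameters do, here is a minimal sketch of a multiway ID3 builder with the same two stopping rules. It is an assumption-laden illustration and not the package's implementation: the names `build_tree` and `split_by`, the nested-dict tree format, and the exact comparison rules are hypothetical; only the parameter names `gain_threshold` and `minimum_samples` come from the usage example above.

```python
from collections import Counter
from math import log2


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_by(rows, labels, index):
    """Group rows and labels by the value found in column `index`."""
    groups = {}
    for row, label in zip(rows, labels):
        group_rows, group_labels = groups.setdefault(row[index], ([], []))
        group_rows.append(row)
        group_labels.append(label)
    return groups


def build_tree(rows, labels, features, gain_threshold=0.0, minimum_samples=0):
    """Return a class label (leaf) or a dict with one child per feature value.

    `features` is a list of column indices that may still be used for splitting.
    """
    majority = Counter(labels).most_common(1)[0][0]
    # Stop on a pure node, when no features are left, or when the node
    # holds fewer samples than `minimum_samples` (assumed pruning rule).
    if len(set(labels)) == 1 or not features or len(labels) < minimum_samples:
        return majority

    def gain(index):
        groups = split_by(rows, labels, index)
        remainder = sum(len(ys) / len(labels) * entropy(ys) for _, ys in groups.values())
        return entropy(labels) - remainder

    best = max(features, key=gain)
    # Stop when the best split gains less than `gain_threshold` (assumed pruning rule).
    if gain(best) < gain_threshold:
        return majority

    remaining = [f for f in features if f != best]
    return {
        "feature": best,
        "children": {
            value: build_tree(g_rows, g_labels, remaining, gain_threshold, minimum_samples)
            for value, (g_rows, g_labels) in split_by(rows, labels, best).items()
        },
    }


# Hypothetical toy data: column 0 decides the label, column 1 is irrelevant.
rows = [["Sunny", "Weak"], ["Sunny", "Strong"], ["Rain", "Weak"], ["Rain", "Strong"]]
labels = ["No", "No", "Yes", "Yes"]
tree = build_tree(rows, labels, features=[0, 1])
print(tree)
```

Raising either parameter makes the tree smaller: nodes that are too small or splits that gain too little are replaced by a leaf holding the majority label.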

Download files

Source distribution: p_decision_tree-0.0.3.tar.gz (4.6 kB)
Built distribution: p_decision_tree-0.0.3-py3-none-any.whl (5.0 kB, Python 3)
