
Visual Decision Tree Based on Categorical Attributes Package


As you may know, the "scikit-learn" library in Python cannot build a decision tree directly from categorical data: you have to convert categorical data to numerical values before passing them to the classifier. In addition, the resulting decision tree is always binary, although a decision tree does not need to be binary.

Here, we provide a library that builds a visual decision tree directly from categorical data. You can read more about decision trees here.

Features


The main algorithm used is ID3, with the following features:

  • Information gain based on entropy
  • Information gain based on Gini impurity
  • Some pruning capabilities:
    • Minimum number of samples
    • Minimum information gain
  • The resulting tree is not restricted to binary splits
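
The features above can be illustrated with a minimal, standard-library sketch of ID3. This is not the package's own implementation: the function names, the nested-dict tree format, the toy data, and the exact stopping rules (stop when the best gain is not above `gain_threshold`, or when a node has fewer than `minimum_samples` rows) are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def partition(rows, labels, index):
    """Group rows and labels by the categorical value in column `index`."""
    groups = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[index], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    return groups


def gain(rows, labels, index, impurity):
    """Parent impurity minus the size-weighted impurity of the children."""
    total = len(labels)
    remainder = sum(
        len(sub_labels) / total * impurity(sub_labels)
        for _, sub_labels in partition(rows, labels, index).values()
    )
    return impurity(labels) - remainder


def build_tree(rows, labels, names, impurity=entropy,
               gain_threshold=0.0, minimum_samples=0, remaining=None):
    """Return a leaf label or a nested dict {feature: {value: subtree}}."""
    remaining = list(range(len(names))) if remaining is None else remaining
    majority = Counter(labels).most_common(1)[0][0]
    # Stop on pure nodes, exhausted features, or too few samples (assumed rule).
    if len(set(labels)) == 1 or not remaining or len(labels) < minimum_samples:
        return majority
    best = max(remaining, key=lambda i: gain(rows, labels, i, impurity))
    # Stop when the best split does not gain more than the threshold (assumed rule).
    if gain(rows, labels, best, impurity) <= gain_threshold:
        return majority
    rest = [i for i in remaining if i != best]
    # One branch per distinct value, so the tree is not limited to binary splits.
    return {
        names[best]: {
            value: build_tree(sub_rows, sub_labels, names, impurity,
                              gain_threshold, minimum_samples, rest)
            for value, (sub_rows, sub_labels) in partition(rows, labels, best).items()
        }
    }


# Hypothetical toy data, invented for this example.
names = ["outlook", "wind"]
rows = [["sunny", "weak"], ["sunny", "strong"], ["overcast", "weak"],
        ["overcast", "strong"], ["rain", "weak"], ["rain", "strong"]]
labels = ["no", "no", "yes", "yes", "yes", "no"]

tree = build_tree(rows, labels, names, impurity=entropy)
print(tree)
```

In this toy data, "outlook" has three distinct values, so the root gets three branches rather than two. Passing `impurity=gini` switches the criterion without changing the rest of the procedure.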

Requirements


You can find all the requirements in the "requirements.txt" file, and install them with the following command:

  • pip install -r requirements.txt

Also, to see the visual tree, you need to install the Graphviz package. Here you can find the right package for your operating system.
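
Before rendering a tree, it can help to confirm that Graphviz is actually installed. The sketch below assumes that rendering relies on the Graphviz `dot` executable being on the system PATH; the helper name `graphviz_available` is hypothetical and not part of this package.

```python
import shutil


def graphviz_available(executable="dot"):
    """Return True if the given Graphviz executable can be found on PATH."""
    return shutil.which(executable) is not None


if not graphviz_available():
    print("Graphviz 'dot' was not found on PATH; install Graphviz to render the tree.")
```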

Usage


from p_decision_tree.DecisionTree import DecisionTree
import pandas as pd

# Read the CSV file as a data set with pandas
data = pd.read_csv('playtennis.csv')
columns = data.columns

# All columns except the last one are descriptive features by default
descriptive_features = columns[:-1]
# The last column is treated as the label
label = columns[-1]

# Convert every column to string so that all values are handled as categories
for column in columns:
    data[column] = data[column].astype(str)

data_descriptive = data[descriptive_features].values
data_label = data[label].values

# Call the DecisionTree constructor (the last parameter is the criterion, which can also be "gini")
decisionTree = DecisionTree(data_descriptive.tolist(), descriptive_features.tolist(), data_label.tolist(), "entropy")

# Build the tree; the arguments are the pruning parameters (gain_threshold and minimum_samples)
decisionTree.id3(0, 0)

# Visualize the decision tree with Graphviz
dot = decisionTree.print_visualTree(render=True)

# When using Jupyter
# display(dot)

print("System entropy:", decisionTree.entropy)
print("System gini:", decisionTree.gini)
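
The usage above converts everything to plain Python lists before calling the constructor, so the same three inputs can presumably be prepared without pandas. The following is a minimal sketch using only the standard library; the helper `load_categorical_csv` and the in-memory sample data are hypothetical and not part of this package.

```python
import csv
import io


def load_categorical_csv(handle):
    """Split a CSV with a header row into (rows, feature_names, labels).

    The last column is treated as the label; all values stay strings.
    """
    reader = csv.reader(handle)
    header = next(reader)
    records = [record for record in reader if record]  # skip blank lines
    feature_names = header[:-1]
    rows = [record[:-1] for record in records]
    labels = [record[-1] for record in records]
    return rows, feature_names, labels


# Hypothetical in-memory sample standing in for a real file.
sample = io.StringIO("outlook,wind,play\nsunny,weak,no\novercast,strong,yes\n")
rows, feature_names, labels = load_categorical_csv(sample)
print(rows, feature_names, labels)
```

For a file on disk, open it with `open('playtennis.csv', newline='')` and pass the handle to the helper. The resulting lists correspond to the first three constructor arguments in the usage above.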
