ID3 Decision Tree Algorithm

ID3 is a machine learning decision tree classification algorithm. This package builds the tree using one of two splitting criteria: Information Gain or Gini Index.

  • Version 1.0.0 - Information Gain Only
  • Version 2.0.0 - Gini Index added
  • Version 2.0.1 - Documentation Sorted
  • Version 2.0.2 - All Sorted
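The two criteria can be illustrated with a small, self-contained sketch. This is not the package's own code: the function names below (`entropy`, `gini`, `split_by_feature`, `information_gain`, `gini_index`) are chosen for illustration and use the textbook definitions. Classic ID3 selects the feature with the highest information gain; the Gini index is the criterion usually associated with CART, where the feature with the lowest weighted impurity is preferred.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a sequence of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def split_by_feature(rows, labels, feature_index):
    """Group the labels by the value each row takes for one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    return groups


def information_gain(rows, labels, feature_index):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    groups = split_by_feature(rows, labels, feature_index)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


def gini_index(rows, labels, feature_index):
    """Size-weighted Gini impurity of the child groups (lower is better)."""
    groups = split_by_feature(rows, labels, feature_index)
    return sum(len(g) / len(labels) * gini(g) for g in groups.values())


# Toy data: feature 0 separates the classes perfectly, feature 1 does not.
rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
labels = [1, 1, 0, 0]
print(information_gain(rows, labels, 0), information_gain(rows, labels, 1))
print(gini_index(rows, labels, 0), gini_index(rows, labels, 1))
```

On this toy data both criteria agree that feature 0 is the better split: it has the higher information gain and the lower weighted Gini impurity.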

Installation

Install directly from PyPI:

pip install classic-ID3-DecisionTree

Or clone the repository and install from source:

python3 setup.py install

Parameters

* X_train

The training-set array of feature values.

* y_train

The training-set array of outcomes (class labels).

* dataset

The entire dataset, including the outcome column.

Methods

* DecisionTreeClassifier()

Create an instance of the decision tree classifier class.

* add_features(dataset, result_col_name)

Register the features with the model by passing the dataset; the model reads the feature column names from it. The second parameter is the name of the outcome column.

* information_gain(X_train, y_train)

Build the decision tree using Information Gain.

* gini_index(X_train, y_train)

Build the decision tree using Gini Index.

* predict(X_test)

Predict the outcomes for the test-set feature array X_test.
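To make the build and predict steps concrete, here is a minimal sketch of an ID3-style tree using information gain. It is an independent illustration, not the package's implementation: `best_feature`, `build_tree`, `predict_one`, and the nested-dict tree layout are all hypothetical names and choices made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_feature(rows, labels, candidates):
    """Return the candidate feature index with the highest information gain."""
    def gain(index):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[index], []).append(label)
        remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        return entropy(labels) - remainder
    return max(candidates, key=gain)


def build_tree(rows, labels, candidates):
    """Recursively build a tree of nested dicts; leaves are class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the node is pure or no features are left to split on.
    if len(set(labels)) == 1 or not candidates:
        return majority
    index = best_feature(rows, labels, candidates)
    remaining = [c for c in candidates if c != index]
    node = {"feature": index, "default": majority, "children": {}}
    for value in set(row[index] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[index] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        node["children"][value] = build_tree(sub_rows, sub_labels, remaining)
    return node


def predict_one(tree, row):
    """Walk the tree; fall back to the node's majority class for unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree


# Toy data: the label is 1 when either feature is 1.
rows = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 1, 1, 1]
tree = build_tree(rows, labels, candidates=[0, 1])
print([predict_one(tree, row) for row in rows])
```

Each feature is used at most once along any path from the root, which is the usual ID3 behaviour for categorical features. The `default` entry is one simple way to handle feature values that did not appear during training.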

Documentation

1. Install the package

pip install classic-ID3-DecisionTree

2. Import the library

from classic_ID3_decision_tree import DecisionTreeClassifier

3. Create an object for Decision Tree Classifier class

id3 = DecisionTreeClassifier()

4. Add Column Features to the model

id3.add_features(dataset, result_col_name)

5. Build the Decision Tree Model using Information Gain

id3.information_gain(X_train, y_train)

OR

5. Build the Decision Tree Model using Gini Index

id3.gini_index(X_train, y_train)

6. Predict the Test Set Results

y_pred = id3.predict(X_test)


Example Code

0. Download the dataset

Download the house-votes-84.csv dataset used in the steps below.

1. Import the dataset and Preprocess

import pandas as pd
from sklearn.model_selection import KFold

# Load the data; column 0 ('party') is the outcome.
dataset = pd.read_csv('house-votes-84.csv')

# Encode votes and party as integers. Missing votes ('?') are treated as 'n'.
party = {'republican': 0, 'democrat': 1}
vote = {'y': 1, 'n': 0, '?': 0}
for col in dataset.columns:
    if col != 'party':
        dataset[col] = dataset[col].map(vote)
dataset['party'] = dataset['party'].map(party)

# Columns 1-16 are the features; column 0 is the outcome.
X = dataset.iloc[:, 1:17].values
y = dataset.iloc[:, 0].values

# Split into 5 folds. Each iteration overwrites the previous split,
# so these variables hold the last fold once the loop ends.
kf = KFold(n_splits=5)
for train_index, test_index in kf.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
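Because the loop above reassigns X_train, X_test, y_train, and y_test on every iteration, only the last of the five folds is used afterwards. To score a model on every fold, the training and prediction steps need to run inside the loop. The sketch below shows the idea in plain Python so that it runs without any dependencies: `k_fold_indices` is a simplified stand-in for unshuffled contiguous folds, and `fit_predict` is a hypothetical callback, not part of this package.

```python
def k_fold_indices(n_samples, n_splits=5):
    """Yield (train_indices, test_indices) for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds get one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(X, y, fit_predict, n_splits=5):
    """Return the accuracy obtained on each fold.

    fit_predict(X_train, y_train, X_test) is a hypothetical callback that
    trains a fresh model and returns predictions for X_test.
    """
    scores = []
    for train, test in k_fold_indices(len(X), n_splits):
        X_train, y_train = [X[i] for i in train], [y[i] for i in train]
        X_test, y_test = [X[i] for i in test], [y[i] for i in test]
        y_pred = fit_predict(X_train, y_train, X_test)
        correct = sum(p == t for p, t in zip(y_pred, y_test))
        scores.append(correct / len(y_test))
    return scores


def majority_baseline(X_train, y_train, X_test):
    """Illustrative stand-in model: always predict the most common class."""
    majority = max(set(y_train), key=y_train.count)
    return [majority] * len(X_test)


# Made-up data for illustration only.
X_demo = [[i % 2] for i in range(10)]
y_demo = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
print(cross_validate(X_demo, y_demo, majority_baseline, n_splits=5))
```

With this package, the callback would presumably create a new DecisionTreeClassifier, call add_features and one of the build methods, and return the result of predict(X_test), assuming the API behaves as documented above.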

2. Use the ID3 Library

from classic_ID3_decision_tree import DecisionTreeClassifier

id3 = DecisionTreeClassifier()
id3.add_features(dataset, 'party')
print(id3.features)

# Build the tree with one of the two criteria:
id3.information_gain(X_train, y_train)
# or: id3.gini_index(X_train, y_train)

y_pred = id3.predict(X_test)
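Once y_pred is available, it can be compared with y_test. The following is a small sketch of one way to do that for a binary outcome; `confusion_counts` and `accuracy` are illustrative helpers, not functions provided by this package, and the label lists are made up.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Fraction of predictions that match the true labels."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Made-up labels for illustration only (1 = democrat, 0 = republican).
y_test_demo = [1, 0, 1, 1, 0, 0]
y_pred_demo = [1, 0, 0, 1, 0, 1]
counts = confusion_counts(y_test_demo, y_pred_demo)
print(counts, accuracy(counts))
```

In the example above, the same functions could be called with y_test and y_pred in place of the demo lists.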

Footnotes

You can find the code on my GitHub.

Download files

Source Distribution: classic_ID3_DecisionTree-2.0.3.tar.gz (7.5 kB)

Built Distribution: classic_ID3_DecisionTree-2.0.3-py3-none-any.whl (8.0 kB, Python 3)
