Project description
ID3 Decision Tree Algorithm
ID3 is a machine learning decision tree classification algorithm. This package can build the model with either of two splitting criteria: Information Gain or Gini Index.
- Version 1.0.0 - Information Gain Only
- Version 2.0.0 - Gini Index added
- Version 2.0.1 - Documentation Sorted
- Version 2.0.2 - All Sorted
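For intuition, the two splitting criteria can be sketched in plain Python (a minimal illustration of the underlying formulas, not the package's internal implementation):

```python
import numpy as np

def entropy(y):
    """Shannon entropy of a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(y, splits):
    """Entropy of the parent minus the weighted entropy of the child splits."""
    n = len(y)
    weighted = sum(len(s) / n * entropy(s) for s in splits)
    return entropy(y) - weighted

def gini_index(y):
    """Gini impurity of a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

# Toy example: a perfectly informative binary split
y = np.array([0, 0, 1, 1])
print(information_gain(y, [y[:2], y[2:]]))  # 1.0
print(gini_index(y))                        # 0.5
```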
Installation
Install directly from PyPI
pip install classic-ID3-DecisionTree
Or clone the repository and install
python3 setup.py install
Parameters
* X_train
The Training Set array consisting of Features.
* y_train
The Training Set array consisting of Outcome.
* dataset
The Entire DataSet.
Attributes
* DecisionTreeClassifier()
Initialises an instance of the Decision Tree Classifier class.
* add_features(dataset, result_col_name)
Adds features to the model from the dataset; the model fetches the feature names from the dataset's columns. The second parameter is the name of the outcome column.
* information_gain(X_train, y_train)
Builds the decision tree using Information Gain.
* gini_index(X_train, y_train)
Builds the decision tree using Gini Index.
* predict(X_test)
Predicts the Test Set results.
Documentation
1. Install the package
pip install classic-ID3-DecisionTree
2. Import the library
from classic_ID3_decision_tree import DecisionTreeClassifier
3. Create an object for Decision Tree Classifier class
id3 = DecisionTreeClassifier()
4. Add Column Features to the model
id3.add_features(dataset, result_col_name)
5. Build the Decision Tree Model using Information Gain
id3.information_gain(X_train, y_train)
OR
5. Build the Decision Tree Model using Gini Index
id3.gini_index(X_train, y_train)
6. Predict the Test Set Results
y_pred = id3.predict(X_test)
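After step 6, the predictions can be checked against the true labels. A small sketch (with hypothetical stand-in arrays in place of the model's actual output):

```python
import numpy as np

# Hypothetical predicted vs. true labels, standing in for id3.predict(X_test) and y_test
y_test = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])

accuracy = np.mean(y_pred == y_test)  # fraction of correct predictions
print(accuracy)  # 0.8
```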
Example Code
0. Download the dataset
Download the house-votes-84 dataset from here
1. Import the dataset and Preprocess
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('house-votes-84.csv')
rawdataset = pd.read_csv('house-votes-84.csv')

party = {'republican': 0, 'democrat': 1}
vote = {'y': 1, 'n': 0, '?': 0}

for col in dataset.columns:
    if col != 'party':
        dataset[col] = dataset[col].map(vote)
dataset['party'] = dataset['party'].map(party)

X = dataset.iloc[:, 1:17].values
y = dataset.iloc[:, 0].values

from sklearn.model_selection import KFold
kf = KFold(n_splits=5)
for train_index, test_index in kf.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
```
2. Use the ID3 Library
```python
from classic_ID3_decision_tree import DecisionTreeClassifier

id3 = DecisionTreeClassifier()
id3.add_features(dataset, 'party')
print(id3.features)

id3.information_gain(X_train, y_train)
# OR
id3.gini_index(X_train, y_train)

y_pred = id3.predict(X_test)
```
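Note that the KFold loop in step 1 overwrites X_train and X_test on every iteration, so the calls above train and test on the last fold only. To evaluate across all folds, the fit-and-predict calls can be moved inside the loop. A self-contained sketch with synthetic stand-in data and a hypothetical majority-class predictor in place of the ID3 model:

```python
import numpy as np

# Synthetic stand-in data; in the example above X, y come from house-votes-84
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# 5-fold split of the row indices (equivalent to KFold(n_splits=5))
folds = np.array_split(np.arange(len(X)), 5)
scores = []
for test_index in folds:
    train_index = np.setdiff1d(np.arange(len(X)), test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # With the package installed this would be:
    #   id3.information_gain(X_train, y_train)
    #   y_pred = id3.predict(X_test)
    # A majority-class stand-in replaces the ID3 model here for illustration:
    y_pred = np.full(len(y_test), np.bincount(y_train).argmax())
    scores.append(np.mean(y_pred == y_test))

print(np.mean(scores))  # mean accuracy across the five folds
```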
Footnotes
You can find the code on my GitHub.
Connect with me on Social Media
Hashes for classic_ID3_DecisionTree-2.0.3.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 2e932b8281fe95727472c16f0c20d6d78ff3316932efdc3780646a0e9c7687f3
MD5 | e1c5294477f3121d24a9a09f0dedc959
BLAKE2b-256 | ba62f6f88898f31a3a6c31f37325c3a584f79372f25339bac30cb03ec678419e
Hashes for classic_ID3_DecisionTree-2.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | f03a8702d59c99d4a5b5fe27009e3b86f052ad285b0733961aaf163c4d714047
MD5 | dcf1fbd2ab0cda3d053f81f575459a7c
BLAKE2b-256 | f35db36dc83b9df4da3ef762f11b24408e02b0b504416243b1e1c6babf9e4867