Information gain utilities
info_gain
An implementation of the information gain algorithm. There is some debate about how the information gain metric should be defined: via the Kullback-Leibler divergence or via the mutual information. This implementation uses the information gain calculation defined below.
Information gain definitions
Information gain calculation
Definition from information gain calculation (retrieved 2018-07-13).
Let Attr be the set of all attributes and Ex the set of all training examples. value(x, a), with x in Ex, denotes the value of a specific example x for attribute a in Attr, and H specifies the entropy. The values(a) function denotes the set of all possible values of attribute a in Attr. The information gain for an attribute a in Attr is defined as follows:
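Assuming the standard entropy-based definition that matches the symbols introduced above (Ex, value(x, a), values(a) and H), the formula can be written as the entropy of all examples minus the weighted entropy of each subset that shares one attribute value:

```latex
\[
IG(Ex, a) = H(Ex) - \sum_{v \in values(a)}
  \frac{|\{x \in Ex \mid value(x, a) = v\}|}{|Ex|}
  \cdot H\bigl(\{x \in Ex \mid value(x, a) = v\}\bigr)
\]
```

Here H of a set of examples is taken to be the Shannon entropy of the class labels in that set.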
Intrinsic value calculation
Definition from information gain calculation (retrieved 2018-07-13).
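Using the same symbols as in the information gain definition (Ex, value(x, a) and values(a)), and assuming the usual base-2 logarithm, the intrinsic value of an attribute a in Attr can be written as the entropy of the split that a induces on Ex:

```latex
\[
IV(Ex, a) = - \sum_{v \in values(a)}
  \frac{|\{x \in Ex \mid value(x, a) = v\}|}{|Ex|}
  \cdot \log_2 \left(
    \frac{|\{x \in Ex \mid value(x, a) = v\}|}{|Ex|}
  \right)
\]
```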
Information gain ratio calculation
Definition from information gain calculation (retrieved 2018-07-13).
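With IG denoting the information gain and IV the intrinsic value of attribute a in Attr, the information gain ratio in its standard form is their quotient. It is undefined when IV is zero, which happens when a takes the same value for every example:

```latex
\[
IGR(Ex, a) = \frac{IG(Ex, a)}{IV(Ex, a)}
\]
```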
Installation
To install the package via pip use:
pip install info_gain
To clone the package from the git repository use:
git clone https://github.com/Thijsvanede/info_gain.git
Usage
Import the info_gain module with:
from info_gain import info_gain
The imported module supports three methods:
info_gain.info_gain(Ex, a) to compute the information gain.
info_gain.intrinsic_value(Ex, a) to compute the intrinsic value.
info_gain.info_gain_ratio(Ex, a) to compute the information gain ratio.
Example
from info_gain import info_gain
# Example: use colour to indicate whether something is a fruit or a vegetable
produce = ['apple', 'apple', 'apple', 'strawberry', 'eggplant']
fruit = [ True , True , True , True , False ]
colour = ['green', 'green', 'red' , 'red' , 'purple' ]
ig = info_gain.info_gain(fruit, colour)
iv = info_gain.intrinsic_value(fruit, colour)
igr = info_gain.info_gain_ratio(fruit, colour)
print(ig, iv, igr)
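To see what these three numbers mean, the example can be cross-checked by hand. The sketch below is not the package's implementation: it is a minimal pure-Python version of the standard definitions, and the function names ending in `_sketch` are hypothetical. It assumes base-2 logarithms, so if the package uses a different base its information gain and intrinsic value will differ by a constant factor, while the ratio stays the same.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2, an assumption) of a sequence of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain_sketch(ex, a):
    """Entropy of ex minus the weighted entropy of ex grouped by the values in a."""
    total = len(ex)
    # Group the labels in ex by the attribute value at the same position.
    groups = {}
    for label, value in zip(ex, a):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(ex) - remainder


def intrinsic_value_sketch(ex, a):
    """Entropy of the attribute values themselves; ex is kept only to mirror the signatures above."""
    return entropy(a)


def info_gain_ratio_sketch(ex, a):
    """Information gain divided by intrinsic value; raises ZeroDivisionError if a has a single value."""
    return info_gain_sketch(ex, a) / intrinsic_value_sketch(ex, a)


fruit = [True, True, True, True, False]
colour = ['green', 'green', 'red', 'red', 'purple']

ig = info_gain_sketch(fruit, colour)
iv = intrinsic_value_sketch(fruit, colour)
igr = info_gain_ratio_sketch(fruit, colour)
print(round(ig, 3), round(iv, 3), round(igr, 3))  # prints 0.722 1.522 0.474
```

Every colour group in this data contains only one class, so the weighted entropy term is zero and the information gain equals the full entropy of the fruit labels.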