Information gain utilities

Project description

info_gain

An implementation of the information gain algorithm. There is some debate about how the information gain metric should be defined: as the Kullback-Leibler divergence or as the mutual information. This implementation uses the information gain calculation defined below.

Information gain definitions

Information gain calculation

Definition from information gain calculation (retrieved 2018-07-13). Let Attr be the set of all attributes and Ex the set of all training examples. For x in Ex and a in Attr, value(x, a) is the value of example x for attribute a, and values(a) is the set of all possible values of attribute a. H denotes the entropy. The information gain for an attribute a in Attr is defined as follows:

Information gain formula
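
Written out with the symbols defined above, the standard definition can be reconstructed as follows. This is a sketch based on the surrounding text; Ex_{a=v} is shorthand introduced here for the subset of examples whose value for attribute a is v.

```latex
\[
IG(Ex, a) = H(Ex) - \sum_{v \in values(a)} \frac{|Ex_{a=v}|}{|Ex|} \, H(Ex_{a=v}),
\qquad
Ex_{a=v} = \{\, x \in Ex \mid value(x, a) = v \,\}
\]
```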

Intrinsic value calculation

Definition from information gain calculation (retrieved 2018-07-13).

Intrinsic value formula
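
In the same notation, the intrinsic value is commonly written as the entropy of the split induced by attribute a. The form below is a reconstruction, not copied from the package; the base-2 logarithm is an assumption, and Ex_{a=v} is shorthand introduced here.

```latex
\[
IV(Ex, a) = - \sum_{v \in values(a)} \frac{|Ex_{a=v}|}{|Ex|} \,
\log_2\!\left( \frac{|Ex_{a=v}|}{|Ex|} \right),
\qquad
Ex_{a=v} = \{\, x \in Ex \mid value(x, a) = v \,\}
\]
```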

Information gain ratio calculation

Definition from information gain calculation (retrieved 2018-07-13).

Information gain ratio formula
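
The information gain ratio is usually defined as the information gain normalised by the intrinsic value. A reconstruction in the notation of this section, defined only when the intrinsic value is non-zero:

```latex
\[
IGR(Ex, a) = \frac{IG(Ex, a)}{IV(Ex, a)}, \qquad IV(Ex, a) \neq 0
\]
```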

Installation

To install the package via pip use:

pip install info_gain

To clone the package from the git repository use:

git clone https://github.com/Thijsvanede/info_gain.git

Usage

Import the info_gain module with:

from info_gain import info_gain

The imported module supports three functions:

  • info_gain.info_gain(Ex, a) to compute the information gain.
  • info_gain.intrinsic_value(Ex, a) to compute the intrinsic value.
  • info_gain.info_gain_ratio(Ex, a) to compute the information gain ratio.
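
To make clear what these three quantities measure, here is a minimal pure-Python sketch of the definitions given earlier. It is not the package's source code: the function names ending in _sketch are hypothetical, and the base-2 logarithm is an assumption, so absolute values may differ from the package's output if it uses another base.

```python
from collections import Counter
from math import log2


def entropy_sketch(labels):
    """Shannon entropy (assumed base 2) of a sequence of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain_sketch(ex, a):
    """H(Ex) minus the weighted entropy of Ex within each value of a."""
    groups = {}
    for label, value in zip(ex, a):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(ex) * entropy_sketch(g) for g in groups.values())
    return entropy_sketch(ex) - remainder


def intrinsic_value_sketch(ex, a):
    """Entropy of the split itself, i.e. of the attribute values in a."""
    return entropy_sketch(a)


def info_gain_ratio_sketch(ex, a):
    """Information gain divided by intrinsic value."""
    iv = intrinsic_value_sketch(ex, a)
    if iv == 0:
        # A single-valued attribute splits nothing, so the ratio is undefined.
        raise ValueError("intrinsic value is zero; ratio is undefined")
    return info_gain_sketch(ex, a) / iv
```

In this sketch, as in the package's signatures, Ex is the list of class labels and a is the list of attribute values, aligned by position.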

Example

from info_gain import info_gain

# Example: using colour to indicate whether something is a fruit or a vegetable
produce = ['apple', 'apple', 'apple', 'strawberry', 'eggplant']
fruit   = [ True  ,  True  ,  True  ,  True       ,  False    ]
colour  = ['green', 'green', 'red'  , 'red'       , 'purple'  ]

ig  = info_gain.info_gain(fruit, colour)
iv  = info_gain.intrinsic_value(fruit, colour)
igr = info_gain.info_gain_ratio(fruit, colour)

print(ig, iv, igr)
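
The example can be checked by hand. Every colour group here contains only fruit or only non-fruit, so the entropy remaining after the split is zero and the information gain equals the entropy of fruit. The sketch below verifies this without the package; entropy_bits is a hypothetical helper and the base-2 logarithm is an assumption. Information gain and intrinsic value scale with the logarithm base, while their ratio does not.

```python
from collections import Counter
from math import log2

fruit  = [True, True, True, True, False]
colour = ['green', 'green', 'red', 'red', 'purple']


def entropy_bits(labels):
    """Shannon entropy in bits (hypothetical helper, base 2 assumed)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


# Group the fruit labels by colour.
groups = {}
for is_fruit, c in zip(fruit, colour):
    groups.setdefault(c, []).append(is_fruit)

h_fruit = entropy_bits(fruit)
group_entropies = {c: entropy_bits(g) for c, g in groups.items()}

print(round(h_fruit, 4))                              # 0.7219
print(all(h == 0 for h in group_entropies.values()))  # True: every group is pure
```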
