
Information gain utilities


info_gain

An implementation of the information gain calculation. There seems to be some debate about how the information gain metric is defined: as the Kullback-Leibler divergence or as the mutual information. This implementation uses the definitions given below.

Information gain definitions

Information gain calculation

Definition from information gain calculation (retrieved 2018-07-13). Let Attr be the set of all attributes and Ex the set of all training examples. value(x, a), with x in Ex, denotes the value of a specific example x for attribute a in Attr, and H denotes the entropy. values(a) denotes the set of all possible values of attribute a in Attr. The information gain for an attribute a in Attr is defined as follows:

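$$
\mathit{IG}(\mathit{Ex}, a) = H(\mathit{Ex}) - \sum_{v \in \mathit{values}(a)} \frac{|\{x \in \mathit{Ex} \mid \mathit{value}(x, a) = v\}|}{|\mathit{Ex}|} \cdot H(\{x \in \mathit{Ex} \mid \mathit{value}(x, a) = v\})
$$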
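The definition above can be sketched in a few lines of Python. This is a minimal illustration and not the package's source: the names `entropy` and `information_gain` are hypothetical, Ex and a are assumed to be parallel lists, and the base-2 logarithm is an assumption.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits, an assumption) of the empirical distribution."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(ex, a):
    """H(Ex) minus the size-weighted entropy of each subset of Ex sharing a value of a."""
    if len(ex) != len(a):
        raise ValueError("ex and a must have the same length")
    remainder = 0.0
    for v in set(a):
        # Examples whose attribute value equals v
        subset = [x for x, t in zip(ex, a) if t == v]
        remainder += len(subset) / len(ex) * entropy(subset)
    return entropy(ex) - remainder
```

An attribute that splits the examples into pure groups drives the weighted sum to zero, so its information gain equals H(Ex).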

Intrinsic value calculation

Definition from information gain calculation (retrieved 2018-07-13).

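$$
\mathit{IV}(\mathit{Ex}, a) = -\sum_{v \in \mathit{values}(a)} \frac{|\{x \in \mathit{Ex} \mid \mathit{value}(x, a) = v\}|}{|\mathit{Ex}|} \cdot \log_2\left(\frac{|\{x \in \mathit{Ex} \mid \mathit{value}(x, a) = v\}|}{|\mathit{Ex}|}\right)
$$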
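The intrinsic value is the entropy of the attribute's own value distribution, so it depends only on how a partitions the examples. A minimal sketch, assuming parallel lists and base-2 logarithms; the function below is a hypothetical stand-in that mirrors the (Ex, a) argument order, not the package's code:

```python
from collections import Counter
from math import log2


def intrinsic_value(ex, a):
    """Entropy (in bits, an assumption) of the attribute values in a.

    ex is only used to check that both lists describe the same examples.
    """
    if len(ex) != len(a):
        raise ValueError("ex and a must have the same length")
    total = len(a)
    return -sum((n / total) * log2(n / total) for n in Counter(a).values())
```

An attribute with a single value has an intrinsic value of zero, so a gain ratio built on it is undefined for such attributes and needs separate handling.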

Information gain ratio calculation

Definition from information gain calculation (retrieved 2018-07-13).

$$
\mathit{IGR}(\mathit{Ex}, a) = \frac{\mathit{IG}(\mathit{Ex}, a)}{\mathit{IV}(\mathit{Ex}, a)}
$$

Installation

To install the package via pip use:

pip install info_gain

To clone the package from the git repository use:

git clone https://github.com/Thijsvanede/info_gain.git

Usage

Import the info_gain module with:

from info_gain import info_gain

The imported module supports three functions:

  • info_gain.info_gain(Ex, a) to compute the information gain.
  • info_gain.intrinsic_value(Ex, a) to compute the intrinsic value.
  • info_gain.info_gain_ratio(Ex, a) to compute the information gain ratio.

Example

from info_gain import info_gain

# Example: colour as an indicator of whether something is a fruit or a vegetable
produce = ['apple', 'apple', 'apple', 'strawberry', 'eggplant']
fruit   = [ True  ,  True  ,  True  ,  True       ,  False    ]
colour  = ['green', 'green', 'red'  , 'red'       , 'purple'  ]

ig  = info_gain.info_gain(fruit, colour)
iv  = info_gain.intrinsic_value(fruit, colour)
igr = info_gain.info_gain_ratio(fruit, colour)

print(ig, iv, igr)
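As a hand check of this example, the three quantities can be worked out from the definitions. The base-2 logarithm is an assumption here, since the text does not state which base the package uses. Every colour group (green, red, purple) contains only one class, so each conditional entropy is zero:

```latex
\begin{aligned}
H(\mathit{Ex}) &= -\tfrac{4}{5}\log_2\tfrac{4}{5} - \tfrac{1}{5}\log_2\tfrac{1}{5} \approx 0.722 \\
\mathit{IG}(\mathit{Ex}, a) &= H(\mathit{Ex}) - \left(\tfrac{2}{5}\cdot 0 + \tfrac{2}{5}\cdot 0 + \tfrac{1}{5}\cdot 0\right) \approx 0.722 \\
\mathit{IV}(\mathit{Ex}, a) &= -2\cdot\tfrac{2}{5}\log_2\tfrac{2}{5} - \tfrac{1}{5}\log_2\tfrac{1}{5} \approx 1.522 \\
\mathit{IGR}(\mathit{Ex}, a) &= \frac{\mathit{IG}(\mathit{Ex}, a)}{\mathit{IV}(\mathit{Ex}, a)} \approx 0.474
\end{aligned}
```

If a different logarithm base is used, ig and iv change by the same constant factor, while igr stays the same as long as both are computed with one base.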
