Non-parametric entropy estimation toolbox plus additional features

npeet_plus

A fork of NPEET (Non-Parametric Entropy Estimation Toolbox) with additional features and improvements for mutual information and entropy estimation. The package provides methods for estimating entropy, mutual information, conditional mutual information, KL divergence, and related quantities for both continuous and discrete data.

Features

This package builds on the original NPEET and adds:

- p-value computation for mutual information using permutation testing (mi_pvalue).
- Confidence interval estimation for mutual information using bootstrapping (mi_confidence_interval).
- Conditional mutual information support in the midd (discrete mutual information) and micd (mixed continuous-discrete mutual information) functions.

Installation

Install npeet_plus from PyPI with pip:

pip install npeet_plus

Usage

Importing the package

import numpy as np
from npeet_plus import mi, mi_pvalue, mi_confidence_interval

Mutual Information Example

Compute the mutual information between two variables x and y:

x = np.random.randn(1000, 1)
y = x + 0.5 * np.random.randn(1000, 1)

mi_value = mi(x, y)
print(f"Mutual Information: {mi_value}")
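
As a sanity check, this example has a known answer. Because x and y are jointly Gaussian, their mutual information is -0.5 * log2(1 - rho^2), where rho is the correlation between them. The short calculation below works this out for the noise scale 0.5 used above. The variable names are illustrative and are not part of npeet_plus.

```python
import math

noise_std = 0.5                      # scale of the noise added to x in the example
var_x = 1.0                          # x is standard normal
var_y = var_x + noise_std ** 2       # Var(y) = Var(x) + Var(noise) = 1.25
rho_squared = var_x / var_y          # Cov(x, y)^2 / (Var(x) * Var(y)) = 1 / 1.25 = 0.8

# Mutual information of a bivariate Gaussian, in bits
true_mi_bits = -0.5 * math.log2(1.0 - rho_squared)
print(f"Closed-form MI: {true_mi_bits:.3f} bits")   # Closed-form MI: 1.161 bits
```

With the default base=2, the value printed by mi should land near this figure, up to sampling error.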

Compute p-value for Mutual Information

The function mi_pvalue computes the observed mutual information and estimates the p-value under the null hypothesis of independence using permutation testing:

mi_observed, p_value = mi_pvalue(x, y, k=3, n_permutations=1000)
print(f"Observed MI: {mi_observed}, P-value: {p_value}")
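
To make the idea of permutation testing concrete, here is a minimal sketch that does not use npeet_plus at all. It shuffles y to break any dependence on x, recomputes the statistic each time, and reports the fraction of shuffled values at least as large as the observed one. The helpers plugin_mi_bits and permutation_pvalue are hypothetical names, the statistic is a simple plug-in estimate for discrete labels rather than the k-nearest-neighbor estimator, and the (1 + count) / (1 + n) convention is an assumption; the package may count differently.

```python
import numpy as np


def plugin_mi_bits(x, y):
    """Plug-in mutual information (bits) between two 1-D arrays of discrete labels."""
    _, xi = np.unique(np.ravel(x), return_inverse=True)
    _, yi = np.unique(np.ravel(y), return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)           # contingency table of counts
    joint /= joint.sum()                      # joint probabilities
    px = joint.sum(axis=1, keepdims=True)     # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)     # marginal of y, shape (1, ny)
    nz = joint > 0                            # empty cells contribute nothing
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))


def permutation_pvalue(x, y, statistic, n_permutations=1000, seed=None):
    """Return (observed statistic, permutation p-value) under independence of x and y."""
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)
    # Shuffling y keeps both marginals intact but destroys the pairing with x.
    null = np.array([statistic(x, rng.permutation(y)) for _ in range(n_permutations)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return observed, float(p_value)


rng = np.random.default_rng(0)
x_labels = rng.integers(0, 3, size=500)
y_dependent = (x_labels + rng.integers(0, 2, size=500)) % 3   # depends on x_labels
y_independent = rng.integers(0, 3, size=500)                  # unrelated to x_labels

mi_dep, p_dep = permutation_pvalue(x_labels, y_dependent, plugin_mi_bits, 200, seed=1)
mi_ind, p_ind = permutation_pvalue(x_labels, y_independent, plugin_mi_bits, 200, seed=1)
print(f"dependent:   MI={mi_dep:.3f} bits, p={p_dep:.4f}")
print(f"independent: MI={mi_ind:.3f} bits, p={p_ind:.4f}")
```

The dependent pair should give a p-value at the floor of 1 / (n_permutations + 1), while the independent pair should give an MI close to zero.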

Compute Confidence Interval for Mutual Information

The function mi_confidence_interval computes the observed mutual information and estimates a confidence interval by bootstrapping. It also returns the individual bootstrap estimates (mi_bootstrap):

mi_observed, ci_lower, ci_upper, mi_bootstrap = mi_confidence_interval(
    x, y, n_bootstraps=1000, confidence_level=0.95
)
print(f"Observed MI: {mi_observed}")
print(f"95% Confidence Interval: [{ci_lower}, {ci_upper}]")
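
The bootstrap idea can be sketched without the package. The code below resamples (x, y) pairs with replacement, recomputes a statistic on each resample, and takes percentiles of the results as the interval. It is an illustration only: bootstrap_ci and gaussian_mi_bits are hypothetical names, the percentile method is an assumption about how the interval is formed, and the statistic is a parametric stand-in (MI implied by the Pearson correlation) rather than the k-nearest-neighbor estimator.

```python
import numpy as np


def gaussian_mi_bits(x, y):
    """Stand-in statistic: MI (bits) implied by the Pearson correlation of Gaussian data."""
    r = np.corrcoef(np.ravel(x), np.ravel(y))[0, 1]
    return float(-0.5 * np.log2(1.0 - r ** 2))


def bootstrap_ci(x, y, statistic, n_bootstraps=1000, confidence_level=0.95, seed=None):
    """Return (observed, lower, upper, bootstrap estimates) using the percentile method."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    observed = statistic(x, y)
    estimates = np.empty(n_bootstraps)
    for b in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)          # resample pairs with replacement
        estimates[b] = statistic(x[idx], y[idx])
    tail = 100.0 * (1.0 - confidence_level) / 2.0  # 2.5 for a 95% interval
    lower, upper = np.percentile(estimates, [tail, 100.0 - tail])
    return observed, float(lower), float(upper), estimates


rng = np.random.default_rng(0)
x_demo = rng.standard_normal((1000, 1))
y_demo = x_demo + 0.5 * rng.standard_normal((1000, 1))

observed, lower, upper, estimates = bootstrap_ci(
    x_demo, y_demo, gaussian_mi_bits, n_bootstraps=500, seed=1
)
print(f"Observed: {observed:.3f} bits, 95% CI: [{lower:.3f}, {upper:.3f}]")
```

Resampling with replacement produces repeated points, which can matter for k-nearest-neighbor estimators; this README does not say how mi_confidence_interval handles them.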

Functions

`entropy(x, k=3, base=2)`

Estimate the (differential) entropy of a continuous variable x using k-nearest neighbors. The result is expressed in the units set by base (bits for the default base of 2).

Parameters:
- x: array-like, the variable of interest.
- k: number of nearest neighbors.
- base: logarithm base (default is 2).
Returns: entropy estimate.
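
For readers who want to see what a k-nearest-neighbor entropy estimate involves, here is a brute-force sketch of a Kozachenko-Leonenko-style estimator with max-norm distances. It is not taken from the package source, and whether entropy uses exactly this form is an assumption. The names knn_entropy and digamma_int are hypothetical. The digamma function is only needed at integers here, so it is computed from harmonic sums with numpy alone.

```python
import numpy as np


def digamma_int(m):
    """Digamma at a positive integer: psi(m) = -gamma + sum_{i=1}^{m-1} 1/i."""
    return -np.euler_gamma + np.sum(1.0 / np.arange(1, m))


def knn_entropy(x, k=3, base=2.0):
    """Brute-force k-NN entropy estimate; O(n^2) memory, for illustration only."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape
    # Pairwise max-norm (Chebyshev) distances between all samples
    dist = np.abs(x[:, None, :] - x[None, :, :]).max(axis=2)
    np.fill_diagonal(dist, np.inf)                      # a point is not its own neighbor
    r_k = np.partition(dist, k - 1, axis=1)[:, k - 1]   # distance to the k-th neighbor
    # A max-norm ball of radius r has volume (2r)^d, hence the d * log(2) term
    const = digamma_int(n) - digamma_int(k) + d * np.log(2.0)
    return float((const + d * np.mean(np.log(r_k))) / np.log(base))


rng = np.random.default_rng(0)
sample = rng.standard_normal((1000, 1))
estimate = knn_entropy(sample, k=3)
exact = 0.5 * np.log2(2.0 * np.pi * np.e)   # entropy of a standard normal, in bits
print(f"k-NN estimate: {estimate:.3f} bits, exact: {exact:.3f} bits")
```

The estimate should fall close to the exact value of about 2.047 bits for a standard normal sample of this size.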

`mi(x, y, z=None, k=3, base=2, alpha=0)`

Estimate the mutual information between x and y using k-nearest neighbors. Optionally, you can condition on z.

Parameters:
- x, y: array-like, variables for mutual information computation.
- z: optional, array-like, conditional variable for conditional MI.
- k: number of nearest neighbors.
- base: logarithm base (default is 2).
- alpha: regularization parameter for the local non-uniformity correction, LNC (default is 0).
Returns: mutual information estimate.
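
A common k-nearest-neighbor mutual information estimator is the Kraskov-Stoegbauer-Grassberger (KSG) estimator. The brute-force sketch below shows its first variant for the unconditional case. It is an illustration under assumptions, not the package's code: ksg_mi is a hypothetical name, whether mi uses exactly this variant is not stated in this README, and the sketch covers neither conditioning on z nor the alpha correction.

```python
import numpy as np


def _chebyshev(a):
    """Pairwise max-norm distances between the rows of a 2-D array."""
    return np.abs(a[:, None, :] - a[None, :, :]).max(axis=2)


def ksg_mi(x, y, k=3, base=2.0):
    """Brute-force KSG (variant 1) estimate of I(X; Y); O(n^2) memory."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    dx, dy = _chebyshev(x), _chebyshev(y)
    np.fill_diagonal(dx, np.inf)                      # exclude each point from its own counts
    np.fill_diagonal(dy, np.inf)
    dz = np.maximum(dx, dy)                           # max-norm distance in the joint space
    eps = np.partition(dz, k - 1, axis=1)[:, k - 1]   # distance to the k-th joint neighbor
    nx = np.sum(dx < eps[:, None], axis=1)            # neighbors strictly inside eps in x
    ny = np.sum(dy < eps[:, None], axis=1)            # neighbors strictly inside eps in y

    # Digamma at integers via harmonic numbers: psi(m) = -gamma + H_{m-1}
    harmonic = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, n + 1))))

    def psi(m):
        return -np.euler_gamma + harmonic[np.asarray(m) - 1]

    mi_nats = psi(k) + psi(n) - np.mean(psi(nx + 1) + psi(ny + 1))
    return float(mi_nats / np.log(base))


rng = np.random.default_rng(0)
x_demo = rng.standard_normal((1000, 1))
y_dep = x_demo + 0.5 * rng.standard_normal((1000, 1))   # closed-form MI is about 1.161 bits
y_ind = rng.standard_normal((1000, 1))                  # independent of x_demo

mi_dep = ksg_mi(x_demo, y_dep)
mi_ind = ksg_mi(x_demo, y_ind)
print(f"dependent: {mi_dep:.3f} bits, independent: {mi_ind:.3f} bits")
```

The estimator can return small negative values for independent data; that is sampling noise around zero rather than a bug.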

`mi_pvalue(x, y, z=None, mi_type="mi", k=3, base=2, n_permutations=1000, random_state=None, warning=True)`

Estimate the p-value of mutual information between x and y under the null hypothesis of independence using permutation testing.

Parameters:
- x, y: array-like, variables for mutual information computation.
- z: optional, array-like, conditional variable for conditional MI.
- mi_type: type of mutual information to use ("mi" for continuous, "midd" for discrete, "micd" for mixed).
- k: number of nearest neighbors.
- base: logarithm base (default is 2).
- n_permutations: number of permutations to estimate p-value (default is 1000).
- random_state: seed for random number generator.
- warning: whether to show warnings for insufficient data (default is True).
Returns: observed mutual information, p-value.

`mi_confidence_interval(x, y, z=None, mi_type="mi", k=3, base=2, n_bootstraps=1000, confidence_level=0.95, random_state=None, warning=True)`

Estimate the confidence interval for mutual information between x and y using bootstrapping.

Parameters:
- x, y: array-like, variables for mutual information computation.
- z: optional, array-like, conditional variable for conditional MI.
- mi_type: type of mutual information to use ("mi" for continuous, "midd" for discrete, "micd" for mixed).
- k: number of nearest neighbors.
- base: logarithm base (default is 2).
- n_bootstraps: number of bootstraps to estimate the confidence interval (default is 1000).
- confidence_level: confidence level for the interval (default is 0.95).
- random_state: seed for random number generator.
- warning: whether to show warnings for insufficient data (default is True).
Returns: observed mutual information, lower bound of CI, upper bound of CI, bootstrap MI values.

Modifications from the Original NPEET

New functions added:
- mi_pvalue: Compute the p-value of mutual information using permutation testing.
- mi_confidence_interval: Compute the confidence interval of mutual information using bootstrapping.
Enhancements:
- Added conditional mutual information computation to both the midd and micd functions, so that conditional MI can also be estimated for discrete and mixed data.
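
To illustrate what conditioning changes for discrete data, the sketch below computes plug-in estimates from the identities I(X;Y) = H(X) + H(Y) - H(X,Y) and I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z). It uses only the standard library and is not the package's midd implementation; entropy_bits, discrete_mi, and discrete_cmi are hypothetical names.

```python
from collections import Counter
from math import log2


def entropy_bits(*columns):
    """Plug-in joint entropy (bits) of one or more equal-length discrete sequences."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def discrete_mi(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy_bits(x) + entropy_bits(y) - entropy_bits(x, y)


def discrete_cmi(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy_bits(x, z) + entropy_bits(y, z) - entropy_bits(x, y, z) - entropy_bits(z)


# Common cause: x and y are both copies of z, so z explains all of their dependence.
z = [0, 1, 0, 1]
x, y = z, z
mi_cc, cmi_cc = discrete_mi(x, y), discrete_cmi(x, y, z)
print(mi_cc, cmi_cc)      # 1.0 0.0

# XOR: a and b are independent bits, but become dependent once c = a XOR b is known.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
c = [0, 1, 1, 0]
mi_xor, cmi_xor = discrete_mi(a, b), discrete_cmi(a, b, c)
print(mi_xor, cmi_xor)    # 0.0 1.0
```

The two cases show that conditioning can either remove or create dependence, which is why a conditional estimate cannot be read off from the unconditional one.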

Dependencies

numpy>=1.18.0
scipy>=1.4.0
scikit-learn>=0.22.0

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Author of the original NPEET toolbox and codebase: Greg Ver Steeg
Author of modifications: Albert Buchard

Release history

- 0.2.0: Oct 14, 2024
- 0.1.0: Oct 14, 2024
- 0.0.1: Oct 14, 2024 (this version)

Download files

- Source distribution: npeet_plus-0.0.1.tar.gz (12.4 kB), uploaded Oct 14, 2024
- Built distribution: npeet_plus-0.0.1-py3-none-any.whl (14.5 kB), uploaded Oct 14, 2024, Python 3

Hashes for npeet_plus-0.0.1.tar.gz

Algorithm	Hash digest
SHA256	`f0f3642c6bf63d0bdce6289b4c9fa3e28120e95756b1868af3d9e1a1fd52c60f`
MD5	`d6970c40c487000f241f34b603744572`
BLAKE2b-256	`af80c28e544b711812be9b5516a6b62d5d1c4b894868afa03b1750f38e2bf552`

Hashes for npeet_plus-0.0.1-py3-none-any.whl

Algorithm	Hash digest
SHA256	`c3d6833a1d6a069a12305aa9cbe2361ac50698ed95a655d7a5c8b4a7906d3a18`
MD5	`e06d0fa47620b2c9acc47a9ea9a64d61`
BLAKE2b-256	`61d4b915738430d353ec20aa40f13c6a101599bb941bc7c26b3df2dfb5f8c44a`