
Non-parametric entropy estimation toolbox plus additional features

npeet_plus

License: MIT

A fork of NPEET (Non-Parametric Entropy Estimation Toolbox) with additional features and improvements for mutual information and entropy estimation. The package provides methods for estimating mutual information, conditional mutual information, KL divergence, and related quantities for both continuous and discrete data.

Features

This package builds on the original NPEET and adds:

  • p-value computation for mutual information using permutation testing (mi_pvalue).
  • Confidence interval estimation for mutual information using bootstrapping (mi_confidence_interval).
  • Extended functionality for conditional mutual information in the midd (discrete mutual information) and micd (mixed mutual information) functions.

Installation

Install npeet_plus from PyPI:

pip install npeet_plus

Usage

Importing the package

import numpy as np
from npeet_plus import mi, mi_pvalue, mi_confidence_interval

Mutual Information Example

Compute the mutual information between two variables x and y:

x = np.random.randn(1000, 1)
y = x + 0.5 * np.random.randn(1000, 1)

mi_value = mi(x, y)
print(f"Mutual Information: {mi_value}")
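
For this particular example the true value is known in closed form, which gives a useful sanity check on the estimate. Assuming x and the added noise are independent Gaussians as above, the pair (x, y) is bivariate Gaussian with squared correlation rho^2 = 1 / (1 + 0.5^2) = 0.8, and its mutual information is -0.5 * log2(1 - rho^2) bits. The variable names below are illustrative only and are not part of npeet_plus:

```python
import math

# Variance of x is 1 and variance of the noise term is 0.5**2,
# so corr(x, y)**2 = 1 / (1 + 0.5**2).
rho_sq = 1.0 / (1.0 + 0.5**2)

# Mutual information of a bivariate Gaussian, in bits.
true_mi_bits = -0.5 * math.log2(1.0 - rho_sq)
print(round(true_mi_bits, 3))  # 1.161
```

A k-nearest-neighbor estimate from 1000 samples should land near this value, though not exactly on it.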

Compute p-value for Mutual Information

The function mi_pvalue computes the observed mutual information and estimates the p-value under the null hypothesis of independence using permutation testing:

mi_observed, p_value = mi_pvalue(x, y, k=3, n_permutations=1000)
print(f"Observed MI: {mi_observed}, P-value: {p_value}")
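
The permutation procedure itself can be sketched independently of the package. What follows is a minimal illustration, not the npeet_plus implementation: `plugin_mi` and `permutation_pvalue` are hypothetical helpers, the estimator is a plug-in estimate on discrete data rather than a k-nearest-neighbor estimate, and the add-one correction on the p-value is an assumption (the package may use a different convention).

```python
import numpy as np

def plugin_mi(x, y, base=2):
    """Plug-in mutual information of two discrete 1-D arrays (hypothetical helper)."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    # Map each variable to integer codes and build the joint count table.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0  # skip empty cells, where p * log(p) is taken as 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])) / np.log(base))

def permutation_pvalue(x, y, n_permutations=1000, seed=None):
    """Return (observed MI, p-value) under the null hypothesis of independence."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y).ravel()
    observed = plugin_mi(x, y)
    # Shuffling y breaks any dependence on x while keeping both marginals.
    null = np.array([plugin_mi(x, rng.permutation(y)) for _ in range(n_permutations)])
    # Add-one correction keeps the p-value strictly positive (an assumption here).
    p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return observed, float(p_value)

# Usage: y copies x 80% of the time, so the p-value should be small.
rng = np.random.default_rng(0)
x_d = rng.integers(0, 3, size=300)
y_d = np.where(rng.random(300) < 0.8, x_d, rng.integers(0, 3, size=300))
print(permutation_pvalue(x_d, y_d, n_permutations=200, seed=1))
```

With this convention the smallest attainable p-value is 1 / (n_permutations + 1), so the number of permutations bounds the resolution of the test.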

Compute Confidence Interval for Mutual Information

The function mi_confidence_interval computes the observed mutual information and estimates the confidence interval using bootstrapping:

mi_observed, ci_lower, ci_upper, mi_bootstrap = mi_confidence_interval(
    x, y, n_bootstraps=1000, confidence_level=0.95
)
print(f"Observed MI: {mi_observed}")
print(f"95% Confidence Interval: [{ci_lower}, {ci_upper}]")
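
As a rough picture of what a bootstrap interval involves, here is a minimal percentile-bootstrap sketch. It is not the npeet_plus implementation: `gaussian_mi` and `bootstrap_ci` are hypothetical helpers, and a closed-form Gaussian estimator stands in for the k-nearest-neighbor estimator so that the example needs only numpy. Whether the package uses the percentile method is also an assumption.

```python
import numpy as np

def gaussian_mi(x, y, base=2):
    """MI of two 1-D samples under a bivariate Gaussian assumption (stand-in estimator)."""
    r = np.corrcoef(np.ravel(x), np.ravel(y))[0, 1]
    return float(-0.5 * np.log(1.0 - r**2) / np.log(base))

def bootstrap_ci(x, y, statistic, n_bootstraps=1000, confidence_level=0.95, seed=None):
    """Percentile bootstrap: resample (x, y) pairs with replacement and recompute."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    observed = statistic(x, y)
    boot = np.empty(n_bootstraps)
    for b in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)  # same indices for x and y keep pairs intact
        boot[b] = statistic(x[idx], y[idx])
    # Cut equal tails off the bootstrap distribution.
    tail = 100.0 * (1.0 - confidence_level) / 2.0
    lower, upper = np.percentile(boot, [tail, 100.0 - tail])
    return observed, float(lower), float(upper), boot

# Usage on data shaped like the example above.
rng = np.random.default_rng(0)
xs = rng.standard_normal(1000)
ys = xs + 0.5 * rng.standard_normal(1000)
obs, lo, hi, _ = bootstrap_ci(xs, ys, gaussian_mi, n_bootstraps=200, seed=1)
print(f"{obs:.3f} [{lo:.3f}, {hi:.3f}]")
```

One general caveat: resampling with replacement duplicates points, which can distort nearest-neighbor distances, so intervals built around a k-nearest-neighbor estimator deserve some care in interpretation.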

Functions

entropy(x, k=3, base=2)

Estimate the entropy of a continuous variable x using k-nearest neighbors.

  • Parameters:

    • x: array-like, the variable of interest.
    • k: number of nearest neighbors.
    • base: logarithm base (default is 2).
  • Returns: entropy estimate.
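
To make the role of k and base concrete, here is a brute-force sketch of a Kozachenko-Leonenko style k-nearest-neighbor entropy estimator using max-norm distances. It is an illustration under stated assumptions rather than the package's code: `knn_entropy` is a hypothetical name, the O(n^2) distance matrix replaces any tree-based neighbor search, and the input is assumed to contain no duplicated points.

```python
import numpy as np

def knn_entropy(x, k=3, base=2):
    """k-NN entropy estimate for samples in the rows of x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape
    # Pairwise max-norm (Chebyshev) distances, brute force: O(n^2) memory.
    dist = np.abs(x[:, None, :] - x[None, :, :]).max(axis=2)
    # Column 0 of each sorted row is the point itself (distance 0), so column k
    # is the distance to the k-th nearest neighbor. Assumes no duplicates.
    eps = np.sort(dist, axis=1)[:, k]
    # digamma(n) - digamma(k) for integers equals sum_{j=k}^{n-1} 1/j.
    psi_diff = np.sum(1.0 / np.arange(k, n))
    # d * log(2) is the log-volume of the unit max-norm ball of radius 1.
    h_nats = psi_diff + d * np.log(2.0) + d * np.mean(np.log(eps))
    # Changing the base only rescales the result.
    return float(h_nats / np.log(base))

# Usage: a standard normal has entropy 0.5 * log2(2 * pi * e) bits (about 2.05).
rng = np.random.default_rng(0)
print(knn_entropy(rng.standard_normal(1000)))
```

Passing base=np.e returns the same estimate in nats, which equals the base-2 value multiplied by ln 2.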

mi(x, y, z=None, k=3, base=2, alpha=0)

Estimate the mutual information between x and y using k-nearest neighbors. Optionally, you can condition on z.

  • Parameters:

    • x, y: array-like, variables for mutual information computation.
    • z: optional, array-like, conditional variable for conditional MI.
    • k: number of nearest neighbors.
    • base: logarithm base (default is 2).
    • alpha: regularization parameter for the LNC (local non-uniformity correction) term (default is 0).
  • Returns: mutual information estimate.
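
One common k-nearest-neighbor estimator of mutual information is the Kraskov-Stögbauer-Grassberger (KSG) estimator. The brute-force sketch below shows the idea for the unconditional case only; it is not claimed to match npeet_plus line for line, `ksg_mi` is a hypothetical name, and it omits the z and alpha options entirely.

```python
import numpy as np

def ksg_mi(x, y, k=3, base=2):
    """KSG (algorithm 1) mutual information estimate (hypothetical helper)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    # Max-norm distances within each marginal space and in the joint space.
    dx = np.abs(x[:, None, :] - x[None, :, :]).max(axis=2)
    dy = np.abs(y[:, None, :] - y[None, :, :]).max(axis=2)
    dz = np.maximum(dx, dy)
    # Distance from each point to its k-th nearest neighbor in the joint space
    # (column 0 of the sorted row is the point itself).
    eps = np.sort(dz, axis=1)[:, k]
    # Marginal neighbors strictly closer than eps, excluding the point itself.
    nx = np.sum(dx < eps[:, None], axis=1) - 1
    ny = np.sum(dy < eps[:, None], axis=1) - 1
    # digamma at positive integers: psi(m) = -gamma + sum_{j=1}^{m-1} 1/j,
    # stored so that psi[m - 1] holds psi(m).
    psi = -np.euler_gamma + np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, n))))
    # I = psi(k) + psi(n) - mean(psi(nx + 1) + psi(ny + 1)), in nats.
    mi_nats = psi[k - 1] + psi[n - 1] - np.mean(psi[nx] + psi[ny])
    return float(mi_nats / np.log(base))

# Usage on data shaped like the earlier example.
rng = np.random.default_rng(0)
xs = rng.standard_normal((500, 1))
ys = xs + 0.5 * rng.standard_normal((500, 1))
print(ksg_mi(xs, ys))
```

The estimate can come out slightly negative for independent variables, since the estimator is not constrained to be non-negative.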

mi_pvalue(x, y, z=None, mi_type="mi", k=3, base=2, n_permutations=1000, random_state=None, warning=True)

Estimate the p-value of mutual information between x and y under the null hypothesis of independence using permutation testing.

  • Parameters:

    • x, y: array-like, variables for mutual information computation.
    • z: optional, array-like, conditional variable for conditional MI.
    • mi_type: type of mutual information to use ("mi" for continuous, "midd" for discrete, "micd" for mixed).
    • k: number of nearest neighbors.
    • base: logarithm base (default is 2).
    • n_permutations: number of permutations to estimate p-value (default is 1000).
    • random_state: seed for random number generator.
    • warning: whether to show warnings for insufficient data (default is True).
  • Returns: observed mutual information, p-value.

mi_confidence_interval(x, y, z=None, mi_type="mi", k=3, base=2, n_bootstraps=1000, confidence_level=0.95, random_state=None, warning=True)

Estimate the confidence interval for mutual information between x and y using bootstrapping.

  • Parameters:

    • x, y: array-like, variables for mutual information computation.
    • z: optional, array-like, conditional variable for conditional MI.
    • mi_type: type of mutual information to use ("mi" for continuous, "midd" for discrete, "micd" for mixed).
    • k: number of nearest neighbors.
    • base: logarithm base (default is 2).
    • n_bootstraps: number of bootstraps to estimate the confidence interval (default is 1000).
    • confidence_level: confidence level for the interval (default is 0.95).
    • random_state: seed for random number generator.
    • warning: whether to show warnings for insufficient data (default is True).
  • Returns: observed mutual information, lower bound of CI, upper bound of CI, bootstrap MI values.

Modifications from the Original NPEET

  • New functions added:

    • mi_pvalue: Compute the p-value of mutual information using permutation testing.
    • mi_confidence_interval: Compute the confidence interval of mutual information using bootstrapping.
  • Enhancements:

    • Added conditional mutual information computation to both the midd and micd functions, so that mutual information can be estimated conditionally on a third variable for discrete and mixed data.
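
For discrete data, conditional mutual information can be written in terms of joint entropies: I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z). The sketch below illustrates that identity with plug-in entropies. The helper names are hypothetical and this is not the midd implementation; it only shows what conditioning on z means.

```python
from collections import Counter
from math import log

def discrete_entropy(*columns, base=2):
    """Plug-in joint entropy of one or more equally long discrete sequences."""
    rows = list(zip(*columns))
    n = len(rows)
    counts = Counter(rows)
    return -sum(c / n * log(c / n, base) for c in counts.values())

def conditional_mi_discrete(x, y, z, base=2):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z), using plug-in entropies."""
    return (discrete_entropy(x, z, base=base) + discrete_entropy(y, z, base=base)
            - discrete_entropy(x, y, z, base=base) - discrete_entropy(z, base=base))

# XOR example: x and y are independent, but once z = x XOR y is known,
# each of them determines the other.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
z = [a ^ b for a, b in zip(x, y)]

# Unconditional MI is 0 bits, conditional MI is 1 bit (up to rounding).
print(discrete_entropy(x) + discrete_entropy(y) - discrete_entropy(x, y))
print(conditional_mi_discrete(x, y, z))
```

The XOR case shows why the conditional variant matters: the unconditional estimate alone would report no dependence between x and y.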

Dependencies

  • numpy>=1.18.0
  • scipy>=1.4.0
  • scikit-learn>=0.22.0

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

  • Author of the original NPEET toolbox and codebase: Greg Ver Steeg
  • Author of modifications: Albert Buchard
