
Project description

NAM_Entropy

Reproducible research code for computing and analyzing entropy-based metrics and related experiments. Built with Python 3.12, Poetry, and Jupyter Notebooks for a clean, portable workflow.

Python 3.12 · Poetry · Tests · License: MIT

Table of Contents

  • Overview
  • Installation
  • Project Structure
  • Soft Entropy Estimates
  • Metrics
  • References

Overview

NAM_Entropy provides utilities and experiments for entropy-related analysis. Typical use cases include:

  • Computing Shannon entropy, cross-entropy, KL-divergence, and related information-theoretic measures.
  • Understanding the efficacy of sentence-level and token-level embeddings in a model by computing their information, variation, regularity, and disentanglement.

Our target audience is:

  • Cognitive scientists who want an easy way to compute information-theoretic metrics for the information that a neural network model carries about tokens and sentences via their embeddings.
  • Machine learning researchers who are interested in using parts of our efficient implementation in their workflows to understand the entropy and mutual information in their models' latent-space representations.
  • Other researchers interested in applying entropy and other information-theoretic measures in their work.

Installation

To install the package dependencies, you can use either Poetry or Conda after changing to the package directory (the one containing pyproject.toml and environment.yml).

Install with Poetry

poetry install

(This creates a poetry virtual environment: nam-entropy-<hash>.)

Install with Conda

conda env create -f environment.yml

(This creates the conda virtual environment: nam-entropy-conda.)
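Once installed, you can work inside either environment. A minimal sketch of launching the notebooks (this assumes Jupyter is available in the environment, consistent with the notebook-based workflow above):

# With Conda
conda activate nam-entropy-conda
jupyter notebook notebooks/

# With Poetry (commands run inside the managed virtual environment)
poetry run jupyter notebook notebooks/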

Project Structure

nam_entropy/
├─ pyproject.toml
├─ poetry.lock
├─ README.md
├─ LICENSE
├─ .gitignore
├─ Makefile                           # optional convenience commands
├─ .env.example                       # example environment variables
├─ notebooks/
│  ├─ A. Automatic entropy and information calculations.ipynb
│  ├─ B. Interactive entropy and information calculations.ipynb
│  ├─ C. Programmatic entropy and information calculations.ipynb
│  ├─ D. Programmatic Spherical Entropy and Information Calculations.ipynb
│  └─ E. Online Spherical Entropy Calculations.ipynb
├─ src/
│  └─ nam_entropy/
│     ├─ __init__.py
│     ├─ bin_distribution_plots.py
│     ├─ data_prep.py
│     ├─ h.py
│     ├─ integrated_distribution_2d_sampler.py
│     ├─ make_data.py
│     └─ utils.py
└─ tests/

Soft Entropy Estimates

This package uses the soft entropy estimation methodology in Conklin (2025), Section 5.4 to provide a discrete approximation to differentiable entropy computation. This approach enables gradient-based optimization while maintaining the interpretability of traditional entropy measures. The key insight is mapping continuous neural representations to probability distributions over learned bin centers, preserving both representational nuances and computational tractability.
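As a rough illustration of this mapping, here is a minimal, self-contained sketch (hypothetical helper names, not the package's API): each continuous vector is softly assigned to a set of bin centers via a softmax over negative squared distances, the per-sample assignments are averaged into a discrete distribution over the bins, and the Shannon entropy of that distribution is returned.

import numpy as np

def soft_bin_entropy(z, centers, temperature=1.0):
    # z: (n, d) continuous representations; centers: (k, d) bin centers.
    # Squared distance from every sample to every bin center, shape (n, k).
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    # Soft assignment of each sample to the bins: softmax over negative distances.
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    # Average the per-sample assignments into one distribution SoftBin(Z) over bins.
    q = p.mean(axis=0)
    # Shannon entropy of the soft-binned distribution (in nats).
    return float(-(q * np.log(q + 1e-12)).sum())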

Metrics

1. Entropy

We compute the entropy $H(Z)$ of our given vector-valued distribution $Z$ by creating an associated soft-binned probability distribution $\mathrm{SoftBin}(Z)$ on a finite set of points randomly associated with the given dataset, and then computing the Shannon entropy $H(\mathrm{SoftBin}(Z))$ of this finite distribution.

For more details on Shannon entropy, see Wikipedia: Entropy or Science Direct: Shannon Entropy.

2. Conditional Entropy

We can also compute the conditional entropy of the given vector-valued distribution $Z$ with respect to a given discrete labelling of $Z$. We think of this as a (categorical) label random variable $L$, valued in the finite set of labels, that shares a common probability space with $Z$. Then we compute the conditional entropy $H(Z|L)$ of $Z$ conditioned on (knowing) $L$ by the usual formula

$$H(Z|L) = \sum_{l \in L} p(l) H(Z|L=l)$$

where the entropy conditioned on the particular label value $L = l$ is given by

$$H(Z|L=l) = -\sum_{z \in Z} p(z|l) \log p(z|l),$$

and where

  • $Z$ := Population data distribution (continuous vector-valued RV),
  • $L$ := Sub-population label (categorical RV).

For more details on conditional entropy, see Wikipedia: Conditional Entropy or Science Direct: Conditional Entropy.
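A corresponding sketch of $H(Z|L)$, reusing the hypothetical soft_bin_entropy helper from the Soft Entropy Estimates section above (again, an illustration rather than the package's API): the soft-binned entropy is computed separately on each labelled sub-population and the results are averaged with weights $p(l)$.

import numpy as np

def soft_bin_conditional_entropy(z, labels, centers, temperature=1.0):
    # labels: length-n array of categorical labels aligned with the rows of z.
    labels = np.asarray(labels)
    h = 0.0
    for l in np.unique(labels):
        mask = labels == l
        p_l = mask.mean()                                      # p(l)
        h_l = soft_bin_entropy(z[mask], centers, temperature)  # H(Z | L = l)
        h += p_l * h_l
    return h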

3. Mutual Information

The mutual information describes the amount of information that two random variables give about each other. By definition, the mutual information $I(Z, L)$ of the two random variables $Z$ and $L$ is given by

$$I(Z, L) := H(Z) - H(Z|L),$$

where

  • $Z$ := Population data distribution (continuous vector-valued RV),
  • $L$ := Sub-population label (categorical RV).

For more details on mutual information, see Wikipedia: Mutual Information or Science Direct: Mutual Information.
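Continuing the hypothetical sketches above, the mutual information estimate is just the difference of the two quantities, with a toy usage example; since the toy labels are independent of the embeddings, the estimate should be near zero up to finite-sample noise.

import numpy as np

def soft_bin_mutual_information(z, labels, centers, temperature=1.0):
    # I(Z, L) = H(Z) - H(Z | L), both estimated with the same soft binning.
    return (soft_bin_entropy(z, centers, temperature)
            - soft_bin_conditional_entropy(z, labels, centers, temperature))

# Toy usage: random embeddings, random labels, bin centers sampled from the data.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 8))
labels = rng.integers(0, 3, size=500)
centers = z[rng.choice(len(z), size=32, replace=False)]
print(soft_bin_mutual_information(z, labels, centers))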

References

Primary Theoretical Framework

Conklin, H. (2025). Information Structure in Mappings: An Approach to Learning, Representation, and Generalisation. PhD Thesis, University of Edinburgh. arXiv:2505.23960. https://arxiv.org/abs/2505.23960

For the mathematical foundations of soft entropy estimation used in this project, see Section 5.4: "Soft Entropy Estimation" (pp. 94-99), which provides the theoretical basis for the continuous-to-discrete entropy approximation methods implemented in our codebase.

Related Publications

Conklin, H., & Smith, K. (2024). Representations as Language: An Information-Theoretic Framework for Interpretability. International Meeting of the Cognitive Science Society. arXiv:2406.02449. https://arxiv.org/abs/2406.02449

Conklin, H., & Smith, K. (2024). Compositionality With Variation Reliably Emerges in Neural Networks. International Conference on Learning Representations (ICLR). https://openreview.net/forum?id=-Yzz6vlX7V-



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nam_entropy-0.1.1.tar.gz (47.5 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nam_entropy-0.1.1-py3-none-any.whl (48.0 kB)

Uploaded Python 3

File details

Details for the file nam_entropy-0.1.1.tar.gz.

File metadata

  • Download URL: nam_entropy-0.1.1.tar.gz
  • Upload date:
  • Size: 47.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.0 CPython/3.12.11 Linux/6.11.0-1018-azure

File hashes

Hashes for nam_entropy-0.1.1.tar.gz
  • SHA256: 38a26037e48b22f3499d3c0bfacc46f318189e1a2bf74ec092c5d6256221ac41
  • MD5: ea8407d107b5b0a698cd642c8a99b131
  • BLAKE2b-256: 252dd254c079284982756d31592cdf4219236366bbefdd298ba57e44c6939543

See more details on using hashes here.

File details

Details for the file nam_entropy-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: nam_entropy-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 48.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.0 CPython/3.12.11 Linux/6.11.0-1018-azure

File hashes

Hashes for nam_entropy-0.1.1-py3-none-any.whl
  • SHA256: 798a41c6d9cfe93a2bf5769e529e89e1f0b51bdda6d2a443115e3ae8dda6fe6d
  • MD5: d4a0873c7930a6be838f22a684a695da
  • BLAKE2b-256: 63916d7c78d39c00137b85c6cdee37051d4b6e3511c17701ff22788dd76572ac

See more details on using hashes here.
