Bias-corrected participation ratio estimator for measuring dimensionality of neural representations

Sample-size invariant measure of dimensionality

Bias-corrected participation ratio (PR) estimators for measuring the dimensionality of neural representation manifolds, as introduced in:

Estimating Dimensionality of Neural Representations from Finite Samples. Chanwoo Chun*, Abdulkadir Canatar*, SueYeon Chung, Daniel Lee. ICLR 2026.


📐 Background

Given a neural activation matrix $\Phi \in \mathbb{R}^{P\times Q}$ (P stimuli × Q neurons), the participation ratio

$$ \gamma =\frac{\left(\sum_i \lambda_i \right)^2}{\sum_i \lambda_i^2} $$

is a soft count of the number of nonzero eigenvalues $\lambda_i$ of the stimulus covariance $K=\frac{1}{Q}\Phi\Phi^\top$. The naive estimator is severely biased downward when P or Q is small: its reciprocal is approximately the sum of the reciprocals of P, Q, and the true $\gamma$, so the estimate cannot exceed the smallest of the three:

$$ \mathbb{E}\left[ \frac{1}{\gamma_{\text{naive}}}\right] \approx \frac{1}{P}+ \frac{1}{Q} +\frac{1}{\gamma} $$
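As a rough worked example of the two formulas above, the following sketch computes the participation ratio from a list of eigenvalues and evaluates the approximate bias relation for one choice of P, Q, and $\gamma$. The function names are illustrative and not part of the package, and treating the expectation of $1/\gamma_{\text{naive}}$ as a point value is a simplification.

```python
def participation_ratio_from_eigs(eigs):
    """PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    total = sum(eigs)
    total_sq = sum(lam * lam for lam in eigs)
    return total * total / total_sq


def predicted_naive_pr(P, Q, gamma):
    """Value implied by 1/gamma_naive ~ 1/P + 1/Q + 1/gamma (approximation only)."""
    return 1.0 / (1.0 / P + 1.0 / Q + 1.0 / gamma)


# Ten equal eigenvalues: every direction contributes equally, so PR = 10.
flat = participation_ratio_from_eigs([1.0] * 10)

# One dominant eigenvalue plus nine small ones: PR falls toward 1.
peaked = participation_ratio_from_eigs([100.0] + [1.0] * 9)

# True dimensionality 50 observed with 200 stimuli and 100 neurons.
approx = predicted_naive_pr(P=200, Q=100, gamma=50)

print(flat, peaked, approx)
```

With these numbers the reciprocals sum to 0.005 + 0.01 + 0.02 = 0.035, so the naive estimate would sit near 28.6 even though the true value is 50.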

This package provides bias-corrected estimators that correct for finite P and/or Q by averaging over disjoint index sets.
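Why disjoint indices matter can be seen in a much simpler setting than the package's own formulas. The sketch below is an analogy, not the package's estimator: it estimates the square of a mean. Squaring the sample mean includes the terms $x_i x_i$, which inflate the result by the variance divided by the sample size in expectation. Averaging $x_i x_j$ only over pairs with $i \neq j$ avoids that. All function names are hypothetical.

```python
from itertools import permutations


def naive_square_of_mean(xs):
    """Square of the sample mean; includes the i == j terms."""
    m = sum(xs) / len(xs)
    return m * m


def distinct_pair_estimate(xs):
    """Average of x_i * x_j over ordered pairs with i != j (closed form)."""
    n = len(xs)
    s = sum(xs)
    s2 = sum(x * x for x in xs)
    return (s * s - s2) / (n * (n - 1))


def distinct_pair_bruteforce(xs):
    """Same quantity by explicit enumeration, as a check on the closed form."""
    pairs = list(permutations(range(len(xs)), 2))
    return sum(xs[i] * xs[j] for i, j in pairs) / len(pairs)


xs = [1.0, 2.0, 3.0]
naive = naive_square_of_mean(xs)        # 4.0
corrected = distinct_pair_estimate(xs)  # 22 / 6
check = distinct_pair_bruteforce(xs)    # matches the closed form
print(naive, corrected, check)
```

For independent samples the distinct-pair average is an unbiased estimate of the squared mean, whereas the naive square is not.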

| Estimator | Corrects | Use when |
| --- | --- | --- |
| γ_naive | nothing | baseline reference |
| γ_row | row (stimulus) sampling bias | full neuron access, subsampled stimuli |
| γ_col | column (neuron) sampling bias | full stimulus access, subsampled neurons |
| γ_both | both row and column bias | subsampled neurons, subsampled stimuli |

An additional participation_ratio_finite estimator handles the case where Φ is a submatrix sampled without replacement from a large-but-finite R×C matrix (Appendix A.6 of the paper).


📦 Installation

pip install dimensionality

Or, for development:

git clone https://github.com/badooki/dimensionality.git
cd dimensionality
pip install -e ".[dev]"

Requires Python ≥ 3.9. Dependencies: numpy ≥ 1.24, opt_einsum ≥ 3.3.


🚀 Quick start

import numpy as np
from dimensionality import participation_ratio

# Phi: P stimuli × Q neurons  (do NOT pre-center)
Phi = np.random.randn(200, 100)

# Default: γ_both (bias-corrected for both row and column subsampling)
gamma = participation_ratio(Phi)
print(gamma)

All four estimators

result = participation_ratio(Phi, return_all=True)
# result['naive'], result['row'], result['col'], result['both']

Two-trial noise correction

When two independent repeat trials are available for the same stimuli and neurons, the cross-trial construction removes additive and multiplicative noise bias:

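# Phi1, Phi2: repeat-trial activation matrices for the same stimuli and neurons, each of shape (P, Q)
gamma = participation_ratio(Phi1, Phi2)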
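The intuition behind using two trials can be illustrated with plain second moments. This is a minimal sketch under the assumption of additive, independent, zero-mean noise. It does not reproduce the package's estimator and does not cover the multiplicative case. Products taken within one trial pick up the noise variance, while products across trials do not, because the two noise draws are independent.

```python
import numpy as np

rng = np.random.default_rng(0)
P, Q = 200, 500

signal = rng.standard_normal((P, Q))          # shared across trials, variance 1
Phi1 = signal + rng.standard_normal((P, Q))   # trial 1: signal + independent noise
Phi2 = signal + rng.standard_normal((P, Q))   # trial 2: same signal, fresh noise

same_trial = np.mean(Phi1 * Phi1)    # close to signal variance + noise variance
cross_trial = np.mean(Phi1 * Phi2)   # close to signal variance alone
signal_power = np.mean(signal * signal)

print(same_trial, cross_trial, signal_power)
```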

Return numerator and denominator separately

result = participation_ratio(Phi, return_parts=True)
# result['both'], result['A'], result['B']

# Combined with return_all:
result = participation_ratio(Phi, return_all=True, return_parts=True)
# result['naive'], result['A_naive'], result['B_naive'], ...

Neuron dimensionality

To estimate dimensionality along the neuron axis (centering across stimuli), transpose the matrix:

gamma_neuron = participation_ratio(Phi.T)

🔢 Finite underlying matrix

When Φ is a P×Q submatrix sampled without replacement from a finite R×C population matrix, use participation_ratio_finite:

from dimensionality import participation_ratio_finite

gamma = participation_ratio_finite(Phi, R=5000, C=2000)

# With noise correction:
gamma = participation_ratio_finite(Phi1, R=5000, C=2000, Phi2=Phi2)

# Also return the naive estimate:
result = participation_ratio_finite(Phi, R=5000, C=2000, return_naive=True)
# result['gamma'], result['naive']
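To experiment with this setting on synthetic data, one needs a P×Q submatrix drawn without replacement from a known R×C matrix. The helper below is a hypothetical sketch using only NumPy (`sample_submatrix` is not part of the package). The resulting `Phi` could then be passed to `participation_ratio_finite` with the matching `R` and `C`.

```python
import numpy as np


def sample_submatrix(full, P, Q, rng):
    """Draw P distinct rows and Q distinct columns from a finite R x C matrix."""
    R, C = full.shape
    if P > R or Q > C:
        raise ValueError("need P <= R and Q <= C to sample without replacement")
    rows = rng.choice(R, size=P, replace=False)
    cols = rng.choice(C, size=Q, replace=False)
    # np.ix_ builds the open mesh that selects the rows x cols block.
    return full[np.ix_(rows, cols)], rows, cols


rng = np.random.default_rng(0)
full = rng.standard_normal((500, 200))   # stand-in for the R x C population matrix
Phi, rows, cols = sample_submatrix(full, P=50, Q=20, rng=rng)
print(Phi.shape)
```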

📊 Subsampling sweep

To assess how the estimate converges with sample size, sweep over P or Q:

from dimensionality import sweep_dimensionality, plot_sweep

# Sweep over number of stimuli; keep all neurons
result = sweep_dimensionality(Phi, axis='P', n_trials=20)

# result['values']  — array of P values used
# result['naive'], result['row'], result['col'], result['both']  — mean estimates
# result['both_sem']  — standard error of the mean

fig, ax = plot_sweep(result, true_d=50)

To sweep over number of neurons instead:

result = sweep_dimensionality(Phi, axis='Q')

For the finite estimator:

result = sweep_dimensionality(Phi, axis='P', estimator='finite', R=5000, C=2000)
# result['naive'], result['gamma']
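The idea of a sweep over P can be sketched without the package, using only an uncentered naive participation ratio. This is an illustrative stand-in, not the implementation of `sweep_dimensionality`. The names `naive_pr` and `naive_sweep` are hypothetical, and the naive estimate here skips the centering the package applies.

```python
import numpy as np


def naive_pr(Phi):
    """Uncentered naive PR of K = Phi Phi^T / Q, computed from traces."""
    K = Phi @ Phi.T / Phi.shape[1]
    # K is symmetric, so sum(K * K) = trace(K @ K) = sum of squared eigenvalues.
    return np.trace(K) ** 2 / np.sum(K * K)


def naive_sweep(Phi, values, n_trials=20, seed=0):
    """Mean and SEM of the naive PR over random row subsets of each size."""
    rng = np.random.default_rng(seed)
    means, sems = [], []
    for p in values:
        est = np.array([
            naive_pr(Phi[rng.choice(Phi.shape[0], size=p, replace=False)])
            for _ in range(n_trials)
        ])
        means.append(est.mean())
        sems.append(est.std(ddof=1) / np.sqrt(n_trials))
    return np.array(means), np.array(sems)


rng = np.random.default_rng(1)
Phi = rng.standard_normal((200, 100))
values = [10, 50, 200]
means, sems = naive_sweep(Phi, values)
print(means, sems)
```

On i.i.d. Gaussian data like this, the naive estimate keeps climbing as P grows, which is the sample-size dependence the corrected estimators are meant to remove.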

⚠️ Important: do not pre-center

The bias corrections rely on an algebraic three-term centering structure built into the estimator formulas. Subtracting column means from Φ before passing it to the estimator introduces statistical dependencies between rows that break the bias correction. Pass the raw activation matrix directly.
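The dependency that pre-centering introduces is easy to check numerically. The sketch below uses synthetic data and NumPy only. After column means are subtracted from a matrix with independent rows, every column sums to zero, so each row is determined by the others, and distinct rows become negatively correlated (by about $-1/P$ for unit-variance entries).

```python
import numpy as np

rng = np.random.default_rng(0)
P, Q = 5, 20000
Phi = rng.standard_normal((P, Q))   # rows are independent by construction

# Pre-centering: subtract each column's mean over the P rows.
centered = Phi - Phi.mean(axis=0, keepdims=True)

# Every column now sums to zero, so the rows are linearly dependent.
col_sums = centered.sum(axis=0)

raw_cross = np.mean(Phi[0] * Phi[1])                  # near 0
centered_cross = np.mean(centered[0] * centered[1])   # near -1/P = -0.2

print(np.abs(col_sums).max(), raw_cross, centered_cross)
```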


🔧 API reference

participation_ratio(Phi, Phi2=None, *, return_all=False, return_parts=False)

Estimate the task dimensionality (PR of the centered covariance) of Φ.

  • Phi — raw activation matrix, shape (P, Q); P ≥ 4, Q ≥ 2.
  • Phi2 — optional second trial for noise correction.
  • return_all — if True, return dict with all four estimator variants.
  • return_parts — if True, include numerator A and denominator B.

Returns a scalar (γ_both) by default, or a dict when either flag is set.


participation_ratio_finite(Phi, R, C, Phi2=None, *, return_naive=False, return_parts=False)

Estimate the PR of the full R×C matrix from the observed P×Q submatrix.

  • R, C — number of rows/columns in the full underlying matrix; R ≥ P, C ≥ Q.
  • return_naive — if True, also return the (uncorrected) naive estimate.
  • return_parts — if True, include numerator A and denominator B.

Returns a scalar by default, or a dict when either flag is set.
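The shape constraints documented for the two estimators (P ≥ 4 and Q ≥ 2 for `participation_ratio`; R ≥ P and C ≥ Q for `participation_ratio_finite`) can be checked before calling them. The helper below is a hypothetical pre-flight check, not a function the package provides. Applying the P and Q bounds to the finite case as well is an assumption.

```python
import numpy as np


def check_shapes(Phi, R=None, C=None):
    """Validate Phi (and optionally R, C) against the documented constraints."""
    Phi = np.asarray(Phi)
    if Phi.ndim != 2:
        raise ValueError("Phi must be 2-D with shape (P, Q)")
    P, Q = Phi.shape
    if P < 4 or Q < 2:
        raise ValueError(f"need P >= 4 and Q >= 2, got P={P}, Q={Q}")
    if R is not None and R < P:
        raise ValueError(f"need R >= P, got R={R}, P={P}")
    if C is not None and C < Q:
        raise ValueError(f"need C >= Q, got C={C}, Q={Q}")
    return P, Q


print(check_shapes(np.zeros((200, 100)), R=5000, C=2000))
```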


sweep_dimensionality(Phi, axis='P', values=None, n_trials=20, Phi2=None, estimator='infinite', R=None, C=None, ...)

Run a subsampling sweep. Returns a dict with mean estimates and SEMs at each value. See docstring for full parameter list.


plot_sweep(result, ax=None, true_d=None, title=None, figsize=(5, 4))

Plot the output of sweep_dimensionality. Returns (fig, ax).


🗂️ Repository structure

src/
  dimensionality/
    __init__.py        # public API
    _core.py           # quartic einsum helper
    estimators.py      # participation_ratio
    finite.py          # participation_ratio_finite
    sweep.py           # sweep_dimensionality
    plot.py            # plot_sweep
tests/
  test_estimators.py
examples/
  demo.ipynb           # interactive walkthrough on synthetic data
  synthetic.py         # Figure 1 reproduction

📄 Citation

If you use this package, please cite:

@inproceedings{chun2026estimating,
  title     = {Estimating Dimensionality of Neural Representations from Finite Samples},
  author    = {Chun, Chanwoo and Canatar, Abdulkadir and Chung, SueYeon and Lee, Daniel},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
}

License

MIT
