Bias-corrected participation ratio estimator for measuring dimensionality of neural representations
Sample-size invariant measure of dimensionality
Bias-corrected participation ratio (PR) estimators for measuring the dimensionality of neural representation manifolds, as introduced in:
Estimating Dimensionality of Neural Representations from Finite Samples. Chanwoo Chun\*, Abdulkadir Canatar\*, SueYeon Chung, Daniel Lee. ICLR 2026.
📐 Background
Given a neural activation matrix $\Phi \in \mathbb{R}^{P\times Q}$ (P stimuli × Q neurons), the participation ratio
$$ \gamma =\frac{\left(\sum_i \lambda_i \right)^2}{\sum_i \lambda_i^2} $$
is a soft count of the number of nonzero eigenvalues $\lambda_i$ of the stimulus covariance $K=\frac{1}{Q}\Phi\Phi^\top$. The naive estimator is severely biased downward when P or Q is small. It behaves approximately as a harmonic mean of P, Q, and the true $\gamma$, so it tends to fall below the smallest of the three:
$$ \mathbb{E}\left[ \frac{1}{\gamma_{\text{naive}}}\right] \approx \frac{1}{P}+ \frac{1}{Q} +\frac{1}{\gamma} $$
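To make these formulas concrete, here is a minimal NumPy sketch that evaluates $\gamma$ directly from the uncentered definition of $K$ given above and inverts the harmonic approximation. The helper names `naive_pr` and `predicted_naive_pr` are illustrative and are not part of the package, and inverting an expectation of $1/\gamma_{\text{naive}}$ is itself only a rough guide.

```python
import numpy as np

def naive_pr(Phi):
    """Plug-in participation ratio of K = Phi @ Phi.T / Q (no bias correction)."""
    Phi = np.asarray(Phi, dtype=float)
    Q = Phi.shape[1]
    K = Phi @ Phi.T / Q
    # For symmetric K: sum_i lambda_i = trace(K), sum_i lambda_i^2 = ||K||_F^2.
    return float(np.trace(K) ** 2 / np.sum(K ** 2))

def predicted_naive_pr(P, Q, gamma):
    """Value implied by 1/gamma_naive ~ 1/P + 1/Q + 1/gamma (approximation)."""
    return 1.0 / (1.0 / P + 1.0 / Q + 1.0 / gamma)

print(naive_pr(np.eye(4)))                # 4.0 (four equal eigenvalues)
print(naive_pr(np.ones((4, 3))))          # 1.0 (rank-one matrix)
print(predicted_naive_pr(200, 100, 50))   # about 28.57
```

With P = 200, Q = 100 and a true $\gamma$ of 50, the approximation suggests a naive estimate near 28.6, well below 50.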
This package provides unbiased estimators that correct for finite P and/or Q by averaging over disjoint index sets.
| Estimator | Corrects | Use when |
|---|---|---|
| γ_naive | nothing | baseline reference |
| γ_row | row (stimulus) sampling bias | full neuron access, subsampled stimuli |
| γ_col | column (neuron) sampling bias | full stimulus access, subsampled neurons |
| γ_both | both row and column bias | subsampled neurons, subsampled stimuli |
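The idea of averaging over disjoint index sets can be illustrated on a much simpler toy problem than the PR itself: estimating $(\mathbb{E}[x])^2$ from i.i.d. samples. This is an analogy only and is not the package's estimator; the function names are hypothetical. Squaring the sample mean includes the $i=j$ terms $x_i^2$, which inflate the estimate, while restricting the average to $i \neq j$ removes that bias.

```python
import numpy as np

def squared_mean_plugin(x):
    """Biased: (mean x)^2 averages x_i * x_j over all pairs, including i == j."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x) ** 2)

def squared_mean_disjoint(x):
    """Unbiased for (E[x])^2: averages x_i * x_j over pairs with i != j only."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Sum over all pairs minus the diagonal (i == j) terms.
    off_diagonal_sum = np.sum(x) ** 2 - np.sum(x ** 2)
    return float(off_diagonal_sum / (n * (n - 1)))

x = np.array([1.0, 2.0, 3.0])
print(squared_mean_plugin(x))     # 4.0
print(squared_mean_disjoint(x))   # 22 / 6, about 3.67
```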
An additional participation_ratio_finite estimator handles the case where Φ is a submatrix sampled without replacement from a large-but-finite R×C matrix (Appendix A.6 of the paper).
📦 Installation
```bash
pip install dimensionality
```
Or, for development:
```bash
git clone https://github.com/badooki/dimensionality.git
cd dimensionality
pip install -e ".[dev]"
```
Dependencies: numpy >= 1.24, opt_einsum >= 3.3. Python ≥ 3.9.
🚀 Quick start
```python
import numpy as np
from dimensionality import participation_ratio

# Phi: P stimuli × Q neurons (do NOT pre-center)
Phi = np.random.randn(200, 100)

# Default: γ_both (bias-corrected for both row and column subsampling)
gamma = participation_ratio(Phi)
print(gamma)
```
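For experimenting with the quick start, it can help to have data whose dimensionality is known by construction. The sketch below builds a P×Q matrix from d Gaussian latent variables. `make_low_rank_activations` is a hypothetical helper, not part of the package, and the expectation that the PR approaches d as P and Q grow is specific to this Gaussian construction.

```python
import numpy as np

def make_low_rank_activations(P, Q, d, seed=0):
    """Return a (P, Q) matrix with rank at most d (hypothetical test data)."""
    rng = np.random.default_rng(seed)
    latents = rng.standard_normal((P, d))   # stimulus-by-latent coordinates
    readout = rng.standard_normal((d, Q))   # latent-to-neuron weights
    # Scale by sqrt(d) so entries have roughly unit variance.
    return latents @ readout / np.sqrt(d)

Phi = make_low_rank_activations(P=200, Q=100, d=10)
print(Phi.shape)                     # (200, 100)
print(np.linalg.matrix_rank(Phi))    # at most d = 10
```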
All four estimators
```python
result = participation_ratio(Phi, return_all=True)
# result['naive'], result['row'], result['col'], result['both']
```
Two-trial noise correction
When two independent repeat trials are available for the same stimuli and neurons, the cross-trial construction removes additive and multiplicative noise bias:
```python
gamma = participation_ratio(Phi1, Phi2)
```
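As an illustration of what `Phi1` and `Phi2` might look like, the sketch below simulates two repeats that share the same underlying signal but carry independent additive noise. The helper `simulate_two_trials` and its Gaussian noise model are assumptions made for this example only.

```python
import numpy as np

def simulate_two_trials(signal, noise_std=0.5, seed=0):
    """Return two noisy repeats of the same (P, Q) signal matrix."""
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    # Independent noise per trial; rows and columns stay aligned across trials.
    Phi1 = signal + noise_std * rng.standard_normal(signal.shape)
    Phi2 = signal + noise_std * rng.standard_normal(signal.shape)
    return Phi1, Phi2

signal = np.random.default_rng(1).standard_normal((200, 100))
Phi1, Phi2 = simulate_two_trials(signal)
print(Phi1.shape == Phi2.shape)   # True
```

The resulting pair can then be passed to `participation_ratio(Phi1, Phi2)`.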
Return numerator and denominator separately
```python
result = participation_ratio(Phi, return_parts=True)
# result['both'], result['A'], result['B']

# Combined with return_all:
result = participation_ratio(Phi, return_all=True, return_parts=True)
# result['naive'], result['A_naive'], result['B_naive'], ...
```
Neuron dimensionality
To estimate dimensionality along the neuron axis (centering across stimuli), transpose the matrix:
```python
gamma_neuron = participation_ratio(Phi.T)
```
🔢 Finite underlying matrix
When Φ is a P×Q submatrix sampled without replacement from a finite R×C population matrix, use participation_ratio_finite:
```python
from dimensionality import participation_ratio_finite

gamma = participation_ratio_finite(Phi, R=5000, C=2000)

# With noise correction:
gamma = participation_ratio_finite(Phi1, R=5000, C=2000, Phi2=Phi2)

# Also return the naive estimate:
result = participation_ratio_finite(Phi, R=5000, C=2000, return_naive=True)
# result['gamma'], result['naive']
```
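The sampling model assumed here, a P×Q submatrix drawn without replacement from an R×C matrix, can be sketched in NumPy as follows. `sample_submatrix` is a hypothetical helper for building test inputs and is not part of the package.

```python
import numpy as np

def sample_submatrix(full, P, Q, seed=0):
    """Draw P distinct rows and Q distinct columns from an (R, C) matrix."""
    rng = np.random.default_rng(seed)
    R, C = full.shape
    rows = rng.choice(R, size=P, replace=False)
    cols = rng.choice(C, size=Q, replace=False)
    # np.ix_ builds the open mesh that selects the row/column cross product.
    return full[np.ix_(rows, cols)], rows, cols

full = np.arange(50 * 20, dtype=float).reshape(50, 20)   # R=50, C=20
Phi, rows, cols = sample_submatrix(full, P=10, Q=5)
print(Phi.shape)   # (10, 5)
```

In this toy setup the matching call would use `R=50, C=20`, the dimensions of the full matrix rather than of the sample.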
📊 Subsampling sweep
To assess how the estimate converges with sample size, sweep over P or Q:
```python
from dimensionality import sweep_dimensionality, plot_sweep

# Sweep over number of stimuli; keep all neurons
result = sweep_dimensionality(Phi, axis='P', n_trials=20)
# result['values']: array of P values used
# result['naive'], result['row'], result['col'], result['both']: mean estimates
# result['both_sem']: standard error of the mean

fig, ax = plot_sweep(result, true_d=50)
```
To sweep over number of neurons instead:
```python
result = sweep_dimensionality(Phi, axis='Q')
```
For the finite estimator:
```python
result = sweep_dimensionality(Phi, axis='P', estimator='finite', R=5000, C=2000)
# result['naive'], result['gamma']
```
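The sweep logic can be pictured as repeated random subsampling followed by averaging. The sketch below is a simplified stand-in, not the package's implementation: `manual_sweep` and `plug_in_pr` are hypothetical names, and it only sweeps over rows with an uncorrected plug-in estimator.

```python
import numpy as np

def plug_in_pr(Phi):
    """Uncorrected PR of K = Phi @ Phi.T / Q, used here as a stand-in estimator."""
    K = Phi @ Phi.T / Phi.shape[1]
    return float(np.trace(K) ** 2 / np.sum(K ** 2))

def manual_sweep(Phi, values, n_trials=20, estimator=plug_in_pr, seed=0):
    """Mean and SEM of `estimator` over random row subsets of each size."""
    rng = np.random.default_rng(seed)
    means, sems = [], []
    for p in values:
        estimates = np.array([
            estimator(Phi[rng.choice(Phi.shape[0], size=p, replace=False)])
            for _ in range(n_trials)
        ])
        means.append(estimates.mean())
        # Standard error of the mean across the n_trials subsamples.
        sems.append(estimates.std(ddof=1) / np.sqrt(n_trials))
    return {"values": np.asarray(values),
            "mean": np.array(means),
            "sem": np.array(sems)}

Phi = np.random.default_rng(0).standard_normal((200, 100))
result = manual_sweep(Phi, values=[10, 50, 100])
print(result["mean"])
```

Because the stand-in estimator is uncorrected, its mean is expected to keep rising with the subset size, which is the sample-size dependence the corrected estimators are designed to remove.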
⚠️ Important: do not pre-center
The bias corrections rely on an algebraic three-term centering structure built into the estimator formulas. Subtracting column means from Φ before passing it to the estimator introduces statistical dependencies between rows that break the bias correction. Pass the raw activation matrix directly.
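One way to see the dependency: after subtracting column means, every column of the centered matrix sums to zero, so any one row is fully determined by the others. The short NumPy check below illustrates this on random data. It demonstrates the dependence only and does not reproduce the estimator's internal centering.

```python
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.standard_normal((6, 20))                 # P=6 stimuli, Q=20 neurons
centered = Phi - Phi.mean(axis=0, keepdims=True)   # subtract column means

# Each column now sums to zero, so the rows are linearly dependent.
print(np.allclose(centered.sum(axis=0), 0.0))                        # True
# The last row equals minus the sum of all the other rows.
print(np.allclose(centered[-1], -centered[:-1].sum(axis=0)))         # True
# Centering removes one degree of freedom from the rows.
print(np.linalg.matrix_rank(Phi), np.linalg.matrix_rank(centered))   # 6 5
```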
🔧 API reference
participation_ratio(Phi, Phi2=None, *, return_all=False, return_parts=False)
Estimate the task dimensionality (PR of the centered covariance) of Φ.
- Phi: raw activation matrix, shape (P, Q); P ≥ 4, Q ≥ 2.
- Phi2: optional second trial for noise correction.
- return_all: if True, return dict with all four estimator variants.
- return_parts: if True, include numerator A and denominator B.
Returns a scalar (γ_both) by default, or a dict when either flag is set.
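A caller might want to check the documented shape constraints before invoking the estimator. `check_activation_shape` below is a hypothetical pre-flight helper, not a package function. It encodes only the constraints listed above (2-D input, P ≥ 4, Q ≥ 2), plus the assumption that a second trial must match the first in shape, inferred from the "same stimuli and neurons" requirement.

```python
import numpy as np

def check_activation_shape(Phi, Phi2=None):
    """Raise ValueError if inputs violate the documented shape constraints."""
    Phi = np.asarray(Phi)
    if Phi.ndim != 2:
        raise ValueError(f"Phi must be 2-D (P, Q), got ndim={Phi.ndim}")
    P, Q = Phi.shape
    if P < 4 or Q < 2:
        raise ValueError(f"need P >= 4 and Q >= 2, got P={P}, Q={Q}")
    # Assumption: a second trial covers the same stimuli and neurons.
    if Phi2 is not None and np.asarray(Phi2).shape != Phi.shape:
        raise ValueError("Phi2 must have the same shape as Phi")
    return P, Q

print(check_activation_shape(np.zeros((4, 2))))   # (4, 2)
```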
participation_ratio_finite(Phi, R, C, Phi2=None, *, return_naive=False, return_parts=False)
Estimate the PR of the full R×C matrix from the observed P×Q submatrix.
- R, C: number of rows/columns in the full underlying matrix; R ≥ P, C ≥ Q.
- return_naive: if True, also return the (uncorrected) naive estimate.
- return_parts: if True, include numerator A and denominator B.
Returns a scalar by default, or a dict when either flag is set.
sweep_dimensionality(Phi, axis='P', values=None, n_trials=20, Phi2=None, estimator='infinite', R=None, C=None, ...)
Run a subsampling sweep. Returns a dict with mean estimates and SEMs at each value. See docstring for full parameter list.
plot_sweep(result, ax=None, true_d=None, title=None, figsize=(5, 4))
Plot the output of sweep_dimensionality. Returns (fig, ax).
🗂️ Repository structure
```
src/
  dimensionality/
    __init__.py        # public API
    _core.py           # quartic einsum helper
    estimators.py      # participation_ratio
    finite.py          # participation_ratio_finite
    sweep.py           # sweep_dimensionality
    plot.py            # plot_sweep
tests/
  test_estimators.py
examples/
  demo.ipynb           # interactive walkthrough on synthetic data
  synthetic.py         # Figure 1 reproduction
```
📄 Citation
If you use this package, please cite:
```bibtex
@inproceedings{chun2026estimating,
  title     = {Estimating Dimensionality of Neural Representations from Finite Samples},
  author    = {Chun, Chanwoo and Canatar, Abdulkadir and Chung, SueYeon and Lee, Daniel},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
}
```
License
MIT