Persistent homology on Fisher information distances for probability manifolds
fisher-homology
A pure-Python, zero-dependency implementation of topological data analysis (TDA) designed specifically for probability trajectory analysis. Uses the Fisher information arc length as the filtration metric, which correctly expands distances at the tails of probability distributions — exactly where critical events in fraud detection, medical diagnostics, physics experiments, and financial risk live.
Why Fisher distances?
Standard topological data analysis uses Euclidean distance. For probability trajectories, this is the wrong metric.
Euclidean distance treats p=0.001 and p=0.002 the same as p=0.499 and
p=0.500 — both have |Δp| = 0.001. But informationally these are completely
different: the first pair are both rare events with a large relative difference
(the second is twice the first), while the second pair are near-average values
with negligible relative difference.
The Fisher arc length is the geodesic distance on the statistical manifold of
Bernoulli distributions, equipped with the Fisher information metric
g(p) = 1/(p(1−p)):
d_F(p, q) = 2 · |arcsin(√p) − arcsin(√q)|
Properties:
- Range [0, π]: the full manifold diameter is d_F(0, 1) = π (maximum separation)
- Symmetric and satisfies the triangle inequality (true metric)
- At p = 0.01, Fisher expands distances ~10× vs Euclidean
- Produces topologically meaningful features for probability trajectories
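The formula and the properties listed above can be checked numerically with the standard library alone. This is a minimal sketch rather than the package's implementation; `fisher_arc_sketch` is a hypothetical name used only for illustration.

```python
import math

def fisher_arc_sketch(p, q):
    """Fisher arc length between Bernoulli(p) and Bernoulli(q).

    Hypothetical helper for illustration; not the package's own function.
    """
    return 2.0 * abs(math.asin(math.sqrt(p)) - math.asin(math.sqrt(q)))

# Same Euclidean gap (|Δp| = 0.001), very different Fisher distances.
d_tail = fisher_arc_sketch(0.001, 0.002)
d_mid = fisher_arc_sketch(0.499, 0.500)
print(d_tail / d_mid)  # the tail pair is more than 10x farther apart

# Full manifold diameter d_F(0, 1).
print(fisher_arc_sketch(0.0, 1.0))

# Local expansion factor relative to Euclidean distance near p = 0.01,
# estimated with a small finite step h.
h = 1e-6
local_ratio = fisher_arc_sketch(0.01, 0.01 + h) / h
print(local_ratio)  # close to 1 / sqrt(0.01 * 0.99)
```

The finite-difference ratio approximates the local stretch factor 1/√(p(1−p)), which is where the "~10×" figure at p = 0.01 comes from.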
Installation
pip install fisher-homology
No dependencies required. Works on any Python 3.8+ installation.
Optional dependencies
pip install fisher-homology[numpy] # faster distance computation
pip install fisher-homology[plot] # matplotlib visualization helpers
pip install fisher-homology[dev] # pytest for running the test suite
From source
git clone https://github.com/williamrwilliamson/fisher-homology
cd fisher-homology
pip install -e .
Quick Start
from fisher_homology import FisherHomology
import numpy as np  # used only to generate the sample data; the package itself has no dependencies
# Three-phase trajectory: normal → stress → crisis
np.random.seed(42)
states = []
for t in range(15):
    if t < 5:
        p = [0.10 + 0.01*np.random.randn(), 0.12 + 0.01*np.random.randn()]
    elif t < 10:
        p = [0.45 + 0.03*np.random.randn(), 0.50 + 0.03*np.random.randn()]
    else:
        p = [0.85 + 0.02*np.random.randn(), 0.88 + 0.02*np.random.randn()]
    # Keep every probability strictly inside (0, 1)
    states.append([float(np.clip(x, 0.01, 0.99)) for x in p])
ph = FisherHomology()
result = ph.fit(states)
print(result.summary())
# Persistence Diagram (fisher metric)
# States: 15
# Max epsilon: 4.284
# β₀ features: 12
# Bottleneck width: 1.471
# Phase gaps at ε: ['1.249', '1.353']
# Estimated phases: 3
# Has cycles (β₁): False
Core API
FisherHomology
ph = FisherHomology(
n_steps=50, # filtration resolution
max_epsilon=None, # auto-determined from max pairwise distance
)
fit(states, metric='fisher') → PersistenceDiagram
Compute persistent homology of a probability trajectory.
# states: list of T probability vectors, each of length n
# All probabilities must be in (0, 1)
result = ph.fit(states)
compare_trajectories(states_a, states_b) → dict
Compare two trajectories using bottleneck distance.
comparison = ph.compare_trajectories(trajectory_1, trajectory_2)
print(comparison['bottleneck_b0']) # topological distance
print(comparison['interpretation']) # human-readable verdict
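To make the bottleneck distance concrete, here is a brute-force sketch for very small diagrams. It is not the package's algorithm, which is not shown in this README. The function name `bottleneck_bruteforce` is hypothetical, all deaths are assumed finite, and the running time is factorial in the diagram size, so it is only suitable for a handful of points.

```python
from itertools import permutations

def bottleneck_bruteforce(diag_a, diag_b):
    """Bottleneck distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) pairs with finite deaths.
    Hypothetical illustration: tries every matching, so keep inputs tiny.
    """
    def diagonal_projection(pt):
        mid = (pt[0] + pt[1]) / 2.0
        return (mid, mid)

    # Augment each side with diagonal projections of the other side so that
    # any point may be matched to the diagonal instead of to a real point.
    a_aug = [(p, False) for p in diag_a] + [(diagonal_projection(q), True) for q in diag_b]
    b_aug = [(q, False) for q in diag_b] + [(diagonal_projection(p), True) for p in diag_a]

    def cost(x, y):
        (px, x_on_diag), (py, y_on_diag) = x, y
        if x_on_diag and y_on_diag:
            return 0.0  # diagonal-to-diagonal matches are free
        return max(abs(px[0] - py[0]), abs(px[1] - py[1]))  # L-infinity norm

    n = len(a_aug)
    if n == 0:
        return 0.0
    # Minimise, over all perfect matchings, the largest matched cost.
    return min(
        max(cost(a_aug[i], b_aug[perm[i]]) for i in range(n))
        for perm in permutations(range(n))
    )

print(bottleneck_bruteforce([(0.0, 1.0)], [(0.0, 1.2)]))
print(bottleneck_bruteforce([(0.0, 1.0)], []))
```

In the first call the cheapest matching pairs the two points directly; in the second the only option is to match the point to the diagonal, which costs half its lifetime.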
rips_at_epsilon(states, epsilon) → dict
Snapshot of the Vietoris-Rips complex at a specific ε.
snapshot = ph.rips_at_epsilon(states, epsilon=0.5)
print(snapshot['beta_0'], snapshot['beta_1'])
print(snapshot['euler_characteristic'])
fit_transform(states, return_both_metrics=False) → dict
Compute Fisher and optionally Euclidean diagrams for comparison.
both = ph.fit_transform(states, return_both_metrics=True)
fisher_diag = both['fisher']
euclidean_diag = both['euclidean']
PersistenceDiagram
Result container returned by FisherHomology.fit().
| Attribute | Type | Description |
|---|---|---|
| persistence_b0 | list[(birth,death)] | β₀ (component) lifetime pairs |
| persistence_b1 | list[dict] | β₁ (loop) birth/death events |
| betti_curve | dict[ε→(β₀,β₁)] | Betti numbers at each scale |
| bottleneck_width | float | Max β₀ lifetime (signal strength) |
| phase_gaps | list[float] | ε values of phase transitions |
| max_epsilon | float | Filtration range used |
| n_states | int | Number of input states |
| metric | str | 'fisher' or 'euclidean' |
result.n_phases() # estimated number of distinct regimes
result.has_cycles() # True if trajectory contains loops (trapped states)
result.summary() # human-readable summary string
Distance functions
from fisher_homology import fisher_arc, fisher_distance_matrix
from fisher_homology.distances import fisher_arc_position, fisher_gradient
# Scalar arc distance
d = fisher_arc(0.1, 0.9) # float in [0, π]
# Arc position (maps probability to manifold position)
pos = fisher_arc_position(0.5) # = π/2
# Gradient of arc position w.r.t. p (the square root of the Fisher information)
info = fisher_gradient(0.3) # = 1/√(0.3 × 0.7)
# Full pairwise distance matrix
states = [[0.1, 0.2], [0.5, 0.6], [0.9, 0.8]]
D = fisher_distance_matrix(states) # 3×3 symmetric matrix
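# Tail expansion: how much Fisher expands vs Euclidean at p=0.01
# (import location assumed to be fisher_homology.distances, alongside the helpers above)
from fisher_homology.distances import tail_expansion_ratio
ratio = tail_expansion_ratio(0.01) # ≈ 10.0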
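For readers who want to see what a pairwise Fisher distance matrix involves, the sketch below builds one in plain Python. How the package combines the coordinates of a multi-dimensional state is not stated in this README; the sketch assumes the product-manifold rule (root sum of squares of the per-coordinate arc lengths), and both function names are hypothetical.

```python
import math

def fisher_arc_sketch(p, q):
    """Per-coordinate Fisher arc length (hypothetical helper)."""
    return 2.0 * abs(math.asin(math.sqrt(p)) - math.asin(math.sqrt(q)))

def distance_matrix_sketch(states):
    """Pairwise distances between probability vectors.

    Assumption: coordinates are combined as on a product manifold, i.e.
    the root sum of squares of the per-coordinate arc lengths. The
    package may use a different rule.
    """
    n = len(states)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            squared = sum(
                fisher_arc_sketch(a, b) ** 2
                for a, b in zip(states[i], states[j])
            )
            D[i][j] = D[j][i] = math.sqrt(squared)  # fill both triangles
    return D

states = [[0.1, 0.2], [0.5, 0.6], [0.9, 0.8]]
D = distance_matrix_sketch(states)
for row in D:
    print([round(x, 3) for x in row])
```

Only the upper triangle is computed; symmetry fills in the rest, and the diagonal stays at zero.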
Topology functions
from fisher_homology.topology import (
b0_persistence,
betti_curve,
persistence_diagram,
bottleneck_distance,
vietoris_rips_betti,
UnionFind,
)
# β₀ persistence from a distance matrix
pairs = b0_persistence(D, max_epsilon=5.0)
# Betti curve: (β₀, β₁) at each filtration step
curve = betti_curve(D, n_steps=50)
# Full persistence diagram
diag = persistence_diagram(D, n_steps=50)
# Bottleneck distance between two diagrams
# (diag_a and diag_b are results of persistence_diagram for two trajectories)
dist = bottleneck_distance(diag_a['persistence_b0'],
diag_b['persistence_b0'])
# Vietoris-Rips complex at one ε
rips = vietoris_rips_betti(D, epsilon=1.0)
Utils
from fisher_homology.utils import (
validate_state_sequence,
normalize_states,
trajectory_summary,
)
# Validate and clip probability vectors
clean = validate_state_sequence(raw_states)
# Normalize to (0, 1)
normed = normalize_states(states, method='clip') # or 'scale'
# Descriptive statistics
stats = trajectory_summary(states)
# {'n_states': 15, 'n_dims': 2, 'mean_probs': [...],
# 'std_probs': [...], 'trajectory_length': 4.28}
Interpretation Guide
Phase transitions
A phase gap in the persistence diagram marks an ε value where a large connected-component merge occurs: two topologically distinct regimes that were previously separate become reachable from each other.
Large gap → significant phase transition
Small gap → gradual drift, no sharp regime change
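One way to picture phase-gap detection is to look for β₀ death values that stand far above the typical merge scale. The rule below (a death larger than `factor` times the median death) is purely an illustrative assumption; the package's actual criterion is not documented here, and `phase_gaps_sketch` is a hypothetical name.

```python
from statistics import median

def phase_gaps_sketch(pairs, factor=5.0):
    """Return β₀ death values that dwarf the typical merge scale.

    Hypothetical heuristic: flag deaths larger than `factor` times the
    median finite death. Not the package's documented rule.
    """
    deaths = sorted(d for _, d in pairs if d != float("inf"))
    if not deaths:
        return []
    typical = median(deaths)
    return [d for d in deaths if d > factor * typical]

# Toy β₀ pairs: six small within-cluster merges, two large between-cluster merges.
pairs = [(0.0, 0.04), (0.0, 0.05), (0.0, 0.05), (0.0, 0.06),
         (0.0, 0.07), (0.0, 0.08), (0.0, 1.25), (0.0, 1.35)]
gaps = phase_gaps_sketch(pairs)
print(gaps)           # the two large merges
print(len(gaps) + 1)  # estimated number of phases under this heuristic
```

Counting phases as "gaps plus one" mirrors the Quick Start output, where two phase gaps accompany three estimated phases.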
Cyclic trapping
A β₁ feature (loop) indicates the trajectory returned to a previously visited region of probability space without escaping. In protein folding this is a misfolding intermediate. In fraud detection it is a network that almost cascades but recovers. In clinical monitoring it is a patient oscillating between two states.
Bottleneck width interpretation
Bottleneck width as a fraction of max_ε:
< 0.05 → all states are in one continuous cloud
0.05–0.15 → weak phase structure
0.15–0.40 → moderate phase separation
> 0.40 → strong, well-separated phases
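The thresholds above translate directly into a small lookup. The sketch assumes every threshold is a fraction of max_ε (only the first row says so explicitly), and `interpret_bottleneck_width` is a hypothetical helper rather than part of the package.

```python
def interpret_bottleneck_width(width, max_epsilon):
    """Map a bottleneck width to the qualitative labels listed above.

    Assumes all thresholds are fractions of max_epsilon (> 0).
    Boundary values are assigned to the lower band, except 0.05 and 0.15,
    which start the next band; this tie-breaking is a choice made here.
    """
    ratio = width / max_epsilon
    if ratio < 0.05:
        return "single continuous cloud"
    if ratio < 0.15:
        return "weak phase structure"
    if ratio <= 0.40:
        return "moderate phase separation"
    return "strong, well-separated phases"

# Values from the Quick Start summary: width 1.471, max epsilon 4.284.
print(interpret_bottleneck_width(1.471, 4.284))
```

With the Quick Start numbers the ratio is about 0.34, which falls in the moderate band.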
Fisher vs Euclidean comparison
both = ph.fit_transform(states, return_both_metrics=True)
fisher_phases = both['fisher'].n_phases()
euclidean_phases = both['euclidean'].n_phases()
if fisher_phases > euclidean_phases:
    print("Fisher detects additional phase structure at the tails.")
    print("Tail events are driving regime separation.")
Applications
The Fisher metric is particularly valuable for probability trajectories where:
- Rare events matter: fraud detection (p ≈ 0.001), medical diagnosis, gravitational wave detection (p ≈ 0.9999)
- Phase transitions are critical: protein folding intermediates, market regime shifts, clinical state changes
- Cycle detection is needed: misfolding loops, oscillating fraud networks, treatment resistance patterns
Running the Tests
# Install with dev dependencies
pip install fisher-homology[dev]
# Run tests
pytest tests/ -v
# Or directly
python tests/test_fisher_homology.py
All 44 tests are non-tautological — each verifies a mathematically provable property against analytical ground truth.
Mathematical Background
Fisher Information Metric
For a Bernoulli distribution parameterised by p, the Fisher information is:
I(p) = 1 / (p(1-p))
The geodesic distance in this Riemannian metric is the Hellinger arc length:
d_F(p, q) = 2 · |arcsin(√p) − arcsin(√q)|
This equals twice the angle between the square-root-transformed probability vectors (√p, √(1−p)) and (√q, √(1−q)) on the unit sphere, which gives it a natural geometric interpretation.
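The closed form follows from integrating the line element of the metric. A short derivation, using the substitution t = sin²θ with θ ∈ [0, π/2]:

```latex
\begin{aligned}
ds^2 &= I(p)\,dp^2 = \frac{dp^2}{p(1-p)}, \\
d_F(p, q) &= \left| \int_p^q \frac{dt}{\sqrt{t(1-t)}} \right|,
  \qquad t = \sin^2\theta,\quad dt = 2\sin\theta\cos\theta\,d\theta,\quad
  \sqrt{t(1-t)} = \sin\theta\cos\theta, \\
&= \left| \int_{\arcsin\sqrt{p}}^{\arcsin\sqrt{q}} 2\,d\theta \right|
 = 2\,\bigl|\arcsin\sqrt{p} - \arcsin\sqrt{q}\bigr|.
\end{aligned}
```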
Vietoris-Rips Filtration
Given T states with pairwise Fisher distances D[i,j], the Vietoris-Rips
complex VR_ε includes all simplices whose diameter is at most ε:
- ε = 0: T isolated points, β₀ = T, β₁ = 0
- As ε grows: components merge (β₀ decreases), loops form and fill (β₁ varies)
- ε = ∞: fully connected, β₀ = 1, β₁ = 0
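The three regimes above can be reproduced on a toy example. The sketch below computes β₀ and β₁ of a Vietoris-Rips complex at a single ε from a distance matrix, using boundary-matrix ranks over GF(2). It is an illustration under that choice of coefficients, not the package's `vietoris_rips_betti`; the names `gf2_rank` and `rips_betti_sketch` are hypothetical.

```python
import math
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of rows encoded as integer bitmasks."""
    basis = {}  # leading-bit position -> reduced row
    for row in rows:
        cur = row
        while cur:
            lead = cur.bit_length() - 1
            if lead not in basis:
                basis[lead] = cur
                break
            cur ^= basis[lead]  # cancel the leading bit and keep reducing
    return len(basis)

def rips_betti_sketch(D, epsilon):
    """(β₀, β₁) of the Vietoris-Rips complex at scale epsilon (GF(2) coefficients)."""
    n = len(D)
    edges = [(i, j) for i, j in combinations(range(n), 2) if D[i][j] <= epsilon]
    idx = {e: k for k, e in enumerate(edges)}
    # A triangle is present when all three of its edges are present.
    triangles = [t for t in combinations(range(n), 3)
                 if all(pair in idx for pair in combinations(t, 2))]
    # Boundary of an edge: its two endpoints (bitmask over vertices).
    rank_d1 = gf2_rank([(1 << i) | (1 << j) for i, j in edges])
    # Boundary of a triangle: its three edges (bitmask over edges).
    rank_d2 = gf2_rank([sum(1 << idx[pair] for pair in combinations(t, 2))
                        for t in triangles])
    beta_0 = n - rank_d1
    beta_1 = len(edges) - rank_d1 - rank_d2
    return beta_0, beta_1

# Toy distance matrix: four corners of a unit square (sides 1, diagonals √2).
r2 = math.sqrt(2)
D = [[0, 1, r2, 1],
     [1, 0, 1, r2],
     [r2, 1, 0, 1],
     [1, r2, 1, 0]]
for eps in (0.5, 1.0, 1.5):
    print(eps, rips_betti_sketch(D, eps))
```

At ε = 0.5 the four points are isolated, at ε = 1.0 the sides form a single loop, and at ε = 1.5 the diagonals and triangles fill the loop in.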
Persistent Homology
Features are tracked as (birth_ε, death_ε) pairs. Long-lived features
(large death − birth) are robust signal. Short-lived features are noise.
The stability theorem (Cohen-Steiner et al. 2007) guarantees that small
perturbations of the input produce small changes in the output: if every
pairwise Fisher distance changes by at most δ, the persistence diagram
changes by at most δ in bottleneck distance.
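For β₀, the (birth, death) pairs can be read off from a single pass over the sorted edges with a union-find structure, which is the same idea as Kruskal's minimum spanning tree algorithm. The sketch below is a generic illustration with a hypothetical name (`b0_persistence_sketch`); whether the package reports the never-dying component, or filters short-lived pairs, is not specified in this README.

```python
def b0_persistence_sketch(D):
    """β₀ (birth, death) pairs from a symmetric distance matrix.

    Every component is born at ε = 0 and dies at the length of the edge
    that merges it into another component. One component never dies.
    """
    n = len(D)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted((D[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge merges two components
            parent[root_i] = root_j
            pairs.append((0.0, length))
    if n:
        pairs.append((0.0, float("inf")))  # the surviving component
    return pairs

# Toy input: five points on a line forming three groups.
points = [0, 1, 10, 11, 30]
D = [[abs(a - b) for b in points] for a in points]
pairs = b0_persistence_sketch(D)
print(pairs)

# Longest finite lifetime: the most robust separation in the data.
print(max(d - b for b, d in pairs if d != float("inf")))
```

The two short pairs are within-group merges; the two long ones mark the scales at which the groups join, and the longest finite lifetime corresponds to what the attribute table calls the bottleneck width.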
References
- Edelsbrunner, H. & Harer, J. (2010). Computational Topology: An Introduction. AMS.
- Rao, C. R. (1945). Information and the accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. 37, 81–91.
- Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 10(4), 507–521.
- Cohen-Steiner, D., Edelsbrunner, H. & Harer, J. (2007). Stability of persistence diagrams. Discrete & Computational Geometry 37(1), 103–120.
- Chazal, F. & Michel, B. (2021). An introduction to topological data analysis. Frontiers in Artificial Intelligence 4.
License
MIT License — see LICENSE.