PHATE Manifold Metrics
Manifold-aware semantic and relational affinity metrics using PHATE.
Overview
Compute Semantic Affinity (SA) and Relational Affinity (RA) metrics that leverage manifold geometry to capture non-Euclidean structure in embedding spaces.
Key Features
- ✨ Multi-scale Analysis: Compare metrics at t=1 (baseline) vs t=5-6 (manifold)
- ✨ Multiple RA Variants: Euclidean, Geodesic, Diffusion
- ✨ Clustering-free SA: Distribution-based (no labels required)
- ✨ Analogy Support: Specialized 4-word analogy methods
- ✨ Optional Loaders: FastText, LaBSE, Ollama, OpenRouter
- ✨ Dataset Utilities: CSV loading and parsing
Installation
# Core metrics only
pip install phate-manifold-metrics
# With embedding loaders (quote the extras so shells such as zsh do not expand the brackets)
pip install "phate-manifold-metrics[embeddings]"
# Development (includes pytest, black, mypy)
pip install "phate-manifold-metrics[all]"
Quick Start
import numpy as np
from phate_manifold_metrics import PhateManifoldMetrics
# Load/generate embeddings
embeddings = np.random.randn(100, 384)
# Initialize & fit
metrics = PhateManifoldMetrics(knn=5, t=6)
metrics.fit(embeddings)
# Define word pairs
pairs = [(0,1), (2,3), (4,5)]
# Compute SA
sa = metrics.compute_semantic_affinity(pairs)
print(f"SA: {sa['sa_score']:.3f}")
# Compute RA variants
ra_euc = metrics.compute_relational_affinity_euc(pairs)
ra_geo = metrics.compute_relational_affinity_geo(pairs)
ra_dif = metrics.compute_relational_affinity_dif(pairs)
print(f"RA_euc: {ra_euc['ra_euc_score']:.3f}")
print(f"RA_geo: {ra_geo['ra_geo_score']:.3f}")
print(f"RA_dif: {ra_dif['ra_dif_score']:.3f}")
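The multi-scale comparison listed under Key Features (t=1 as baseline versus t=5-6 for the manifold) can be sketched as a small loop over diffusion times. The helper below is hypothetical and not part of the package: `compare_scales` and its `make_metrics` factory argument are names introduced here for illustration, and the sketch relies only on the `fit` and `compute_semantic_affinity` calls and the `sa_score` key shown in the Quick Start.

```python
def compare_scales(make_metrics, embeddings, pairs, ts=(1, 6)):
    """Return {t: SA score} for each diffusion time in ts.

    make_metrics is a hypothetical factory: it takes t and returns an
    object exposing fit() and compute_semantic_affinity(), as in Quick Start.
    """
    scores = {}
    for t in ts:
        metrics = make_metrics(t)      # fresh instance per diffusion time
        metrics.fit(embeddings)
        scores[t] = metrics.compute_semantic_affinity(pairs)["sa_score"]
    return scores
```

With the package installed, a call would presumably look like `compare_scales(lambda t: PhateManifoldMetrics(knn=5, t=t), embeddings, pairs)`. Passing a factory keeps each scale's fit independent of the others.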
CLI Usage
# Basic test
phate-metrics --knn 5 --t 6
# Dual-scale analysis
phate-metrics --dual-scale
# Euclidean metric
phate-metrics --metric euclidean
Metrics Explained
Semantic Affinity (SA)
Clustering quality in manifold space:
SA = 1 / (1 + CV)
where CV = std(distances) / mean(distances)
- Range: (0, 1], higher = better clustering; SA = 1 when all distances are equal (CV = 0)
- No labels required
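As a worked illustration of the formula above, here is a minimal NumPy sketch that turns a list of distances into an SA value. Which distances the package feeds into the formula, and whether it uses the population or sample standard deviation, is not stated here; the function name `semantic_affinity` and the use of the population standard deviation (NumPy's default) are assumptions.

```python
import numpy as np

def semantic_affinity(distances):
    """SA = 1 / (1 + CV), with CV = std(distances) / mean(distances)."""
    d = np.asarray(distances, dtype=float)
    mean = d.mean()
    if mean == 0:
        raise ValueError("mean distance is zero; CV is undefined")
    cv = d.std() / mean            # population std (ddof=0), an assumption
    return 1.0 / (1.0 + cv)

uniform = semantic_affinity([2.0, 2.0, 2.0, 2.0])  # CV = 0, so SA = 1
spread = semantic_affinity([1.0, 3.0])             # mean 2, std 1, CV = 0.5
```

Identical distances give the maximum score of 1, and the score falls toward 0 as the distances become more dispersed relative to their mean.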
Relational Affinity (RA)
Directional alignment of relational vectors:
Statistical RA (word pairs):
- RA_euc: Euclidean (flat space baseline)
- RA_geo: Geodesic (k-NN graph shortest paths)
- RA_dif: Diffusion (PHATE manifold)
- Range: [-1, 1], higher = stronger alignment
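To make "directional alignment of relational vectors" concrete, the sketch below computes one plausible Euclidean variant: the mean pairwise cosine similarity between the offset vectors of each word pair. This is an assumption chosen because it matches the stated [-1, 1] range; the package's exact definition may differ, and the function name `relational_alignment` is hypothetical.

```python
import numpy as np

def relational_alignment(embeddings, pairs):
    """Mean pairwise cosine similarity of offsets X[j] - X[i] over all pairs.

    Needs at least two pairs; a pair with identical endpoints has a
    zero-length offset and would produce NaN.
    """
    X = np.asarray(embeddings, dtype=float)
    rel = np.array([X[j] - X[i] for i, j in pairs])
    rel = rel / np.linalg.norm(rel, axis=1, keepdims=True)  # unit offsets
    cos = rel @ rel.T                                       # all cosines
    upper = np.triu_indices(len(pairs), k=1)                # each pair once
    return float(cos[upper].mean())

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 2], [3, 2]], dtype=float)
aligned = relational_alignment(X, [(0, 1), (2, 3), (4, 5)])  # same direction
opposed = relational_alignment(X, [(0, 1), (3, 2)])          # reversed offset
```

Offsets that all point the same way score 1, while offsets pointing in opposite directions score -1.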
Analogy RA (4-word test cases a:b::c:d):
- RA_euc_analogy: Euclidean parallelogram
- RA_geo_analogy: Geodesic parallelogram
- Range: [0, 1], higher = stronger analogy
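The Euclidean parallelogram test for an analogy a:b::c:d asks whether the offset b - a is parallel to the offset d - c. The sketch below scores this with the cosine of the two offsets, rescaled to [0, 1]. The rescaling (1 + cos) / 2 is an assumption made only to match the stated range, and `analogy_score` is a hypothetical name; the package's own formula is not documented here.

```python
import numpy as np

def analogy_score(a, b, c, d):
    """Parallelogram test for a:b::c:d, rescaled to [0, 1].

    Assumption: score = (1 + cos(b - a, d - c)) / 2.
    """
    a, b, c, d = (np.asarray(v, dtype=float) for v in (a, b, c, d))
    u, v = b - a, d - c                      # the two relation offsets
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return float((1.0 + cos) / 2.0)

perfect = analogy_score([0, 0], [1, 0], [0, 1], [1, 1])     # parallel offsets
orthogonal = analogy_score([0, 0], [1, 0], [0, 1], [0, 2])  # perpendicular
```

Under this assumed rescaling, parallel offsets score 1, perpendicular offsets score 0.5, and opposite offsets score 0.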
Parameters
| Parameter | Description | Recommendation |
|---|---|---|
| knn | k-nearest neighbors | 5-10 (start with 5) |
| t | Diffusion time | 1 (baseline), 6 (manifold) |
| metric | Distance metric | 'cosine' (normalized), 'euclidean' |
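To give some intuition for the diffusion time t, the sketch below builds a simplified diffusion operator and raises it to the power t. It is deliberately not PHATE's actual kernel (PHATE uses an adaptive, k-NN based kernel): the fixed-bandwidth Gaussian kernel, the `sigma` parameter, and the function name `diffusion_operator` are all assumptions for illustration.

```python
import numpy as np

def diffusion_operator(X, t, sigma=1.0):
    """Row-stochastic Gaussian-kernel Markov matrix raised to the power t."""
    X = np.asarray(X, dtype=float)
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq_dists / (2.0 * sigma**2))   # fixed-bandwidth affinities
    P = K / K.sum(axis=1, keepdims=True)       # each row sums to 1
    return np.linalg.matrix_power(P, t)        # t-step transition probabilities

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
P1 = diffusion_operator(X, t=1)   # one step: local neighbourhood structure
P6 = diffusion_operator(X, t=6)   # six steps: smoother, more global structure

# Per-column spread of transition probabilities; it cannot grow with t.
spread1 = (P1.max(axis=0) - P1.min(axis=0)).max()
spread6 = (P6.max(axis=0) - P6.min(axis=0)).max()
```

Each extra step averages transition probabilities over neighbours, so larger t smooths out local noise. This is consistent with the table's use of t=1 as a near-Euclidean baseline and t=6 as the manifold scale.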
Optional: Embedding Loaders
FastText
from phate_manifold_metrics.embeddings import load_fasttext_from_extracted
embeddings = load_fasttext_from_extracted(["cat", "dog"], lang='en')
LaBSE
from phate_manifold_metrics.embeddings import load_labse_embeddings
embeddings = load_labse_embeddings(["hello", "你好", "hola"])
Ollama
from phate_manifold_metrics.embeddings.ollama import get_ollama_embeddings_fixed
embeddings = get_ollama_embeddings_fixed(
["cat", "dog"],
model_name="snowflake-arctic-embed2"
)
OpenRouter API
import os
from phate_manifold_metrics.embeddings.openrouter import load_openrouter_embeddings
os.environ['OPENROUTER_API_KEY'] = 'your-key'
embeddings = load_openrouter_embeddings(
["hello", "world"],
model_path="qwen/qwen3-embedding-8b",
model_name="Qwen3-8B"
)
Documentation
Full API documentation is available in the docstrings:
from phate_manifold_metrics import PhateManifoldMetrics
help(PhateManifoldMetrics)
Citation
@software{phate_manifold_metrics,
title = {PHATE Manifold Metrics},
author = {Digital Duck},
year = {2026},
url = {https://github.com/digital-duck/phate-manifold-metrics}
}
References
- PHATE: Moon et al., Nature Biotechnology 2019
- Diffusion Distance: Coifman & Lafon, Applied and Computational Harmonic Analysis 2006
License
MIT License - Copyright (c) 2026 Digital Duck
Authors
Digital Duck (Wen + Claude Sonnet 4.5 + Google Gemini 2.5)