Riemannian Comparison Manifold Sort with GTC — exact + approximate sorting with outlier detection
Project description
HyperSort: Riemannian Comparison Manifold Sort with GTC
Exact sorting + approximate manifold sort + outlier detection.
HyperSort is a Python package that uses the HyperTensor Geometric Jury framework to provide two sort modes and per-element confidence scoring via a Riemannian Comparison Manifold with Geodesic Trajectory Cache (GTC).
Quick Start
from hypersort import hypersort, ComparisonManifold
# One-shot exact sort
result = hypersort([3.14, 1.41, 2.71, 1.73, 0.57])
print(result.sorted_data) # [0.57, 1.41, 1.73, 2.71, 3.14]
# Build once, sort many (exact)
manifold = ComparisonManifold()
manifold.build(training_data) # O(n²), one-time
result = manifold.sort(new_data) # exact, always correct
result = manifold.sort_manifold(new_data) # GTC approximate + verdicts
verdicts = manifold.classify(data) # outlier detection only
Two Sort Modes
| Method | Ordering | Speed | Use case |
|---|---|---|---|
sort() |
Exact (encoder dim 0) | O(n log n) | Correctness matters |
sort_manifold() |
GTC approximate | O(n·k + n log n) | Speed + confidence + outlier detection |
Outlier Detection
classify() and sort_manifold() assign each element a verdict tier:
| Tier | Meaning |
|---|---|
RETRIEVE |
Close to cached trajectory — high confidence |
AUGMENT |
Nearby — needs small correction |
EXPAND |
Far but within manifold — low confidence |
EXPLORE |
Out-of-distribution — possible anomaly |
Statistical z-score check (|z| > 3 on sort key) automatically upgrades to EXPLORE.
manifold.build(training_data)
verdicts = manifold.classify(mixed_data)
for v in verdicts:
if v.verdict_type == 'EXPLORE':
print(f"Outlier detected! d={v.geodesic_distance:.4f}")
Configuration
| Parameter | Default | Description |
|---|---|---|
intrinsic_dim |
0 (auto) | Manifold dimension k. 0 = auto-tune via 95% explained variance |
variance_ratio |
0.95 | Explained variance to retain when auto-tuning k |
num_jurors |
7 | Trajectories consulted per query |
cache_threshold |
0.05 | RETRIEVE tier geodesic distance threshold |
use_gpu |
False | Enable CuPy GPU acceleration (pip install cupy) |
API Reference
One-shot
from hypersort import hypersort
result = hypersort(data) # exact sort
result = hypersort(data, mode='manifold') # exact + verdicts + confidence
result = hypersort(data, compute_pairwise=False) # skip O(n²) pairwise matrix
Reusable Workflow
from hypersort import ComparisonManifold, ManifoldConfig
manifold = ComparisonManifold() # auto-tune k
# or: ManifoldConfig(intrinsic_dim=16, use_gpu=True)
manifold.build(training_data) # O(n²), one-time
# Exact sorting
result = manifold.sort(new_data) # always correct
result = manifold.sort(new_data, compute_pairwise=True) # + jury confidence
# GTC approximate sorting
result = manifold.sort_manifold(new_data) # approximate positions + verdicts
# Outlier detection
verdicts = manifold.classify(data) # batch
verdict = manifold.classify_one(element) # single element (streaming)
Persistence
manifold.save('manifold.npz')
loaded = ComparisonManifold.load('manifold.npz')
Verdict Tiers
from hypersort import VerdictTier
for v in result.verdicts:
if v.verdict_type == VerdictTier.EXPLORE:
print(f"Outlier: d={v.geodesic_distance:.4f}, conf={v.confidence:.4f}")
Honest Complexity
| Phase | Complexity |
|---|---|
| Build | O(n·d·k + n²) — SVD + GTC construction (one-time) |
sort() |
O(n log n + n·d) — exact argsort |
sort_manifold() |
O(n·k + n log n) — GTC lookup + sort |
classify() |
O(n·k) — geodesic distances to cache |
classify_one() |
O(k) — single-element lookup |
How It Works
Build Phase (one-time, O(n²))
Data → [Encoder] → dim 0 = sort key, dims 1+ = manifold features
→ [SVD on dims 1+] → PCA manifold basis [d-1, k]
→ [Normalize] → Unit Sphere S^{k-1} (trajectory cache)
→ [Sort by dim 0] → GTC indexed by ground-truth positions
Sort Phase
sort(): exact ordering by encoder dim 0 via argsortsort_manifold(): project new data onto manifold → query GTC for nearest cached trajectories → approximate positions + verdict tiersclassify(): project → measure geodesic distance to GTC cache → assign RETRIEVE/AUGMENT/EXPAND/EXPLORE based on proximity and statistical z-score (|z| > 3 → EXPLORE)
Limitations
- String encoder: first 7 chars for positional encoding (exact via Python ints)
- GTC sort is approximate for out-of-distribution data
- Manifold build is O(n²); best with
build()once,sort_manifold()many - GPU acceleration requires CuPy (
pip install cupy)
Citation
@software{stewart2026hypersort,
title = {{HyperSort}: {Riemannian} Comparison Manifold Sort},
author = {Stewart, William Ken Ohara},
year = {2026},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/NagusameCS/HyperTensor/tree/main/hypersort}
}
License
MIT — see LICENSE file.
Related Work
- HyperTensor — The parent project
- Papers I–XVIII in
ARXIV_SUBMISSIONS/— Full theoretical foundation volume_extended.tex— Master compilation of all papers
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file hypersort-0.2.0.tar.gz.
File metadata
- Download URL: hypersort-0.2.0.tar.gz
- Upload date:
- Size: 25.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
05609f96c7a276dd4341c4c595aad724b0ce87b15b3da9fc3594d8e214166f6f
|
|
| MD5 |
21dd2c1acb3cfb681b1900d525002258
|
|
| BLAKE2b-256 |
9a1243dc986caad3b959b438791e6893f2f0859f7df8dd45c3fc30edece4367d
|
File details
Details for the file hypersort-0.2.0-py3-none-any.whl.
File metadata
- Download URL: hypersort-0.2.0-py3-none-any.whl
- Upload date:
- Size: 25.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
26e1c326c849243ce086e2808bb4dc572bf5e2dadc447b79f981ebdec77cdbba
|
|
| MD5 |
ef1ad2367ef9d748e23a391429bb53a1
|
|
| BLAKE2b-256 |
319f286fe7a2e615de6c1ed9e7f90585df19209c19edc75616556d8fb8ed99a6
|