Skip to main content

Riemannian Comparison Manifold Sort with GTC — exact + approximate sorting with outlier detection

Project description

HyperSort: Riemannian Comparison Manifold Sort with GTC

Exact sorting + approximate manifold sort + outlier detection.

HyperSort is a Python package that uses the HyperTensor Geometric Jury framework to provide two sort modes and per-element confidence scoring via a Riemannian Comparison Manifold with Geodesic Trajectory Cache (GTC).

Quick Start

from hypersort import hypersort, ComparisonManifold

# One-shot exact sort
result = hypersort([3.14, 1.41, 2.71, 1.73, 0.57])
print(result.sorted_data)  # [0.57, 1.41, 1.73, 2.71, 3.14]

# Build once, sort many (exact)
manifold = ComparisonManifold()
manifold.build(training_data)  # O(n²), one-time

result = manifold.sort(new_data)         # exact, always correct
result = manifold.sort_manifold(new_data)  # GTC approximate + verdicts
verdicts = manifold.classify(data)        # outlier detection only

Two Sort Modes

Method Ordering Speed Use case
sort() Exact (encoder dim 0) O(n log n) Correctness matters
sort_manifold() GTC approximate O(n·k + n log n) Speed + confidence + outlier detection

Outlier Detection

classify() and sort_manifold() assign each element a verdict tier:

Tier Meaning
RETRIEVE Close to cached trajectory — high confidence
AUGMENT Nearby — needs small correction
EXPAND Far but within manifold — low confidence
EXPLORE Out-of-distribution — possible anomaly

Statistical z-score check (|z| > 3 on sort key) automatically upgrades to EXPLORE.

manifold.build(training_data)
verdicts = manifold.classify(mixed_data)
for v in verdicts:
    if v.verdict_type == 'EXPLORE':
        print(f"Outlier detected! d={v.geodesic_distance:.4f}")

Configuration

Parameter Default Description
intrinsic_dim 0 (auto) Manifold dimension k. 0 = auto-tune via 95% explained variance
variance_ratio 0.95 Explained variance to retain when auto-tuning k
num_jurors 7 Trajectories consulted per query
cache_threshold 0.05 RETRIEVE tier geodesic distance threshold
use_gpu False Enable CuPy GPU acceleration (pip install cupy)

API Reference

One-shot

from hypersort import hypersort

result = hypersort(data)                      # exact sort
result = hypersort(data, mode='manifold')     # exact + verdicts + confidence
result = hypersort(data, compute_pairwise=False)  # skip O(n²) pairwise matrix

Reusable Workflow

from hypersort import ComparisonManifold, ManifoldConfig

manifold = ComparisonManifold()               # auto-tune k
# or: ManifoldConfig(intrinsic_dim=16, use_gpu=True)
manifold.build(training_data)                 # O(n²), one-time

# Exact sorting
result = manifold.sort(new_data)              # always correct
result = manifold.sort(new_data, compute_pairwise=True)  # + jury confidence

# GTC approximate sorting
result = manifold.sort_manifold(new_data)     # approximate positions + verdicts

# Outlier detection
verdicts = manifold.classify(data)            # batch
verdict  = manifold.classify_one(element)     # single element (streaming)

Persistence

manifold.save('manifold.npz')
loaded = ComparisonManifold.load('manifold.npz')

Verdict Tiers

from hypersort import VerdictTier

for v in result.verdicts:
    if v.verdict_type == VerdictTier.EXPLORE:
        print(f"Outlier: d={v.geodesic_distance:.4f}, conf={v.confidence:.4f}")

Honest Complexity

Phase Complexity
Build O(n·d·k + n²) — SVD + GTC construction (one-time)
sort() O(n log n + n·d) — exact argsort
sort_manifold() O(n·k + n log n) — GTC lookup + sort
classify() O(n·k) — geodesic distances to cache
classify_one() O(k) — single-element lookup

How It Works

Build Phase (one-time, O(n²))

Data → [Encoder] → dim 0 = sort key, dims 1+ = manifold features
     → [SVD on dims 1+] → PCA manifold basis [d-1, k]
     → [Normalize] → Unit Sphere S^{k-1} (trajectory cache)
     → [Sort by dim 0] → GTC indexed by ground-truth positions

Sort Phase

  • sort(): exact ordering by encoder dim 0 via argsort
  • sort_manifold(): project new data onto manifold → query GTC for nearest cached trajectories → approximate positions + verdict tiers
  • classify(): project → measure geodesic distance to GTC cache → assign RETRIEVE/AUGMENT/EXPAND/EXPLORE based on proximity and statistical z-score (|z| > 3 → EXPLORE)

Limitations

  • String encoder: first 7 chars for positional encoding (exact via Python ints)
  • GTC sort is approximate for out-of-distribution data
  • Manifold build is O(n²); best with build() once, sort_manifold() many
  • GPU acceleration requires CuPy (pip install cupy)

Citation

@software{stewart2026hypersort,
  title     = {{HyperSort}: {Riemannian} Comparison Manifold Sort},
  author    = {Stewart, William Ken Ohara},
  year      = {2026},
  publisher = {GitHub},
  journal   = {GitHub repository},
  url       = {https://github.com/NagusameCS/HyperTensor/tree/main/hypersort}
}

License

MIT — see LICENSE file.

Related Work

  • HyperTensor — The parent project
  • Papers I–XVIII in ARXIV_SUBMISSIONS/ — Full theoretical foundation
  • volume_extended.tex — Master compilation of all papers

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hypersort-0.2.0.tar.gz (25.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hypersort-0.2.0-py3-none-any.whl (25.8 kB view details)

Uploaded Python 3

File details

Details for the file hypersort-0.2.0.tar.gz.

File metadata

  • Download URL: hypersort-0.2.0.tar.gz
  • Upload date:
  • Size: 25.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for hypersort-0.2.0.tar.gz
Algorithm Hash digest
SHA256 05609f96c7a276dd4341c4c595aad724b0ce87b15b3da9fc3594d8e214166f6f
MD5 21dd2c1acb3cfb681b1900d525002258
BLAKE2b-256 9a1243dc986caad3b959b438791e6893f2f0859f7df8dd45c3fc30edece4367d

See more details on using hashes here.

File details

Details for the file hypersort-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: hypersort-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 25.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for hypersort-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 26e1c326c849243ce086e2808bb4dc572bf5e2dadc447b79f981ebdec77cdbba
MD5 ef1ad2367ef9d748e23a391429bb53a1
BLAKE2b-256 319f286fe7a2e615de6c1ed9e7f90585df19209c19edc75616556d8fb8ed99a6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page