Topological Entity Similarity Structure for Emergent Response Analysis

TESSERA

A domain-general framework for detecting and exploiting geometric structure in multi-dimensional feature spaces for predictive classification.

Targeting: Nature Computational Science

Core Hypothesis

When entities are characterized by multi-dimensional features and outcomes depend on relational position in feature space, similarity network topology carries predictive information that direct feature-based classifiers miss.

Three Failure Geometries

TESSERA detects geometrically distinct response regions in entity similarity space:

  1. Ecotone — responses cluster at boundaries between stable operational domains (analogous to ecological transition zones)
  2. Isolated cluster — responses occupy a distinct region, separate from main domains
  3. Diffuse — responses scattered within domains, driven by individual feature thresholds
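
The three geometries can be illustrated on a toy landscape. The sketch below is not the code in synthetic/landscape.py or synthetic/outcomes.py; the function names, the 2-D layout, and every numeric constant are assumptions chosen only to make each pattern visible.

```python
import numpy as np

def make_landscape(n_per_domain=200, seed=0):
    """Toy 2-D feature space: two stable domains plus one small separate cluster.

    All locations and scales are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(loc=[-1.5, 0.0], scale=0.7, size=(n_per_domain, 2))
    b = rng.normal(loc=[1.5, 0.0], scale=0.7, size=(n_per_domain, 2))
    c = rng.normal(loc=[0.0, 5.0], scale=0.3, size=(n_per_domain // 10, 2))
    X = np.vstack([a, b, c])
    domain = np.repeat([0, 1, 2], [len(a), len(b), len(c)])
    return X, domain

def assign_outcomes(X, domain, pattern):
    """Return a 0/1 response vector for one of the three geometries."""
    in_main = domain != 2
    if pattern == "ecotone":
        # Responses sit in the transition zone between domains 0 and 1 (x near 0).
        y = (np.abs(X[:, 0]) < 0.5) & in_main
    elif pattern == "isolated":
        # Responses occupy the separate cluster only.
        y = domain == 2
    elif pattern == "diffuse":
        # Responses follow a single-feature threshold inside the main domains.
        y = (X[:, 1] > 0.8) & in_main
    else:
        raise ValueError(f"unknown pattern: {pattern!r}")
    return y.astype(int)

X, domain = make_landscape()
y_ecotone = assign_outcomes(X, domain, "ecotone")
y_isolated = assign_outcomes(X, domain, "isolated")
y_diffuse = assign_outcomes(X, domain, "diffuse")
```

In the diffuse case a single feature threshold explains the responses, so a direct classifier should already do well; the ecotone and isolated cases are the ones where relational position matters.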

Three Demonstration Domains

          Ecology                    Manufacturing        HPC
Entity    Reef survey                Production run       Job
Features  Environmental conditions   Sensor measurements  Resource usage
Outcome   Bleaching                  Defect               Failure
Dataset   Global Coral Bleaching DB  UCI SECOM            Synthetic / NØMAD
N         ~35,000                    1,567                5,000+

Validation Protocol

  1. Phase 1 — Ablation: GNN vs. MLP, RF, XGBoost, LogReg (justify network)
  2. Phase 2 — Similarity: 7 measures compared (Simpson, Cosine, Bray-Curtis, Jaccard, Pearson, Euclidean, Mahalanobis)
  3. Phase 3 — Bin sensitivity: n_bins = 2, 3, 4, 5 (Simpson only)
  4. Phase 4 — Temporal: 5-fold temporal cross-validation
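
To make Phases 3 and 4 concrete, here is a minimal sketch of one way they could work. It assumes that Simpson similarity means the overlap coefficient |A ∩ B| / min(|A|, |B|) computed on quantile-binned features, and that temporal cross-validation means expanding-window splits. The README does not confirm either choice, and the function names are hypothetical rather than the API of core/similarity.py.

```python
import numpy as np

def bin_features(X, n_bins=3):
    """Discretize each column into n_bins quantile bins labelled 0..n_bins-1."""
    X = np.asarray(X, dtype=float)
    binned = np.empty(X.shape, dtype=int)
    interior = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]  # interior quantile levels
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], interior)
        binned[:, j] = np.searchsorted(edges, X[:, j], side="right")
    return binned

def simpson_similarity(binned):
    """Overlap coefficient between entities seen as sets of (feature, bin) items.

    Every entity has exactly one item per feature, so min(|A|, |B|) equals the
    number of features and the coefficient reduces to the fraction of features
    that fall in the same bin. Builds an (n, n, d) array, so small n only.
    """
    same = binned[:, None, :] == binned[None, :, :]
    return same.mean(axis=2)

def temporal_folds(timestamps, n_folds=5):
    """Yield (train_idx, test_idx) pairs; each test block follows its training data."""
    order = np.argsort(timestamps, kind="stable")
    blocks = np.array_split(order, n_folds + 1)
    for k in range(1, n_folds + 1):
        yield np.concatenate(blocks[:k]), blocks[k]

# Bin sensitivity: recompute the similarity matrix for each n_bins setting.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 4))
sims = {n: simpson_similarity(bin_features(X_demo, n)) for n in (2, 3, 4, 5)}
```

Coarser bins make more entity pairs look identical, which is why a sweep over n_bins is a natural sensitivity check for this measure.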

Project Structure

tessera/
├── __init__.py                    # Package init (v0.1.0)
├── core/
│   ├── __init__.py
│   ├── similarity.py              # 7 similarity measures
│   └── network.py                 # Graph construction (threshold, kNN, weighted)
├── synthetic/
│   ├── __init__.py
│   ├── landscape.py               # Feature space topology builder
│   ├── outcomes.py                # Failure pattern assignment (3 types + mixed)
│   └── visualize.py               # Diagnostic plots
├── data_acquisition.py            # Real dataset download & preprocessing
├── test_generator.py              # Synthetic data validation
└── test_pipeline.py               # End-to-end pipeline test
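
The three graph construction modes listed for core/network.py (threshold, kNN, weighted) could be sketched as follows, starting from a symmetric similarity matrix S. The function names, signatures, and the choice to symmetrize the kNN graph are assumptions made for illustration and may differ from the package.

```python
import numpy as np

def threshold_graph(S, tau):
    """Unweighted adjacency: connect i and j when S[i, j] >= tau."""
    A = (np.asarray(S, dtype=float) >= tau).astype(float)
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A

def knn_graph(S, k):
    """Unweighted adjacency: connect each node to its k most similar nodes."""
    S = np.array(S, dtype=float)  # copy so the caller's matrix is untouched
    np.fill_diagonal(S, -np.inf)  # a node is never its own neighbour
    n = S.shape[0]
    # Indices of the k largest similarities in each row (order within the k is arbitrary).
    idx = np.argpartition(-S, k - 1, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)  # keep an edge if either endpoint selected it

def weighted_graph(S, tau=0.0):
    """Weighted adjacency: keep the similarity as edge weight when it is >= tau."""
    S = np.asarray(S, dtype=float)
    W = np.where(S >= tau, S, 0.0)
    np.fill_diagonal(W, 0.0)
    return W

S_demo = np.array([[1.0, 0.9, 0.1],
                   [0.9, 1.0, 0.2],
                   [0.1, 0.2, 1.0]])
A_thr = threshold_graph(S_demo, 0.5)
A_knn = knn_graph(S_demo, 1)
W = weighted_graph(S_demo, 0.15)
```

The threshold graph can leave low-similarity nodes isolated, while the kNN graph guarantees every node at least k edges, so the choice affects how an isolated cluster appears in the network.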

Current Status

  • Synthetic data generator (3 patterns + mixed, with entity injection)
  • Similarity engine (7 measures, vectorized)
  • Network construction (threshold, kNN, weighted)
  • Pipeline validation (data → similarity → graph)
  • GNN model
  • Real dataset acquisition (coral bleaching, SECOM)
  • Validation pipeline (4 phases)
  • Methods section
  • Results
  • Introduction, Discussion, Conclusions

Paper Writing Order

  1. Methods
  2. Results
  3. Introduction
  4. Discussion
  5. Conclusions
  6. Abstract (last)

References

  • van Woesik, R. & Kratochwill, C. (2022). A global coral-bleaching database, 1980–2020. Scientific Data, 9(20).
  • McCann, M. & Johnston, A. (2008). SECOM Dataset. UCI ML Repository.
  • Tonini, J. (2025). NØMAD-HPC. Journal of Open Research Software.
