AJCC TNM cancer staging engine: rules-walker + observation derivation

tnmhelper — AJCC TNM cancer staging engine

A small, rules-driven engine that maps clinical observations (tumor size, invasion depth, node count, biomarkers, …) to an AJCC stage group. Each (organ, AJCC edition) is described by four declarative files: a Python model definition, a CSV rules table, a criteria JSON, and a derivation JSON. The engine walks specificity-sorted rules at query time — no precomputed flattening, no separate cache.
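
The query-time walk described above can be sketched roughly as follows. This is a minimal illustration, not the package's engine.py: the Rule tuple, the specificity measure (number of non-ANY cells), and the sample rows are all assumptions made for the example.

```python
from typing import NamedTuple, Optional

ANY = "ANY"  # wildcard cell, as used in rules.csv


class Rule(NamedTuple):
    # Hypothetical in-memory form of one rules.csv row.
    conditions: dict  # column name -> enum literal or ANY
    stage: str


def specificity(rule: Rule) -> int:
    # Assumed ordering: more concrete (non-ANY) cells means more specific.
    return sum(1 for v in rule.conditions.values() if v != ANY)


def sort_rules(rules: list) -> list:
    # Most specific first; sorted() is stable, so file order breaks ties.
    return sorted(rules, key=specificity, reverse=True)


def walk(sorted_rules: list, query: dict) -> Optional[str]:
    # Return the stage of the first matching rule (early exit).
    for rule in sorted_rules:
        if all(v == ANY or query.get(k) == v for k, v in rule.conditions.items()):
            return rule.stage
    return None


# Made-up rows for illustration only; not real AJCC rules.
RULES = sort_rules([
    Rule({"T": ANY, "N": ANY, "M": "M1"}, "IV"),
    Rule({"T": "T1", "N": "N0", "M": "M0"}, "I"),
    Rule({"T": ANY, "N": "N1", "M": "M0"}, "II"),
])

print(walk(RULES, {"T": "T1", "N": "N0", "M": "M0"}))  # I
print(walk(RULES, {"T": "T3", "N": "N1", "M": "M0"}))  # II
print(walk(RULES, {"T": "T2", "N": "N0", "M": "M0"}))  # None
```

Sorting once at load time and scanning at query time is what lets wildcard rows stay unexpanded on disk.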

This repository is organised into four layers and six components so the data, logic, public API, and three consumers (GUI, HTTP, CLI) can evolve independently. Adding a new (organ, edition) is still a drop-in change: four new files, zero edits to the core pipeline.

Layers and components

Layer         Component                 Lives at                        Role
1 Data        tnmhelper_data/           sibling top-level dir           Source of truth: rules, criteria, derivations, and per-organ model .py files
2 Backend     tnmhelper.backend         tnmhelper/backend/              Pure logic: rule walker, derivation engine, lint, shared types
3 Python API  tnmhelper (top-level)     tnmhelper/api.py + __init__.py  Stable importable surface for downstream consumers (e.g. dspy-based extractors)
4 Consumers   tnmhelper.consumers.gui   tnmhelper/consumers/gui.py      NiceGUI browser app
              tnmhelper.consumers.http  tnmhelper/consumers/http.py     stdlib HTTP server
              tnmhelper.consumers.cli   tnmhelper/consumers/cli.py      argparse CLI

The three consumers depend only on the Python API; they do not import backend internals. The backend reads the data layer through a DataProvider protocol that abstracts over the filesystem tree and the optional zip bundle.
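
As a rough sketch of what such an abstraction could look like: the single read_text method, the two provider classes, and the load_rules_header helper below are hypothetical names chosen for illustration, and the real DataProvider protocol in the package may expose a different interface.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Protocol


class DataProvider(Protocol):
    # Hypothetical single-method interface; the real protocol may differ.
    def read_text(self, relpath: str) -> str: ...


class FilesystemProvider:
    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def read_text(self, relpath: str) -> str:
        return (self.root / relpath).read_text(encoding="utf-8")


class ZipProvider:
    def __init__(self, zip_path: str) -> None:
        self.zip_path = zip_path

    def read_text(self, relpath: str) -> str:
        with zipfile.ZipFile(self.zip_path) as zf:
            return zf.read(relpath).decode("utf-8")


def load_rules_header(provider: DataProvider, edition_slug: str, organ: str) -> list:
    # Backend-style code that only sees the protocol, never the storage.
    text = provider.read_text(f"{edition_slug}/{organ}/rules.csv")
    return text.splitlines()[0].split(",")


# Demo with throwaway data: the same content served from a directory and a zip.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "data"
    (root / "ajcc9" / "lung").mkdir(parents=True)
    (root / "ajcc9" / "lung" / "rules.csv").write_text("T,N,M,Stage\n", encoding="utf-8")
    bundle = Path(tmp) / "data.zip"
    with zipfile.ZipFile(bundle, "w") as zf:
        zf.write(root / "ajcc9" / "lung" / "rules.csv", "ajcc9/lung/rules.csv")

    from_fs = load_rules_header(FilesystemProvider(str(root)), "ajcc9", "lung")
    from_zip = load_rules_header(ZipProvider(str(bundle)), "ajcc9", "lung")
    print(from_fs == from_zip)  # True
```

Because the caller is typed against the protocol, neither storage class needs to inherit from anything; structural typing is enough.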

Repository tree

tnmhelper/                      Layer 2-4 — Python package (no data)
  __init__.py                   public API re-exports
  api.py                        Layer 3 facade (stage / derive_tnm / …)
  schema.py                     dataclasses: ModelSpec, ObservableSpec, DerivedStage
  backend/                      Layer 2
    engine.py                   rules walker (TNM -> Stage)
    derive.py                   observations -> TNM
    lint.py                     validation (CSV / criteria / derivation)
    _shared.py                  Edition, Biomarker, Classification, YesNo, StagingModel
    _machine/                   contract.json + new_model_template.py
  models/                       empty namespace; __path__ injected at runtime
    __init__.py                 MODEL_REGISTRY, auto-discovery
  data_provider/                runtime data access abstraction
    filesystem.py               reads tnmhelper_data/ directly
    zipbundle.py                reads a .zip bundle via zipimport + ZipFile
  bundle/                       export tool
    __main__.py                 `python -m tnmhelper.bundle ...`
    exporter.py                 pack tnmhelper_data/ into a .zip
  consumers/                    Layer 4 (each importing only tnmhelper)
    cli.py, http.py, gui.py
    requirements-gui.txt

tnmhelper_data/                 Layer 1 — source of truth
  models/ajcc<X>/<organ>.py     model definition (StrEnums + StagingModel)
  ajcc<X>/<organ>/
    rules.csv                   human-edited rules table
    criteria.json               TNM category descriptions
    derivation.json             observation -> TNM rules

tnmhelper_pdfsource/            AJCC source PDFs (reference; excluded from bundle)
tests/
  smoke.py                      end-to-end smoke test (replaces main.py)
  bench.py                      per-query latency microbenchmark
pyproject.toml                  package metadata + console entry points
AGENTS.md                       onboarding for coding agents

Installation

pip install -e .                   # editable install (dev)
pip install -e ".[gui]"            # add NiceGUI for the GUI consumer

After install, three console scripts are available:

tnmhelper       — CLI (list / stage / derive / interactive / …)
tnmhelper-http  — HTTP server on port 8000 by default
tnmhelper-gui   — NiceGUI browser app on port 8080 by default

Quickstart — as a Python library

import tnmhelper

tnmhelper.set_data_source(None)         # autodetect (TNMHELPER_DATA env,
                                        # packaged bundle, or ./tnmhelper_data/)

tnmhelper.organs()
# ['ampulla', 'breast', 'cervix', 'colon', 'esophagus_adeno', ...]

tnmhelper.observable_schema("lung", "AJCC 9")
# {'size_cm': ObservableSpec(name='size_cm', type='number', unit='cm', ...),
#  'histology': ObservableSpec(...), ...}

tnmhelper.stage_from_observations(
    "lung", "AJCC 9",
    {"size_cm": 1.5},
    Classification="c", DescY="No", DescR="No", DescM="No",
)
# DerivedStage(organ='lung', edition='AJCC 9', T='T1b', N='N0', M='M0',
#              stage='IA2', source='ajcc9/lung/rules.csv:row10', ...)

The full surface is tnmhelper.set_data_source, organs, editions_for, model_spec, observable_schema, stage, derive_tnm, stage_from_observations, and explain, plus the ModelSpec, ObservableSpec, and DerivedStage dataclasses. See tnmhelper/api.py for full docstrings.
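
To make the observation-to-TNM step in the quickstart more concrete, here is a minimal sketch of ordered, first-match derivation for the T axis. The rule shape (upper bound in cm, category) and the derive_t helper are hypothetical; the thresholds are the size-only T1 subdivisions for lung and are illustrative rather than copied from the package's derivation.json.

```python
from typing import Optional

# Ordered rules for one axis: the first rule whose bound covers the value wins.
# Hypothetical shape; size-only, ignores invasion and other T criteria.
T_RULES_BY_SIZE = [
    (1.0, "T1a"),  # size <= 1 cm
    (2.0, "T1b"),  # 1 cm < size <= 2 cm
    (3.0, "T1c"),  # 2 cm < size <= 3 cm
]


def derive_t(observations: dict) -> Optional[str]:
    size = observations.get("size_cm")
    if size is None:
        return None  # nothing to derive from
    for upper_bound_cm, category in T_RULES_BY_SIZE:
        if size <= upper_bound_cm:
            return category
    return None  # larger tumors need rules this sketch does not include


print(derive_t({"size_cm": 1.5}))  # T1b
```

The result for 1.5 cm agrees with the T='T1b' shown in the quickstart output.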

Quickstart — as a consumer

tnmhelper list                                          # all (organ, edition) pairs
tnmhelper interactive                                   # prompted form-fill loop
tnmhelper stage breast --edition "AJCC 8" \
    --T T3 --N N0 --M M0 --Classification c \
    --DescY No --DescR No --DescM No --Grade G3 \
    --HER2 Negative --ER Positive --PR Negative

tnmhelper-http --port 8000                              # HTTP API
tnmhelper-gui  --port 8080                              # browser GUI

Validation

After any data or model change:

python -m tnmhelper.backend.lint     # header / conflict / enum / edition checks
python -m tests.smoke                # end-to-end: every model.examples + derivation.examples

tests/smoke.py exits non-zero on any failure. Use it as the integration check — there is no separate test suite.
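
The exit-code contract can be sketched as below. The example records, run_examples, exit_code, and toy_stage are hypothetical stand-ins written for this illustration; the real tests/smoke.py draws its cases from each model's examples and each derivation's examples.

```python
def run_examples(examples: list, stage_fn) -> list:
    # Return a list of human-readable failure messages (empty means pass).
    failures = []
    for ex in examples:
        try:
            got = stage_fn(**ex["inputs"])
        except Exception as exc:  # a crash counts as a failure too
            failures.append(f"{ex['inputs']}: raised {exc!r}")
            continue
        if got != ex["expected"]:
            failures.append(f"{ex['inputs']}: expected {ex['expected']}, got {got}")
    return failures


def exit_code(failures: list) -> int:
    # Non-zero on any failure, so shells and CI treat the run as failed.
    return 1 if failures else 0


# Toy stand-in for the staging function, for demonstration only.
def toy_stage(T: str, N: str, M: str) -> str:
    return "IV" if M == "M1" else "I"


EXAMPLES = [
    {"inputs": {"T": "T1", "N": "N0", "M": "M0"}, "expected": "I"},
    {"inputs": {"T": "T1", "N": "N0", "M": "M1"}, "expected": "IV"},
]

print(exit_code(run_examples(EXAMPLES, toy_stage)))  # 0
```

In a script, the value would typically be handed to sys.exit so the calling shell sees it.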

Adding a new (organ, edition) — still drop-in

Four new files, no edits to anything else:

  1. tnmhelper_data/models/<edition_slug>/<name>.py: defines the TNM StrEnums, a NamedTuple state class, and a module-level MODEL = StagingModel(...). See tnmhelper_data/models/README.md.
  2. tnmhelper_data/<edition_slug>/<name>/rules.csv: header MUST be exactly list(MODEL.columns) + ["Stage"]; cells are enum literals or ANY.
  3. tnmhelper_data/<edition_slug>/<name>/criteria.json: the edition field MUST match MODEL.edition.value.
  4. tnmhelper_data/<edition_slug>/<name>/derivation.json: observable declarations + per-axis ordered rules.

Then run python -m tnmhelper.backend.lint followed by python -m tests.smoke.
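
The rules.csv constraints above (exact header, cells restricted to enum literals or ANY) lend themselves to a small check. The sketch below is not the package's lint.py: the T and N enums, the COLUMNS mapping standing in for MODEL.columns, and lint_rules_csv are assumptions for illustration. It uses (str, Enum) instead of StrEnum so it also runs on Python versions before 3.11.

```python
import csv
import io
from enum import Enum


class T(str, Enum):
    T1 = "T1"
    T2 = "T2"


class N(str, Enum):
    N0 = "N0"
    N1 = "N1"


# Hypothetical stand-in for MODEL.columns and the enum behind each column.
COLUMNS = {"T": T, "N": N}


def lint_rules_csv(text: str, columns: dict) -> list:
    errors = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["empty file"]
    expected_header = list(columns) + ["Stage"]
    if rows[0] != expected_header:
        errors.append(f"header {rows[0]} != {expected_header}")
        return errors  # cell checks are meaningless with a wrong header
    for lineno, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected_header):
            errors.append(f"line {lineno}: expected {len(expected_header)} cells")
            continue
        # zip stops after the last model column, so the Stage cell is not checked here.
        for name, cell in zip(columns, row):
            allowed = {member.value for member in columns[name]}
            if cell != "ANY" and cell not in allowed:
                errors.append(f"line {lineno}: {name}={cell!r} is not an enum literal or ANY")
    return errors


good = "T,N,Stage\nT1,N0,I\nANY,N1,II\n"
bad = "T,N,Stage\nT9,N0,I\n"
print(lint_rules_csv(good, COLUMNS))  # []
print(lint_rules_csv(bad, COLUMNS))
```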

Data bundling — for installs on other systems

The data layer can be packed into a single .zip for shipping. See tnmhelper/bundle/README.md for the full guide. Quick reference:

python -m tnmhelper.bundle export --out tnmhelper-data.zip --verify

Then on the target system, after pip install tnmhelper:

tnmhelper.set_data_source("/path/to/tnmhelper-data.zip")

Or set TNMHELPER_DATA=/path/to/tnmhelper-data.zip in the environment and call tnmhelper.set_data_source(None) — the autodetect chain picks up the env var. The bundle includes the per-organ .py model files; Python's stdlib zipimport loads them directly from the archive.
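
To illustrate how a .py file can be imported straight from an archive, the sketch below builds a throwaway zip and imports a module from it by placing the zip on sys.path, which the stdlib zipimport machinery handles. The module name and its contents are made up for the demo, and the package itself may wire this differently (the repository tree mentions injecting __path__ for tnmhelper.models).

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

MODULE_NAME = "demo_organ_model"  # hypothetical module name for the demo

with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "demo-data.zip"
    with zipfile.ZipFile(bundle, "w") as zf:
        # A stand-in for a per-organ model file stored inside the bundle.
        zf.writestr(f"{MODULE_NAME}.py", "EDITION = 'AJCC 9'\n")

    # A zip path on sys.path is served by the stdlib zipimport machinery.
    sys.path.insert(0, str(bundle))
    try:
        module = importlib.import_module(MODULE_NAME)
    finally:
        sys.path.remove(str(bundle))

    loaded_from = module.__file__
    edition = module.EDITION

print(edition)  # AJCC 9
```

No extraction step is involved: the module's __file__ points inside the archive.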

Why no flattened cache

An earlier version of this engine precomputed a flattened JSONL lookup per organ. That gave O(1) queries but ballooned to ~80 MB per organ once high-cardinality wildcards (breast) expanded. The current rules-walker design keeps disk usage at the size of the original CSV (on the order of kilobytes) and memory at a list of a few hundred rules per organ; per-query cost is O(rules) with early exit, well under a millisecond for any organ modeled here. This is a favourable trade-off when artifact size and startup time matter (bundle deployment, GUI cold start).
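
The blow-up from flattening follows from simple arithmetic: a rule with wildcard cells expands to the product of the cardinalities of those columns. The sketch below uses made-up column cardinalities purely for illustration; they are not taken from any real model in the package.

```python
from math import prod

# Illustrative column cardinalities only; not taken from any real model.
CARDINALITY = {"T": 10, "N": 8, "M": 2, "Grade": 3, "HER2": 2, "ER": 2, "PR": 2}

ANY = "ANY"


def flattened_rows(rule: dict) -> int:
    # Each ANY cell expands to every value of its column.
    return prod(CARDINALITY[col] for col, cell in rule.items() if cell == ANY)


rule = {"T": "T1", "N": "N0", "M": "M0",
        "Grade": ANY, "HER2": ANY, "ER": ANY, "PR": ANY}

print(flattened_rows(rule))         # 24 flattened rows from one CSV row
print(prod(CARDINALITY.values()))   # 3840 cells in the full grid
```

Each extra biomarker column multiplies the flattened size, while the walker still stores the single original row.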
