AJCC TNM cancer staging engine: rules-walker + observation derivation
tnmhelper — AJCC TNM cancer staging engine
A small, rules-driven engine that maps clinical observations (tumor
size, invasion depth, node count, biomarkers, …) to an AJCC stage
group. Each (organ, AJCC edition) is described by four declarative
files: a Python model definition, a CSV rules table, a criteria JSON,
and a derivation JSON. The engine walks specificity-sorted rules at
query time — no precomputed flattening, no separate cache.
This repository is organised into four layers and six components
so the data, logic, public API, and three consumers (GUI, HTTP, CLI)
can evolve independently. Adding a new (organ, edition) is still a
drop-in change: four new files, zero edits to the core pipeline.
Layers and components
| Layer | Component | Lives at | Role |
|---|---|---|---|
| 1 Data | tnmhelper_data/ | sibling top-level dir | Source of truth: rules, criteria, derivations, and per-organ model .py files |
| 2 Backend | tnmhelper.backend | tnmhelper/backend/ | Pure logic: rule walker, derivation engine, lint, shared types |
| 3 Python API | tnmhelper (top-level) | tnmhelper/api.py + __init__.py | Stable importable surface for downstream consumers (e.g. dspy-based extractors) |
| 4 Consumers | tnmhelper.consumers.gui | tnmhelper/consumers/gui.py | NiceGUI browser app |
| | tnmhelper.consumers.http | tnmhelper/consumers/http.py | stdlib HTTP server |
| | tnmhelper.consumers.cli | tnmhelper/consumers/cli.py | argparse CLI |
The three consumers depend only on the Python API; they do not import
backend internals. The backend reads the data layer through a
DataProvider protocol that
abstracts over the filesystem tree and the optional zip bundle.
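As a rough illustration of that abstraction, the sketch below defines a protocol with a single read_text method and two implementations, one over a directory and one over a zip archive. The method name, the class names, and the placeholder rules.csv content are assumptions made for this example; the real DataProvider interface may look different.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Protocol


class DataProvider(Protocol):
    """Hypothetical shape: one method that returns a data file as text."""

    def read_text(self, relpath: str) -> str: ...


class FilesystemProvider:
    def __init__(self, root: Path) -> None:
        self.root = root

    def read_text(self, relpath: str) -> str:
        return (self.root / relpath).read_text(encoding="utf-8")


class ZipProvider:
    def __init__(self, archive: Path) -> None:
        self.archive = archive

    def read_text(self, relpath: str) -> str:
        with zipfile.ZipFile(self.archive) as zf:
            return zf.read(relpath).decode("utf-8")


def load_header(provider: DataProvider, relpath: str) -> str:
    """Backend-style code that never learns which provider it was given."""
    return provider.read_text(relpath).splitlines()[0]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "tnmhelper_data"
    rel = "ajcc9/lung/rules.csv"
    (root / "ajcc9" / "lung").mkdir(parents=True)
    # Placeholder content, not the real lung rules table.
    (root / rel).write_text("T,N,M,Stage\nT1,N0,M0,I\n", encoding="utf-8")

    archive = Path(tmp) / "bundle.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.write(root / rel, arcname=rel)

    fs_header = load_header(FilesystemProvider(root), rel)
    zip_header = load_header(ZipProvider(archive), rel)
    print(fs_header == zip_header)  # True
```

Structural typing means neither provider has to inherit from the protocol; any object with a matching read_text satisfies it.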
Repository tree
tnmhelper/ Layer 2-4 — Python package (no data)
__init__.py public API re-exports
api.py Layer 3 facade (stage / derive_tnm / …)
schema.py dataclasses: ModelSpec, ObservableSpec, DerivedStage
backend/ Layer 2
engine.py rules walker (TNM -> Stage)
derive.py observations -> TNM
lint.py validation (CSV / criteria / derivation)
_shared.py Edition, Biomarker, Classification, YesNo, StagingModel
_machine/ contract.json + new_model_template.py
models/ empty namespace; __path__ injected at runtime
__init__.py MODEL_REGISTRY, auto-discovery
data_provider/ runtime data access abstraction
filesystem.py reads tnmhelper_data/ directly
zipbundle.py reads a .zip bundle via zipimport + ZipFile
bundle/ export tool
__main__.py `python -m tnmhelper.bundle ...`
exporter.py pack tnmhelper_data/ into a .zip
consumers/ Layer 4 (each importing only tnmhelper)
cli.py, http.py, gui.py
requirements-gui.txt
tnmhelper_data/ Layer 1 — source of truth
models/ajcc<X>/<organ>.py model definition (StrEnums + StagingModel)
ajcc<X>/<organ>/
rules.csv human-edited rules table
criteria.json TNM category descriptions
derivation.json observation -> TNM rules
tnmhelper_pdfsource/ AJCC source PDFs (reference; excluded from bundle)
tests/
smoke.py end-to-end smoke test (replaces main.py)
bench.py per-query latency microbenchmark
pyproject.toml package metadata + console entry points
AGENTS.md onboarding for coding agents
Installation
pip install -e . # editable install (dev)
pip install -e ".[gui]" # add NiceGUI for the GUI consumer
After install, three console scripts are available:
tnmhelper — CLI (list / stage / derive / interactive / …)
tnmhelper-http — HTTP server on port 8000 by default
tnmhelper-gui — NiceGUI browser app on port 8080 by default
Quickstart — as a Python library
import tnmhelper
tnmhelper.set_data_source(None) # autodetect (TNMHELPER_DATA env,
# packaged bundle, or ./tnmhelper_data/)
tnmhelper.organs()
# ['ampulla', 'breast', 'cervix', 'colon', 'esophagus_adeno', ...]
tnmhelper.observable_schema("lung", "AJCC 9")
# {'size_cm': ObservableSpec(name='size_cm', type='number', unit='cm', ...),
# 'histology': ObservableSpec(...), ...}
tnmhelper.stage_from_observations(
"lung", "AJCC 9",
{"size_cm": 1.5},
Classification="c", DescY="No", DescR="No", DescM="No",
)
# DerivedStage(organ='lung', edition='AJCC 9', T='T1b', N='N0', M='M0',
# stage='IA2', source='ajcc9/lung/rules.csv:row10', ...)
The full surface is tnmhelper.set_data_source, organs,
editions_for, model_spec, observable_schema, stage,
derive_tnm, stage_from_observations, and explain, plus the
ModelSpec / ObservableSpec / DerivedStage dataclasses. See
tnmhelper/api.py for full docstrings.
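The stage_from_observations call in the quickstart turns size_cm into a T category before the rules table is consulted. One way to picture per-axis ordered rules is a list of (category, predicate) pairs where the first predicate that holds decides the category. This is a sketch under assumptions: the derivation.json format is not shown in this README, derive_axis and T_RULES are hypothetical names, and the cut-offs are placeholders chosen only so that 1.5 cm lands on T1b as in the example.

```python
from typing import Callable, Mapping

Observations = Mapping[str, float]

# Ordered (category, predicate) pairs for the T axis; the first hit wins.
# The cut-offs are illustrative placeholders, not copied from derivation.json.
T_RULES: list[tuple[str, Callable[[Observations], bool]]] = [
    ("T1a", lambda obs: obs["size_cm"] <= 1.0),
    ("T1b", lambda obs: obs["size_cm"] <= 2.0),
    ("T1c", lambda obs: obs["size_cm"] <= 3.0),
]


def derive_axis(rules, obs: Observations, default: str) -> str:
    """Walk the ordered rules and return the first category whose predicate holds."""
    for category, predicate in rules:
        if predicate(obs):
            return category
    return default  # no rule matched; the caller decides what that means


print(derive_axis(T_RULES, {"size_cm": 1.5}, default="unresolved"))  # T1b
```

Ordering carries the meaning here: each rule only needs an upper bound because every earlier rule has already been ruled out.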
Quickstart — as a consumer
tnmhelper list # all (organ, edition) pairs
tnmhelper interactive # prompted form-fill loop
tnmhelper stage breast --edition "AJCC 8" \
--T T3 --N N0 --M M0 --Classification c \
--DescY No --DescR No --DescM No --Grade G3 \
--HER2 Negative --ER Positive --PR Negative
tnmhelper-http --port 8000 # HTTP API
tnmhelper-gui --port 8080 # browser GUI
Validation
After any data or model change:
python -m tnmhelper.backend.lint # header / conflict / enum / edition checks
python -m tests.smoke # end-to-end: every model.examples + derivation.examples
tests/smoke.py exits non-zero on any failure. Use it as the
integration check — there is no separate test suite.
Adding a new (organ, edition) — still drop-in
Four new files, no edits to anything else:
1. tnmhelper_data/models/<edition_slug>/<name>.py: defines the TNM StrEnums, a NamedTuple state class, and a module-level MODEL = StagingModel(...). See tnmhelper_data/models/README.md.
2. tnmhelper_data/<edition_slug>/<name>/rules.csv: header MUST be exactly list(MODEL.columns) + ["Stage"]; cells are enum literals or ANY.
3. tnmhelper_data/<edition_slug>/<name>/criteria.json: edition field MUST match MODEL.edition.value.
4. tnmhelper_data/<edition_slug>/<name>/derivation.json: observable declarations + per-axis ordered rules.
Then run python -m tnmhelper.backend.lint followed by
python -m tests.smoke.
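To make the rules.csv constraints concrete, here is a small sketch of the two checks stated above: the header must equal the model columns plus "Stage", and every non-Stage cell must be an enum literal or ANY. It is not the package's lint; check_rules_csv and the COLUMNS mapping (a stand-in for MODEL.columns and its enums) are hypothetical, and the sample rows are made up.

```python
import csv
import io

ANY = "ANY"


def check_rules_csv(text: str, columns: dict[str, set[str]]) -> list[str]:
    """Return a list of problems; empty means the table passed these two checks."""
    problems: list[str] = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    expected = list(columns) + ["Stage"]
    if header != expected:
        problems.append(f"header {header} != {expected}")
        return problems  # cell checks are meaningless with a wrong header
    for lineno, row in enumerate(reader, start=2):
        if len(row) != len(expected):
            problems.append(f"line {lineno}: expected {len(expected)} cells, got {len(row)}")
            continue
        # zip stops after the model columns, so the Stage cell is not checked here.
        for name, cell in zip(columns, row):
            if cell != ANY and cell not in columns[name]:
                problems.append(f"line {lineno}: {cell!r} is not a {name} literal")
    return problems


# Stand-in for MODEL.columns: column name -> allowed enum literals (hypothetical).
COLUMNS = {"T": {"T1", "T2"}, "N": {"N0", "N1"}, "M": {"M0", "M1"}}

good = "T,N,M,Stage\nT1,N0,M0,I\nANY,ANY,M1,IV\n"
bad = "T,N,M,Stage\nT9,N0,M0,I\n"
print(check_rules_csv(good, COLUMNS))  # []
print(check_rules_csv(bad, COLUMNS))   # ["line 2: 'T9' is not a T literal"]
```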
Data bundling — for installs on other systems
The data layer can be packed into a single .zip for shipping. See
tnmhelper/bundle/README.md for the full
guide. Quick reference:
python -m tnmhelper.bundle export --out tnmhelper-data.zip --verify
Then on the target system, after pip install tnmhelper:
tnmhelper.set_data_source("/path/to/tnmhelper-data.zip")
Or set TNMHELPER_DATA=/path/to/tnmhelper-data.zip in the environment
and call tnmhelper.set_data_source(None) — the autodetect chain picks
up the env var. The bundle includes the per-organ .py model files;
Python's stdlib zipimport loads them directly from the archive.
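The following sketch shows the stdlib mechanism in isolation: a .py file stored in a zip archive becomes importable once the archive is on sys.path, and the resulting module is served by zipimport. The module name demo_organ_model and its contents are invented for the example, and the package itself may wire this up differently (the repository tree mentions injecting __path__ at runtime).

```python
import importlib
import sys
import tempfile
import zipfile
import zipimport
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "demo-bundle.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        # A stand-in model file; real model modules define MODEL = StagingModel(...).
        zf.writestr("demo_organ_model.py", "MODEL = {'organ': 'demo', 'edition': 'AJCC 9'}\n")

    sys.path.insert(0, str(archive))  # zip archives on sys.path are handled by zipimport
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("demo_organ_model")
    finally:
        sys.path.remove(str(archive))

    loaded_via_zip = isinstance(module.__spec__.loader, zipimport.zipimporter)
    print(module.MODEL["organ"])  # demo
    print(loaded_via_zip)         # True
```

No extraction step is involved; the source is read and compiled straight from the archive.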
Why no flattened cache
An earlier version of this engine precomputed a flattened JSONL lookup per organ. That gave O(1) queries but ballooned to ~80 MB per organ once high-cardinality wildcards (breast) expanded. The current rules-walker design keeps disk usage to the original kilobyte-scale CSV and memory to a list of a few hundred rules per organ; per-query cost is O(rules) with early exit, well under a millisecond for any organ modeled here. This is a favourable trade-off when artifact size and startup time matter (bundle deployment, GUI cold-start).
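The growth comes from simple multiplication: flattening one rule enumerates every value of every wildcard column. The arithmetic can be sketched as follows; expanded_rows is a hypothetical helper and the column cardinalities are invented to show the effect, not measured from the breast model.

```python
from math import prod

ANY = "ANY"


def expanded_rows(row: tuple[str, ...], cardinalities: tuple[int, ...]) -> int:
    """Number of concrete rows one rule becomes when every ANY is enumerated."""
    return prod(n for cell, n in zip(row, cardinalities) if cell == ANY)


# Hypothetical column cardinalities for a model with several biomarker axes.
CARDINALITIES = (10, 8, 3, 4, 3, 3, 3)

rule = ("T3", ANY, "M0", ANY, ANY, ANY, ANY)
print(expanded_rows(rule, CARDINALITIES))  # 8 * 4 * 3 * 3 * 3 = 864
```

A single wildcard-heavy row already becomes hundreds of rows, and a table with a few hundred such rows multiplies accordingly, whereas the walker stores each rule exactly once.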