Claim-bounded monitoring of AI-enabled medical devices: profile a device, classify what postmarket evidence can substantiate, and retrieve comparable FDA precedents.

These details have not been verified by PyPI

Project links

Project description

title: claimbounded emoji: 🏥 colorFrom: blue colorTo: indigo sdk: gradio sdk_version: "6.19.0" app_file: app.py pinned: false license: mit short_description: Claim-bounded monitoring of AI-enabled medical devices

claimbounded

Claim-Bounded Monitoring of AI-Enabled Medical Devices

Try it now — no install required

→ Open the live app on HuggingFace Spaces

claimbounded is a regulatory science Python package grounded in a structured audit of 1,400 public FDA authorization summaries for AI-enabled medical devices (510(k) and De Novo). It answers a foundational question in AI medical device oversight:

What is the strongest performance claim a health system can substantiate using only the data routine deployment naturally generates — and how far does that fall short of what the device was authorized on?

The package classifies any device along five primary variables validated against human reviewers (all κ ≥ 0.75):

Variable	What it captures
Postmarket evaluability class	What kind of correctness signal routine deployment produces (surrogate-only, correction-evaluable, delayed-evaluable, directly auditable)
Authorization endpoint recoverability	Whether the specific performance endpoint the device was cleared on can be recovered from routine data — and at what cost
Strongest auditable postmarket claim	The highest claim level routine evidence can support without a new study
Postmarket audit burden	The evidence work required to reconstruct the authorization endpoint
Routine data claim type	Whether routine data supports the same endpoint, a clinical proxy, a workflow proxy, or only technical monitoring

Key empirical findings from the 1,400-device corpus (publicly available FDA summaries):

85% of authorized AI devices produce only surrogate-only evidence in deployment — no natural correctness signal
62% have a claim ceiling of workflow performance — alert rates and output volume, not clinical accuracy
51% have proxy-only recoverability — the authorization endpoint cannot be recovered from routine data at all
Only 1 in 1,400 devices is directly auditable on its authorization endpoint from routine deployment data
96% have no PCCP; 99% have no device-specific postmarket monitoring plan

Who Is This For?

Audience	How `claimbounded` helps
Regulators	Assess whether a manufacturer's marketed postmarket monitoring claim is supportable from the evidence their routine deployment generates. Cross-reference real FDA submission numbers from the precedent table on `accessdata.fda.gov`. See what fraction of comparable authorized devices share the same recoverability class.
Device manufacturers	Know your claim ceiling before your device ships. The Manufacturer Design Requirements section tells you exactly which logging, export, and identifier features would raise that ceiling. The Landscape Context shows how your device compares to 1,400 authorized peers.
Health systems	Use the Procurement Questions as a vendor checklist before deployment. Know the strongest monitoring claim your routine data supports — and verify it before signing a contract. The package surfaces whether comparable authorized devices can substantiate their marketed claims.

Installation

pip install claimbounded

With interactive UI (adds Gradio + python-docx):

pip install "claimbounded[ui]"

From source:

git clone https://github.com/yanisvdc/claimbounded
cd claimbounded
pip install -e ".[ui]"

Quick Start

Launch the interactive UI

claimbounded ui

Opens at http://localhost:7860. All processing runs locally — no data leaves your machine.

Python API

from claimbounded import (
    profile_device,
    classify_evaluability_class,
    classify_recoverability,
    generate_monitoring_package,
)

profile = profile_device({
    "device_name": "Acme LVO Triage",
    "device_function": "triage_notification",
    "authorization_endpoint_type": "diagnostic_accuracy",
    "authorization_ground_truth_modality": "expert_reader_panel",
    "routine_postmarket_evidence_stream": "workflow_logs",
    "endpoint_linked_to_ai_output": "possible_but_not_described",
    "human_correction_available": "no",
})

# Primary V4 variables
print(classify_evaluability_class(profile))
# → "surrogate_only"  (85% of authorized AI devices)

print(classify_recoverability(profile))
# → "recoverable_with_chart_review"  (expert panel GT; images retained in PACS)

# Full monitoring package
pkg = generate_monitoring_package(profile, k=8)
print(pkg["claim_profile"]["routine_evidence_claim_ceiling"])
# → "workflow_performance"

print(pkg["claim_profile"]["recoverability_label"])
# → "Recoverable with chart/image review"

# Landscape context: how this device compares to the 1,400-device corpus
ctx = pkg["landscape_context"]
print(f"{ctx['ceiling_pct']}% of FDA-authorized AI devices share this claim ceiling")
# → "62.2% of FDA-authorized AI devices share this claim ceiling"

CLI

claimbounded report examples/example_profiles/lvo_triage.json
claimbounded precedents examples/example_profiles/lvo_triage.json --mode hybrid -k 10
claimbounded lookup K192383
claimbounded search "large vessel occlusion"
claimbounded search "oncology"

The Five Primary Variables

Postmarket evaluability class

What kind of correctness signal routine deployment naturally produces — before any additional effort.

Class	Description	Prevalence
`surrogate_only`	Deployment produces outputs and logs but no natural correctness signal	85% of corpus
`correction_evaluable`	Physician edits/confirmations explicitly captured and stored	13%
`delayed_evaluable`	Clinical outcome accumulates naturally over time in EHR records	1%
`workflow_endpoint_directly_auditable`	Authorization endpoint is itself a workflow metric, co-logged in deployment	<1%
`closed_loop_evaluable`	AI output and ground truth both automatically co-logged	<1%

Authorization endpoint recoverability

Whether the specific authorization endpoint can be recovered and re-measured.

Class	Description	Prevalence
`proxy_only`	Endpoint NOT recoverable; only operational proxies available	51% of corpus
`recoverable_with_chart_review`	Endpoint recoverable but requires expert re-annotation (major effort)	43%
`recoverable_with_linkage`	Endpoint recoverable via data engineering on structured records	4%
`not_recoverable`	Endpoint not recoverable AND no operational proxy exists	2%
`directly_auditable`	Endpoint re-measurable from routine deployment data	<0.1% (1 in 1,400)

The Claim Hierarchy

The strongest monitoring claim routine evidence can support:

Level	Claim	Prevalence in corpus
7	Clinical accuracy or calibration	0% (no device reaches this from routine data)
6	Output quality / measurement agreement	2.5%
5	Human–machine concordance	11%
4	Workflow performance	62%
3	Technical pipeline stability	23%
2	Utilization only	—
1	No performance claim auditable	1%

Precedent Retrieval

claimbounded retrieves comparable FDA-authorized devices using a hybrid scoring function:

Signal	Weight	Fields
Regulatory identity	35%	disease area, clinical domain, device function, submission pathway
Evidence structure	30%	endpoint type, recoverability, ground truth, claim ceiling, evaluability class, audit burden
Text similarity (BM25)	20%	authorization endpoint description, supporting quotes
Evidence-gap matching	15%	audit burden, monitoring implication

Retrieval modes:

hybrid — weighted blend (recommended)
like_for_like — same regulatory and clinical identity
adjacent — same postmarket-evidence problem, any device type
claim_gap — same divergence between authorization endpoint and ceiling

Interactive UI

Launch with claimbounded ui and navigate three tabs:

① Profile & Report

Fill in a device description using structured dropdowns (V4 FDA-Panel vocabulary). Click Generate Report to receive:

Claim hierarchy — visual ceiling and authorization gap
Postmarket evaluability class — what correctness signal deployment produces, with full V4 codebook definition
Authorization endpoint recoverability — whether/how the clearing endpoint can be recovered
Landscape context — how this device compares to 1,400 authorized peers (% sharing same ceiling, recoverability, evaluability)
Minimum audit dataset, Manufacturer design requirements, Procurement questions
Comparable FDA precedents — up to 20 real 510(k)/De Novo submission numbers with scoring
Downloadable HTML report and Word document (.docx)

② Corpus Search

Search the 1,400-device corpus by device name, manufacturer, authorization endpoint, disease area, or clinical domain. Results render as a full stakeholder report with evaluability class, recoverability, PCCP status, and monitoring plan notes.

③ Submission Lookup

Enter a 510(k) or De Novo submission number to retrieve the complete coded profile — including evaluability class, recoverability, claim ceiling, supporting quotes, and PCCP/monitoring plan context.

Validation

Five primary variables validated against two independent human reviewers on a 200-record stratified sample (pre-registered before full extraction):

Variable	κ (R1 vs R2)	95% CI	Gate
`authorization_endpoint_recoverability`	0.759	[0.68, 0.83]	✓ PASS
`routine_data_claim_type`	0.837	[0.76, 0.91]	✓ PASS
`postmarket_evaluability_class`	0.768	[0.63, 0.88]	✓ PASS
`strongest_auditable_postmarket_claim`	0.821	[0.74, 0.89]	✓ PASS
`postmarket_audit_burden`	0.832	[0.76, 0.90]	✓ PASS

Pre-registration: doi:10.17605/OSF.IO/74WAP

Public API Reference

from claimbounded import (
    # Profile a device
    profile_device,               # dict → DeviceEvidenceProfile
    normalize_device_record,      # dict → dict (canonical field set)
    load_corpus,                  # → list[DeviceEvidenceProfile]
    find_in_corpus,               # submission_number → DeviceEvidenceProfile | None
    search_corpus,                # text → list[DeviceEvidenceProfile]
    corpus_stats,                 # profile → dict (corpus-level context percentages)

    # Classify (primary V4 variables)
    classify_evaluability_class,           # profile → str
    classify_recoverability,               # profile → str
    classify_claim_ceiling,                # profile → str
    classify_supportable_claims,           # profile → list[str]
    classify_audit_burden,                 # profile → dict
    estimate_authorization_remeasurement,  # profile → dict

    # Retrieve precedents
    retrieve_precedents,          # (profile, mode, k) → list[dict]
    build_bm25_index,
    structured_similarity,
    schema_similarity,
    explain_precedent_match,

    # Generate operational outputs
    generate_claim_support_matrix,
    generate_dashboard_claim_limits,
    generate_minimum_audit_dataset,
    generate_manufacturer_design_requirements,
    generate_procurement_questions,

    # Assemble complete reports
    generate_monitoring_package,          # (profile, mode, k) → dict
    generate_monitoring_profile_report,   # (profile, mode, k) → str (Markdown)
)

Design Principles

Zero runtime dependencies — the core package uses only the Python standard library, including a dependency-free BM25 implementation. Gradio and python-docx are optional extras.

Empirically grounded — every classification rule mirrors the pre-registered V4 codebook used to extract and code 1,400 public FDA authorization summaries. Classifications for new devices follow the same logic as the published audit.

Conservative — the codebook errs on the side of requiring more evidence work rather than overstating what routine data supports. proxy_only is the conservative default for recoverability; surrogate_only is the conservative default for evaluability.

Precedent-grounded — every output cites real FDA submission numbers verifiable at accessdata.fda.gov. The package cannot generate a recommendation not tied to a public precedent.

Schema-first retrieval — structured matching over shared coded fields (endpoint type, recoverability, ground truth, evaluability class) outperforms free-text search for this regulatory science task.

Disclaimer

This package does not determine whether a device is safe or effective and does not predict FDA decisions. It maps the evidentiary relationship between authorization claims, routine postmarket evidence, and supportable monitoring claims, grounded in public authorization precedents. All classifications are preliminary and generated from user-provided inputs under the study codebook (schema v4_claimbounded, pre-registered at doi:10.17605/OSF.IO/74WAP). Nothing in this package constitutes regulatory advice.

Citation

@software{claimbounded2026,
  title   = {claimbounded: Claim-Bounded Monitoring of AI-Enabled Medical Devices},
  author  = {Yanis Vandecasteele and Sofiane Vandecasteele},
  year    = {2026},
  url     = {https://github.com/yanisvdc/claimbounded},
  note    = {Schema version v4\_claimbounded. Grounded in 1,400 public FDA authorization
             records. OSF Preregistration: doi:10.17605/OSF.IO/74WAP}
}

License

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.2.1

Jun 27, 2026

0.2.0

Jun 27, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

claimbounded-0.2.1.tar.gz (537.2 kB view details)

Uploaded Jun 27, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

claimbounded-0.2.1-py3-none-any.whl (538.2 kB view details)

Uploaded Jun 27, 2026 Python 3

File details

Details for the file claimbounded-0.2.1.tar.gz.

File metadata

Download URL: claimbounded-0.2.1.tar.gz
Upload date: Jun 27, 2026
Size: 537.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for claimbounded-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`ac7dc8ac7418a14c09ade90d2b896ef801d63464b3b031c4d4c696393e5e1167`
MD5	`96acab3c98adb348bdd23eada0618d78`
BLAKE2b-256	`3f4feb1513ca969f07d68b5791fd71f743240d5a8f485cdc0afc8d92577f76d1`

See more details on using hashes here.

File details

Details for the file claimbounded-0.2.1-py3-none-any.whl.

File metadata

Download URL: claimbounded-0.2.1-py3-none-any.whl
Upload date: Jun 27, 2026
Size: 538.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for claimbounded-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cd1ccd97723f6d240ef6f62b5de039a634bc292890396f69673006f3bfba25d9`
MD5	`0c93b942c12b1db83685741dcf3a0cb5`
BLAKE2b-256	`2950a903129789239ea8bbdc2dda3baf72404b7b6262413d1fab2431f8a6ddd2`

See more details on using hashes here.

claimbounded 0.2.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

title: claimbounded emoji: 🏥 colorFrom: blue colorTo: indigo sdk: gradio sdk_version: "6.19.0" app_file: app.py pinned: false license: mit short_description: Claim-bounded monitoring of AI-enabled medical devices

claimbounded

Try it now — no install required

Who Is This For?

Installation

Quick Start

Launch the interactive UI

Python API

CLI

The Five Primary Variables

Postmarket evaluability class

Authorization endpoint recoverability

The Claim Hierarchy

Precedent Retrieval

Interactive UI

① Profile & Report

② Corpus Search

③ Submission Lookup

Validation

Public API Reference

Design Principles

Disclaimer

Citation

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes