Semantic Consistency-Based Uncertainty Quantification for Radiology Report Generation

SCUQ-RRG

License: MIT

Code for the NAACL 2025 paper "Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation".

Install

pip install scuq-rrg

For full functionality (GREEN model and RadGraph):

git clone --recurse-submodules https://github.com/Heimerd1nger/SCUQ-RRG.git
cd SCUQ-RRG
pip install -e .
pip install -e third_party/GREEN/   # green_score (PyPI version has incompatible API)
pip install radgraph                # sentence-level UQ
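
After installing, a quick way to see which optional pieces are importable is to probe for the modules. This is a minimal sketch, not part of the package: `missing_modules` is a hypothetical helper, and the import names `green_score` and `radgraph` are assumed from the install comments above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names: scuq (core), green_score (GREEN), radgraph (sentence-level UQ).
print("Missing:", missing_modules(["scuq", "green_score", "radgraph"]))
```

An empty list suggests that both report-level and sentence-level scoring have their dependencies available.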

Usage

Report-Level Uncertainty (VRO-GREEN)

Measures report-level factual uncertainty by comparing a greedy-decoded report against multiple sampled reports using the GREEN metric.

from scuq import ReportUncertaintyScorer

scorer = ReportUncertaintyScorer(
    model_id_or_path="StanfordAIMI/GREEN-radllama2-7b",
    cuda=True,
)

# greedy_report: the reference (greedy-decoded) report
# sampled_reports: list of 10 stochastically sampled reports
greedy_report = "The lungs are clear. No pleural effusion. Cardiomediastinal silhouette is normal."
sampled_reports = [
    "Lungs are clear bilaterally. No effusion or pneumothorax.",
    "Clear lungs. Heart size normal. No acute findings.",
    # ... (typically 10 samples)
]

result = scorer.score(greedy_report, sampled_reports)
print(f"Uncertainty: {result.uncertainty:.3f}")   # e.g. 0.596
print(f"Mean GREEN:  {result.mean_green:.3f}")    # e.g. 0.404
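
The example values above sum to 1 (0.596 + 0.404), which is consistent with the uncertainty being one minus the mean GREEN score between the greedy report and the samples. The sketch below illustrates that aggregation under this assumption. `token_jaccard` is a toy stand-in for GREEN (the real metric needs the 7B model), and `report_uncertainty` is a hypothetical helper, not the package API.

```python
from statistics import mean


def token_jaccard(a: str, b: str) -> float:
    """Toy similarity in [0, 1]; a stand-in for GREEN, NOT the real metric."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def report_uncertainty(greedy_report, sampled_reports, similarity=token_jaccard):
    """Return (uncertainty, mean_similarity), assuming uncertainty = 1 - mean similarity."""
    if not sampled_reports:
        raise ValueError("need at least one sampled report")
    # Score each sample against the greedy report, then average.
    scores = [similarity(greedy_report, sample) for sample in sampled_reports]
    mean_score = mean(scores)
    return 1.0 - mean_score, mean_score
```

Under this reading, samples that agree closely with the greedy report give a high mean score and a low uncertainty.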

Sentence-Level Uncertainty (VRO-RadGraph)

Identifies the most uncertain sentence in a report using RadGraph entity consistency.

from scuq import SentenceUncertaintyScorer

scorer = SentenceUncertaintyScorer()

greedy_report = (
    "No pneumothorax. "
    "Possible left lower lobe opacity suggesting pneumonia. "
    "Mild cardiomegaly. "
    "No pleural effusion. "
    "Stable appearance compared to prior. "
    "No acute osseous abnormality."
)
sampled_reports = [
    "No pneumothorax or effusion. Heart size normal.",
    "Bilateral lungs clear. No acute findings.",
    # ...
]

result = scorer.score(greedy_report, sampled_reports)
# result.uncertainty_scores holds one score per sentence of greedy_report
# (0 = certain, 1 = uncertain), e.g. [0.05, 0.60, 0.80, 0.40, 0.28, 0.10]
print(f"Most uncertain: '{result.flagged_sentence}'")
print(f"Sentence scores: {[round(s, 2) for s in result.uncertainty_scores]}")
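
To make the relation between the per-sentence scores and the flagged sentence concrete, the sketch below picks the sentence with the highest score. It assumes the flagged sentence is simply the argmax of the scores and uses a naive punctuation-based splitter; both are assumptions for illustration, and `split_sentences` and `most_uncertain` are hypothetical helpers rather than package functions.

```python
import re


def split_sentences(report: str) -> list[str]:
    """Naive split after '.', '!' or '?'; not necessarily the package's own splitter."""
    parts = re.split(r"(?<=[.!?])\s+", report.strip())
    return [p.strip() for p in parts if p.strip()]


def most_uncertain(report: str, scores: list[float]) -> tuple[str, float]:
    """Return the sentence with the highest uncertainty score and that score."""
    sentences = split_sentences(report)
    if len(sentences) != len(scores):
        raise ValueError("expected one score per sentence")
    idx = max(range(len(scores)), key=scores.__getitem__)
    return sentences[idx], scores[idx]


report = (
    "No pneumothorax. "
    "Possible left lower lobe opacity suggesting pneumonia. "
    "Mild cardiomegaly. "
    "No pleural effusion. "
    "Stable appearance compared to prior. "
    "No acute osseous abnormality."
)
# Illustrative scores copied from the example comment above, not real model output.
example_scores = [0.05, 0.60, 0.80, 0.40, 0.28, 0.10]
sentence, score = most_uncertain(report, example_scores)
print(f"{sentence!r} -> {score:.2f}")
```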

Data Format

The experiments expect two parallel inputs, aligned by index:

  • greedy_reports: list of N strings (greedy-decoded reports)
  • sampled_reports: list of N lists, each with 10 sampled strings
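
A small shape check along these lines can catch misaligned inputs before any model is loaded. This is a sketch only: `validate_inputs` is a hypothetical helper, not part of the package, and the default of 10 samples per report is taken from the description above.

```python
def validate_inputs(greedy_reports, sampled_reports, n_samples=10):
    """Check that inputs match the expected format; return N on success."""
    if len(greedy_reports) != len(sampled_reports):
        raise ValueError(
            f"{len(greedy_reports)} greedy reports but "
            f"{len(sampled_reports)} sample lists"
        )
    for i, (greedy, samples) in enumerate(zip(greedy_reports, sampled_reports)):
        if not isinstance(greedy, str):
            raise TypeError(f"greedy_reports[{i}] is not a string")
        if len(samples) != n_samples:
            raise ValueError(
                f"sampled_reports[{i}] has {len(samples)} samples, expected {n_samples}"
            )
        if not all(isinstance(s, str) for s in samples):
            raise TypeError(f"sampled_reports[{i}] contains a non-string entry")
    return len(greedy_reports)
```

Calling `validate_inputs(greedy_reports, sampled_reports)` before scoring returns N when the inputs are well formed and raises otherwise.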

Citation

@inproceedings{wang2025semantic,
  title={Semantic consistency-based uncertainty quantification for factuality in radiology report generation},
  author={Wang, Chenyu and Zhou, Weichao and Ghosh, Shantanu and Batmanghelich, Kayhan and Li, Wenchao},
  booktitle={Findings of the Association for Computational Linguistics: NAACL 2025},
  pages={1739--1754},
  year={2025}
}
