Semantic Consistency-Based Uncertainty Quantification for Radiology Report Generation

SCUQ-RRG

License: MIT

Code for the NAACL 2025 paper "Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation".

Install

pip install scuq-rrg

For full functionality (the GREEN model and RadGraph), install from source:

git clone --recurse-submodules https://github.com/Heimerd1nger/SCUQ-RRG.git
cd SCUQ-RRG
pip install -e .
pip install -e third_party/GREEN/   # green_score from the submodule (the PyPI version has an incompatible API)
pip install radgraph                # required for sentence-level uncertainty

Usage

Report-Level Uncertainty (VRO-GREEN)

Measures report-level factual uncertainty by comparing a greedy-decoded report against multiple sampled reports using the GREEN metric.

from scuq import ReportUncertaintyScorer

scorer = ReportUncertaintyScorer(
    model_id_or_path="StanfordAIMI/GREEN-radllama2-7b",
    cuda=True,
)

# greedy_report: the reference (greedy-decoded) report
# sampled_reports: list of 10 stochastically sampled reports
greedy_report = "The lungs are clear. No pleural effusion. Cardiomediastinal silhouette is normal."
sampled_reports = [
    "Lungs are clear bilaterally. No effusion or pneumothorax.",
    "Clear lungs. Heart size normal. No acute findings.",
    # ... (typically 10 samples)
]

result = scorer.score(greedy_report, sampled_reports)
print(f"Uncertainty: {result.uncertainty:.3f}")   # e.g. 0.596
print(f"Mean GREEN:  {result.mean_green:.3f}")    # e.g. 0.404
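The example values above sum to 1 (0.596 + 0.404), which suggests that the report-level uncertainty is one minus the mean GREEN score between the greedy report and each sample. That relationship is an inference from the example output, not something this README states. The sketch below illustrates the idea under that assumption. `token_jaccard` is a toy stand-in for GREEN, and both function names are hypothetical; neither is part of the `scuq` package.

```python
from statistics import mean
from typing import Callable, Sequence


def token_jaccard(reference: str, candidate: str) -> float:
    """Toy similarity in [0, 1]. A stand-in for GREEN, NOT the real metric."""
    ref = set(reference.lower().replace(".", " ").split())
    cand = set(candidate.lower().replace(".", " ").split())
    if not ref and not cand:
        return 1.0
    return len(ref & cand) / len(ref | cand)


def consistency_uncertainty(
    greedy_report: str,
    sampled_reports: Sequence[str],
    similarity: Callable[[str, str], float] = token_jaccard,
) -> tuple[float, float]:
    """Return (uncertainty, mean_similarity) for one greedy report.

    Assumption: uncertainty = 1 - mean similarity to the sampled reports.
    """
    if not sampled_reports:
        raise ValueError("need at least one sampled report")
    mean_sim = mean(similarity(greedy_report, s) for s in sampled_reports)
    return 1.0 - mean_sim, mean_sim


uncertainty, mean_sim = consistency_uncertainty(
    "The lungs are clear.",
    ["The lungs are clear.", "Heart size normal."],
)
print(f"Uncertainty: {uncertainty:.3f}  Mean similarity: {mean_sim:.3f}")
```

Passing the similarity function as a parameter keeps the aggregation separate from the metric, so a real GREEN scorer could be swapped in without changing the averaging logic.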

Sentence-Level Uncertainty (VRO-RadGraph)

Identifies the most uncertain sentence in a report using RadGraph entity consistency.

from scuq import SentenceUncertaintyScorer

scorer = SentenceUncertaintyScorer()

greedy_report = (
    "No pneumothorax. "
    "Possible left lower lobe opacity suggesting pneumonia. "
    "Mild cardiomegaly. "
    "No pleural effusion. "
    "Stable appearance compared to prior. "
    "No acute osseous abnormality."
)
sampled_reports = [
    "No pneumothorax or effusion. Heart size normal.",
    "Bilateral lungs clear. No acute findings.",
    # ...
]

result = scorer.score(greedy_report, sampled_reports)
# Per-sentence uncertainty scores (0 = certain, 1 = uncertain), one per sentence of greedy_report,
# e.g. [0.05, 0.60, 0.80, 0.40, 0.28, 0.10]
print(f"Most uncertain: '{result.flagged_sentence}'")
print(f"Sentence scores: {[round(s, 2) for s in result.uncertainty_scores]}")
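To make the link between the per-sentence scores and the flagged sentence concrete, here is a minimal sketch that splits a report into sentences and picks the one with the highest score. It assumes the flagged sentence is simply the argmax over the scores, which the README does not confirm. `split_sentences` and `most_uncertain` are hypothetical helpers, not `scuq` functions, and the scores are the illustrative values from the comment above.

```python
import re
from typing import Sequence


def split_sentences(report: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", report)
    return [p.strip() for p in parts if p.strip()]


def most_uncertain(
    sentences: Sequence[str], scores: Sequence[float]
) -> tuple[str, float]:
    """Return the sentence with the highest uncertainty score (assumed argmax)."""
    if not sentences or len(sentences) != len(scores):
        raise ValueError("sentences and scores must be non-empty and aligned")
    idx = max(range(len(scores)), key=scores.__getitem__)
    return sentences[idx], scores[idx]


report = (
    "No pneumothorax. "
    "Possible left lower lobe opacity suggesting pneumonia. "
    "Mild cardiomegaly. "
    "No pleural effusion. "
    "Stable appearance compared to prior. "
    "No acute osseous abnormality."
)
example_scores = [0.05, 0.60, 0.80, 0.40, 0.28, 0.10]  # illustrative values only

sentence, score = most_uncertain(split_sentences(report), example_scores)
print(f"Most uncertain: '{sentence}' ({score:.2f})")
```

The regex splitter is deliberately simple and will mis-split abbreviations such as "approx. 3 cm"; a real pipeline would likely need a more careful sentence segmenter.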

Data Format

Experiments expect two inputs, aligned by index:

  • greedy_reports: list of N strings (one greedy-decoded report per study)
  • sampled_reports: list of N lists, one per greedy report, each containing 10 sampled report strings
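A quick shape check can catch misaligned inputs before running a costly scoring job. The sketch below validates the layout described above; `validate_report_inputs` is a hypothetical helper, not part of the `scuq` package, and the default of 10 samples is taken from this section.

```python
from typing import Sequence


def validate_report_inputs(
    greedy_reports: Sequence[str],
    sampled_reports: Sequence[Sequence[str]],
    n_samples: int = 10,
) -> None:
    """Raise if the inputs do not match the expected N x n_samples layout."""
    if len(greedy_reports) != len(sampled_reports):
        raise ValueError(
            f"{len(greedy_reports)} greedy reports but "
            f"{len(sampled_reports)} sample lists"
        )
    for i, (greedy, samples) in enumerate(zip(greedy_reports, sampled_reports)):
        if not isinstance(greedy, str):
            raise TypeError(f"greedy_reports[{i}] is not a string")
        if isinstance(samples, str):
            raise TypeError(f"sampled_reports[{i}] must be a list of strings")
        if len(samples) != n_samples:
            raise ValueError(
                f"sampled_reports[{i}] has {len(samples)} samples, "
                f"expected {n_samples}"
            )
        if not all(isinstance(s, str) for s in samples):
            raise TypeError(f"sampled_reports[{i}] contains a non-string entry")


greedy_reports = ["The lungs are clear.", "Mild cardiomegaly."]
sampled_reports = [["Lungs are clear."] * 10, ["Heart is enlarged."] * 10]
validate_report_inputs(greedy_reports, sampled_reports)  # passes silently
```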

Citation

@inproceedings{wang2025semantic,
  title={Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation},
  author={Wang, Chenyu and Bhatt, Parth and Shrivastava, Harshit and Bittencourt, Lucas and Kalra, Mannudeep K. and Gichoya, Judy W. and Celi, Leo Anthony and Peng, Yuyin and Patel, Bhavik N. and Trivedi, Hari},
  booktitle={Proceedings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year={2025}
}
