Semantic Consistency-Based Uncertainty Quantification for Radiology Report Generation
SCUQ-RRG
Code for the NAACL 2025 paper "Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation".
Install
pip install scuq-rrg
For full functionality (GREEN model and RadGraph):
git clone --recurse-submodules https://github.com/Heimerd1nger/SCUQ-RRG.git
cd SCUQ-RRG
pip install -e .
pip install -e third_party/GREEN/ # green_score (PyPI version has incompatible API)
pip install radgraph # sentence-level UQ
Usage
Report-Level Uncertainty (VRO-GREEN)
Measures report-level factual uncertainty by comparing a greedy-decoded report against multiple sampled reports using the GREEN metric.
from scuq import ReportUncertaintyScorer
scorer = ReportUncertaintyScorer(
    model_id_or_path="StanfordAIMI/GREEN-radllama2-7b",
    cuda=True,
)
# greedy_report: the reference (greedy-decoded) report
# sampled_reports: list of 10 stochastically sampled reports
greedy_report = "The lungs are clear. No pleural effusion. Cardiomediastinal silhouette is normal."
sampled_reports = [
    "Lungs are clear bilaterally. No effusion or pneumothorax.",
    "Clear lungs. Heart size normal. No acute findings.",
    # ... (typically 10 samples)
]
result = scorer.score(greedy_report, sampled_reports)
print(f"Uncertainty: {result.uncertainty:.3f}") # e.g. 0.596
print(f"Mean GREEN: {result.mean_green:.3f}") # e.g. 0.404
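The example values above sum to 1 (0.596 + 0.404), which is consistent with the uncertainty being one minus the mean GREEN score between the greedy report and each sample. The following is a minimal sketch of that aggregation, assuming this relationship holds. It substitutes a toy word-overlap (Jaccard) similarity for the GREEN model so it runs without a GPU; `jaccard_similarity` and `vro_uncertainty` are hypothetical names and are not part of the `scuq` API.

```python
import re
from statistics import mean


def jaccard_similarity(a: str, b: str) -> float:
    """Toy stand-in for GREEN: word-set overlap in [0, 1]."""
    words_a = set(re.findall(r"[a-z]+", a.lower()))
    words_b = set(re.findall(r"[a-z]+", b.lower()))
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def vro_uncertainty(greedy, samples, similarity=jaccard_similarity):
    """Return (uncertainty, mean_similarity) for one greedy report.

    Assumption: uncertainty = 1 - mean similarity to the sampled reports.
    """
    if not samples:
        raise ValueError("need at least one sampled report")
    mean_sim = mean(similarity(greedy, s) for s in samples)
    return 1.0 - mean_sim, mean_sim


uncertainty, mean_sim = vro_uncertainty(
    "The lungs are clear. No pleural effusion.",
    [
        "Lungs are clear bilaterally. No effusion.",
        "Clear lungs. No acute findings.",
    ],
)
print(f"Uncertainty: {uncertainty:.3f}  Mean similarity: {mean_sim:.3f}")
```

Swapping in a stronger similarity function changes the scores but not the aggregation: samples that agree with the greedy report push the uncertainty toward 0, and samples that disagree push it toward 1.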
Sentence-Level Uncertainty (VRO-RadGraph)
Identifies the most uncertain sentence in a report using RadGraph entity consistency.
from scuq import SentenceUncertaintyScorer
scorer = SentenceUncertaintyScorer()
greedy_report = (
    "No pneumothorax. "
    "Possible left lower lobe opacity suggesting pneumonia. "
    "Mild cardiomegaly. "
    "No pleural effusion. "
    "Stable appearance compared to prior. "
    "No acute osseous abnormality."
)
sampled_reports = [
    "No pneumothorax or effusion. Heart size normal.",
    "Bilateral lungs clear. No acute findings.",
    # ...
]
result = scorer.score(greedy_report, sampled_reports)
# Per-sentence uncertainty scores (0 = certain, 1 = uncertain):
# [0.05, 0.60, 0.80, 0.40, 0.28, 0.10]
print(f"Most uncertain: '{result.flagged_sentence}'")
print(f"Sentence scores: {[round(s, 2) for s in result.uncertainty_scores]}")
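To make the link between the per-sentence scores and the flagged sentence concrete, here is a small sketch that pairs the illustrative scores from the comment above with the sentences of the greedy report and picks the highest one. It assumes the flagged sentence is simply the one with the largest score and that sentences are split on periods; the scorer's actual splitting and tie-breaking may differ.

```python
import re

greedy_report = (
    "No pneumothorax. "
    "Possible left lower lobe opacity suggesting pneumonia. "
    "Mild cardiomegaly. "
    "No pleural effusion. "
    "Stable appearance compared to prior. "
    "No acute osseous abnormality."
)
# Illustrative scores copied from the comment in the example above.
uncertainty_scores = [0.05, 0.60, 0.80, 0.40, 0.28, 0.10]

# Assumption: sentences are split on whitespace that follows a period.
sentences = re.split(r"(?<=\.)\s+", greedy_report.strip())
if len(sentences) != len(uncertainty_scores):
    raise ValueError("expected one score per sentence")

# Rank sentences from most to least uncertain.
ranked = sorted(zip(uncertainty_scores, sentences), reverse=True)
flagged_score, flagged_sentence = ranked[0]
print(f"Most uncertain: '{flagged_sentence}' ({flagged_score:.2f})")
```

With these scores the third sentence, "Mild cardiomegaly.", is the one flagged for review.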
Data Format
Experiments expect:
greedy_reports: list of N strings (greedy-decoded reports)
sampled_reports: list of N lists, each with 10 sampled strings
See example/example_data.ipynb for the exact pickle/CSV format used in experiments.
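Before running experiments it can help to check that your data matches this layout. The helper below is a hypothetical sketch (it is not part of the package) that validates the two structures described above; the default of 10 samples per report follows the description and can be changed.

```python
def validate_inputs(greedy_reports, sampled_reports, samples_per_report=10):
    """Raise ValueError if the inputs do not match the described layout.

    Returns N, the number of reports, when everything checks out.
    """
    if len(greedy_reports) != len(sampled_reports):
        raise ValueError(
            f"{len(greedy_reports)} greedy reports but "
            f"{len(sampled_reports)} sample lists"
        )
    for i, (greedy, samples) in enumerate(zip(greedy_reports, sampled_reports)):
        if not isinstance(greedy, str):
            raise ValueError(f"greedy_reports[{i}] is not a string")
        if len(samples) != samples_per_report:
            raise ValueError(
                f"sampled_reports[{i}] has {len(samples)} samples, "
                f"expected {samples_per_report}"
            )
        if not all(isinstance(s, str) for s in samples):
            raise ValueError(f"sampled_reports[{i}] contains a non-string")
    return len(greedy_reports)


# Toy data with N = 2 reports and 10 samples each.
greedy_reports = ["Lungs are clear.", "Mild cardiomegaly."]
sampled_reports = [["Clear lungs."] * 10, ["Heart size is enlarged."] * 10]
print(validate_inputs(greedy_reports, sampled_reports))
```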
Demos
example/VRO_GREEN_demo.ipynb: report-level UQ walkthrough
example/VRO_Rad_demo.ipynb: sentence-level UQ walkthrough
example/quickstart.ipynb: end-to-end quickstart
Running Experiments
Report Scores
python -m src.uq.VroGreen \
    --exp_name chexpert-plus \
    --chexpert_file_path data/batch_chexpert_mimix_cxr_num3858.pkl \
    --output_base_path results \
    --num_samples 3858 --batch_size 16
Sentence UQ
python -m src.uq.VroRadSent \
    --exp CheXpertPlus_mimiccxr \
    --chexpert_file data/batch_chexpert_mimix_cxr_num3858.pkl \
    --num_samples 3858 --output_dir results/exp_result
Abstention
python src/abstention/report_abstention.py \
    --exp ChexpertPlus \
    --green_scores_path data/green_scores-3858.pkl \
    --green_uncertainty_path results/chexpert-plus/green_uncertainty-3858.csv \
    --u_lexicalsim_path data/uq/lexicalUQ.csv \
    --output_base_path results
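The internals of the abstention script are not shown here. As a rough illustration of the general idea, the sketch below assumes abstention means withholding the most uncertain reports and measuring the mean factuality score of those that remain. The function name, the toy numbers, and the rounding rule are all hypothetical and may not match what `report_abstention.py` does.

```python
def mean_score_after_abstention(scores, uncertainties, abstain_fraction):
    """Mean score of the reports kept after dropping the most uncertain ones.

    scores:            per-report factuality scores (higher is better)
    uncertainties:     per-report uncertainty values (higher is less certain)
    abstain_fraction:  share of reports to withhold, in [0, 1)
    """
    if not scores or len(scores) != len(uncertainties):
        raise ValueError("scores and uncertainties must be non-empty and equal length")
    if not 0.0 <= abstain_fraction < 1.0:
        raise ValueError("abstain_fraction must be in [0, 1)")
    # Sort report indices from most certain to least certain.
    order = sorted(range(len(scores)), key=lambda i: uncertainties[i])
    # Always keep at least one report.
    n_keep = max(1, round(len(scores) * (1.0 - abstain_fraction)))
    kept = order[:n_keep]
    return sum(scores[i] for i in kept) / n_keep


# Toy values, not experimental results.
scores = [0.9, 0.2, 0.7, 0.4]
uncertainties = [0.1, 0.8, 0.3, 0.6]
for fraction in (0.0, 0.25, 0.5):
    kept_mean = mean_score_after_abstention(scores, uncertainties, fraction)
    print(f"abstain {fraction:.0%}: mean score {kept_mean:.3f}")
```

If the uncertainty measure is informative, the mean score of the kept reports should rise as the abstention fraction grows, as it does for the toy values here.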
Calibration (RCE)
python src/misc/cal_rce.py \
    --scores_path data/green_scores-3858.pkl \
    --green_uncertainty_path results/chexpert-plus/green_uncertainty-3858.pkl \
    --u_nll_path data/uq/u_nll.csv \
    --u_lexicalsim_path data/uq/lexicalUQ.csv
Citation
@inproceedings{wang2025semantic,
  title={Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation},
  author={Wang, Chenyu and Bhatt, Parth and Shrivastava, Harshit and Bittencourt, Lucas and Kalra, Mannudeep K. and Gichoya, Judy W. and Celi, Leo Anthony and Peng, Yuyin and Patel, Bhavik N. and Trivedi, Hari},
  booktitle={Proceedings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year={2025}
}