Parse, structure, and export radiology free-text reports to FHIR
radreport-parser
Parse radiology free-text reports into structured data. No ML. No GPU. No dependencies.
Radiology reports come out as free-text PDFs. Downstream systems — EMRs, telehealth portals, billing platforms, research pipelines — need structured data. This library bridges that gap.
Three things it does well:
- Parse — splits any free-text report into labeled sections, extracts measurements, links findings to anatomy
- Detect — flags critical/urgent findings with negation awareness (no false alerts for "no pneumothorax")
- Export — outputs FHIR R4 DiagnosticReport resources ready for any EMR
Install
pip install radreport-parser
Zero required dependencies. Works on Python 3.9+.
Quick Start
from radreport_parser import ReportParser, CriticalFindingsDetector, FHIRExporter
import json
report_text = """
INDICATION: Chest pain, rule out PE.
FINDINGS:
Lungs: Filling defect in the right main pulmonary artery consistent with
pulmonary embolism. No pneumothorax.
IMPRESSION:
Pulmonary embolism, right main pulmonary artery. Urgent correlation recommended.
"""
# 1. Parse
parser = ReportParser()
report = parser.parse(report_text, modality="CT")
print(report.impression)
# → "Pulmonary embolism, right main pulmonary artery. Urgent correlation recommended."
# 2. Detect critical findings
detector = CriticalFindingsDetector()
report = detector.detect(report)
for cf in report.critical_findings:
    if not cf.negated:
        print(f"[{cf.severity.upper()}] {cf.term} ({cf.category})")
        print(f" Context: {cf.context}")
# → [CRITICAL] pulmonary embolism (pulmonary)
# Context: Filling defect in the right main pulmonary artery consistent with pulmonary embolism.
# 3. Export to FHIR
exporter = FHIRExporter()
fhir = exporter.export(report, patient_id="pt-001")
print(json.dumps(fhir, indent=2))
CLI
After installation, the radreport command is available for single-file and batch processing:
# Parse a single report to JSON
radreport report.txt
# Parse with critical findings detection
radreport report.txt --critical
# Export as FHIR DiagnosticReport
radreport report.txt --fhir --patient-id pt-001 --modality CT
# Batch process multiple files → JSON array
radreport reports/*.txt --critical -o batch.json
# Specify modality for all files
radreport *.txt --modality MRI --fhir -o fhir_batch.json
Flags:
| Flag | Short | Description |
|---|---|---|
| --modality MOD | -m | CT, MRI, XR, US, NM, PET … |
| --critical | -c | Run critical findings detection |
| --fhir | -f | Export as FHIR R4 DiagnosticReport (implies --critical) |
| --patient-id ID | | FHIR Patient resource ID |
| --output FILE | -o | Write output to file instead of stdout |
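To make the flag semantics concrete, here is a minimal sketch of how the options in the table could be declared with the standard-library argparse module. This is not the package's actual CLI source: the function names `build_arg_parser` and `parse_cli` are hypothetical, and only the flags and the "--fhir implies --critical" rule come from the table.

```python
import argparse


def build_arg_parser() -> argparse.ArgumentParser:
    """Hypothetical re-creation of the documented flags (not the package's source)."""
    p = argparse.ArgumentParser(prog="radreport")
    p.add_argument("files", nargs="+", help="one or more report text files")
    p.add_argument("-m", "--modality", metavar="MOD", help="CT, MRI, XR, US, NM, PET ...")
    p.add_argument("-c", "--critical", action="store_true",
                   help="run critical findings detection")
    p.add_argument("-f", "--fhir", action="store_true",
                   help="export as FHIR R4 DiagnosticReport")
    p.add_argument("--patient-id", metavar="ID", help="FHIR Patient resource ID")
    p.add_argument("-o", "--output", metavar="FILE",
                   help="write output to file instead of stdout")
    return p


def parse_cli(argv: list) -> argparse.Namespace:
    args = build_arg_parser().parse_args(argv)
    # The flags table states that --fhir implies --critical.
    if args.fhir:
        args.critical = True
    return args


args = parse_cli(["report.txt", "--fhir", "--patient-id", "pt-001", "-m", "CT"])
print(args.critical, args.patient_id, args.modality)
```

Resolving the implication once, right after parsing, means the rest of a program only has to check `args.critical`.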
Parsing
Sections
The parser recognizes standard radiology report sections regardless of formatting style:
| Section key | Matched headers |
|---|---|
| indication | Indication, Clinical Indication, History, Reason for Exam |
| technique | Technique, Procedure, Protocol |
| comparison | Comparison, Prior Study, Previous |
| findings | Findings, Observations |
| impression | Impression, Conclusion, Assessment, Diagnosis |
| recommendation | Recommendation, Follow-up, Advised |
report = parser.parse(text, modality="MRI")
findings = report.get_section("findings")
print(findings.raw_text)
impression = report.get_section("impression")
print(impression.raw_text)
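As a rough illustration of header-based section splitting, the sketch below maps the header aliases from the table to section keys with a single regular expression. It is an assumption about how such matching could work, not the library's implementation; `SECTION_ALIASES`, `HEADER_RE`, and `split_sections` are hypothetical names, and the sketch only handles headers that end in a colon.

```python
import re

# Header aliases copied from the table above; the matching logic is an assumption.
SECTION_ALIASES = {
    "indication": ["Indication", "Clinical Indication", "History", "Reason for Exam"],
    "technique": ["Technique", "Procedure", "Protocol"],
    "comparison": ["Comparison", "Prior Study", "Previous"],
    "findings": ["Findings", "Observations"],
    "impression": ["Impression", "Conclusion", "Assessment", "Diagnosis"],
    "recommendation": ["Recommendation", "Follow-up", "Advised"],
}
ALIAS_TO_KEY = {
    alias.lower(): key for key, aliases in SECTION_ALIASES.items() for alias in aliases
}

# Longest aliases first, so "Clinical Indication" is tried before "Indication".
_alternation = "|".join(
    re.escape(alias) for alias in sorted(ALIAS_TO_KEY, key=len, reverse=True)
)
HEADER_RE = re.compile(
    rf"^[ \t]*({_alternation})[ \t]*:\s*", re.IGNORECASE | re.MULTILINE
)


def split_sections(text: str) -> dict:
    """Return {section_key: body} for every recognized 'Header:' line."""
    matches = list(HEADER_RE.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        # A section body runs until the next recognized header (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        key = ALIAS_TO_KEY[m.group(1).lower()]
        sections[key] = " ".join(text[m.end():end].split())
    return sections


sample = """
INDICATION: Chest pain, rule out PE.
FINDINGS:
Lungs: Filling defect in the right main pulmonary artery consistent with
pulmonary embolism. No pneumothorax.
IMPRESSION:
Pulmonary embolism, right main pulmonary artery. Urgent correlation recommended.
"""
print(split_sections(sample)["indication"])
# → Chest pain, rule out PE.
```

Labels such as "Lungs:" are not in the alias table, so they stay inside the findings body instead of opening a new section.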
Measurements
All measurements are extracted and normalized to millimeters:
for m in report.all_measurements:
    print(f" Raw: {m.raw}")
    print(f" Normalized (mm): {m.dimensions_mm}")
    print(f" Largest dimension: {m.largest_dimension_mm} mm")
# Raw: 2.3 x 1.8 cm
# Normalized (mm): [23.0, 18.0]
# Largest dimension: 23.0 mm
Handles: 1.2 x 0.8 cm, 12mm, 1.2cm, 12 x 8 x 5 mm, 1.2 x 0.8 x 0.5 cm
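The normalization step can be sketched with one regular expression that accepts one to three numbers separated by "x" and followed by a unit. This is an illustrative sketch under that assumption, not the library's extraction code; `MEASUREMENT_RE` and `extract_measurements_mm` are hypothetical names.

```python
import re

# One to three numbers separated by "x", then a unit: "1.2 x 0.8 cm", "12mm", ...
MEASUREMENT_RE = re.compile(
    r"(\d+(?:\.\d+)?(?:\s*x\s*\d+(?:\.\d+)?){0,2})\s*(mm|cm)\b",
    re.IGNORECASE,
)
TO_MM = {"mm": 1.0, "cm": 10.0}


def extract_measurements_mm(text: str) -> list:
    """Return one list of dimensions (in millimeters) per measurement found."""
    results = []
    for m in MEASUREMENT_RE.finditer(text):
        factor = TO_MM[m.group(2).lower()]
        values = re.findall(r"\d+(?:\.\d+)?", m.group(1))
        # Round to hide floating-point noise from the unit conversion.
        results.append([round(float(v) * factor, 2) for v in values])
    return results


found = extract_measurements_mm("Nodule measuring 2.3 x 1.8 cm; cyst 12mm.")
print(found)
# → [[23.0, 18.0], [12.0]]
print(max(found[0]))
# → 23.0
```

Because the unit applies to the whole group, "2.3 x 1.8 cm" converts both dimensions, and the largest dimension is simply the maximum of the converted list.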
Findings by anatomy
findings_section = report.get_section("findings")
for finding in findings_section.findings:
    print(f"Anatomy: {finding.anatomy or 'unspecified'}")
    print(f"Text: {finding.text}")
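One simple way to link findings to anatomy is to treat a leading "Label:" on a line as the anatomy and fold unlabeled continuation lines into the previous finding. The sketch below assumes that convention; it is not how the library necessarily does it, and `ANATOMY_PREFIX_RE` and `link_findings` are hypothetical names.

```python
import re
from typing import List, Optional, Tuple

# Assumed convention: a line such as "Lungs: ..." opens a finding for that anatomy.
ANATOMY_PREFIX_RE = re.compile(r"^([A-Z][A-Za-z /]{1,40}):\s*(\S.*)$")


def link_findings(findings_text: str) -> List[Tuple[Optional[str], str]]:
    """Return (anatomy, text) pairs; anatomy is None when no label was seen."""
    findings: List[Tuple[Optional[str], str]] = []
    for raw_line in findings_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        m = ANATOMY_PREFIX_RE.match(line)
        if m:
            findings.append((m.group(1).strip(), m.group(2)))
        elif findings:
            # Unlabeled line: treat it as a continuation of the previous finding.
            anatomy, text = findings[-1]
            findings[-1] = (anatomy, f"{text} {line}")
        else:
            findings.append((None, line))
    return findings


body = """Lungs: Filling defect in the right main pulmonary artery consistent with
pulmonary embolism. No pneumothorax.
Heart: Normal size."""
for anatomy, text in link_findings(body):
    print(f"{anatomy or 'unspecified'}: {text}")
```

A prefix rule like this misses anatomy mentioned mid-sentence, so a real parser would likely combine it with an anatomy vocabulary.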
Batch processing
reports = parser.parse_batch(list_of_texts, modality="CT")
# Returns list[ParsedReport | None] — None for empty/unparseable inputs
active = [r for r in reports if r is not None]
JSON serialization
report = parser.parse(text, modality="CT")
# As dict
d = report.to_dict()
# As JSON string (shorthand)
json_str = report.to_json()
json_str = report.to_json(indent=4)
Critical Findings Detection
Rule-based. Fully auditable. No black boxes.
Covers 45+ terms across 8 categories:
| Category | Examples |
|---|---|
| vascular | aortic dissection, DVT, aortic aneurysm |
| pulmonary | pulmonary embolism, PE, pneumothorax, hemothorax |
| neuro | subdural hematoma, midline shift, intracranial hemorrhage |
| abdominal | free air, bowel perforation, appendicitis |
| cardiac | cardiac tamponade, pericardial effusion |
| spinal | cord compression, cervical fracture |
| oncologic | malignancy, metastasis, carcinoma |
Negation awareness
# "No pneumothorax identified" → negated=True, won't trigger alert
# "Pneumothorax present" → negated=False, triggers alert
active = [cf for cf in report.critical_findings if not cf.negated]
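For intuition, a rule-based negation check can be sketched in the style of NegEx: look for a negation trigger in the few words that precede the term within the same sentence. The trigger list, the five-word window, and the names `TRIGGER_RE` and `find_term` are assumptions made for this sketch, not the library's actual rules.

```python
import re

# Hypothetical trigger list; real NegEx-style lexicons are considerably larger.
TRIGGER_RE = re.compile(r"\b(?:no|not|without|negative for|absence of)\b", re.IGNORECASE)
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def find_term(text: str, term: str, window: int = 5) -> list:
    """Return (sentence, negated) for every sentence that mentions `term`."""
    hits = []
    term_re = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    for sentence in SENTENCE_SPLIT_RE.split(text.strip()):
        m = term_re.search(sentence)
        if not m:
            continue
        # Only the few words directly before the term are allowed to negate it.
        preceding = " ".join(sentence[:m.start()].split()[-window:])
        hits.append((sentence, bool(TRIGGER_RE.search(preceding))))
    return hits


text = "No pneumothorax identified. Pneumothorax present on the left."
for sentence, negated in find_term(text, "pneumothorax"):
    print(negated, "|", sentence)
# → True | No pneumothorax identified.
# → False | Pneumothorax present on the left.
```

Limiting the search to the same sentence and a short window keeps a "no" in one clause from silencing a positive finding elsewhere in the report.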
Severity levels
- critical: requires immediate action (PE, subdural hematoma, pneumothorax)
- urgent: requires same-day follow-up (DVT, bowel obstruction, appendicitis)
- significant: requires follow-up (malignancy, metastasis)
Extending the term list
from radreport_parser.critical_findings import CRITICAL_TERMS
CRITICAL_TERMS["tension pneumothorax"] = ("pulmonary", "critical")
CRITICAL_TERMS["septic emboli"] = ("vascular", "urgent")
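To show how a term table of this shape can drive detection, the sketch below matches terms against text, lets longer terms win over shorter ones they contain, and sorts hits by severity. Only the `term -> (category, severity)` shape comes from the entries above; the small `TERMS` table, `SEVERITY_RANK`, and `match_terms` are hypothetical and do not reproduce the library's matching logic.

```python
import re

# Same shape as the CRITICAL_TERMS entries above: term -> (category, severity).
# This small table is illustrative, not the library's actual list.
TERMS = {
    "pulmonary embolism": ("pulmonary", "critical"),
    "pneumothorax": ("pulmonary", "critical"),
    "tension pneumothorax": ("pulmonary", "critical"),
    "dvt": ("vascular", "urgent"),
    "metastasis": ("oncologic", "significant"),
}
SEVERITY_RANK = {"critical": 0, "urgent": 1, "significant": 2}


def match_terms(text: str, terms: dict) -> list:
    """Return (term, category, severity) tuples, most severe first."""
    found = []
    claimed = []  # character spans already matched by a longer term
    for term in sorted(terms, key=len, reverse=True):
        for m in re.finditer(rf"\b{re.escape(term)}\b", text, re.IGNORECASE):
            if any(m.start() < end and start < m.end() for start, end in claimed):
                continue  # e.g. "pneumothorax" inside "tension pneumothorax"
            claimed.append((m.start(), m.end()))
            category, severity = terms[term]
            found.append((term, category, severity))
    return sorted(found, key=lambda hit: SEVERITY_RANK[hit[2]])


report = "Metastasis to the liver. Left leg DVT. Tension pneumothorax on the right."
for term, category, severity in match_terms(report, TERMS):
    print(f"[{severity.upper()}] {term} ({category})")
# → [CRITICAL] tension pneumothorax (pulmonary)
# → [URGENT] dvt (vascular)
# → [SIGNIFICANT] metastasis (oncologic)
```

Matching longest terms first matters once the list is extended: without it, adding "tension pneumothorax" would produce two hits for the same phrase.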
FHIR Export
Outputs a valid FHIR R4 DiagnosticReport resource.
from datetime import datetime
fhir = exporter.export(
    report,
    patient_id="pt-001",        # Optional: links to FHIR Patient resource
    report_id="rpt-20240315",   # Optional: custom resource ID
    issued_dt=datetime.now(),   # Optional: defaults to UTC now
)
What's included
- resourceType: DiagnosticReport
- status: final
- code: LOINC code matched to modality (CT, MRI, US, etc.)
- conclusion: impression text
- presentedForm: full report text as base64 attachment
- contained: FHIR Observations for each active (non-negated) critical finding
- extension: structured sections for downstream parsing
- subject: patient reference (when patient_id provided)
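The core of such a resource can be sketched with plain dictionaries. The example below builds only a subset of the fields listed above (it omits contained Observations and extensions), uses the generic LOINC code 18748-4 ("Diagnostic imaging study") as a placeholder instead of a modality-specific code, and the function name `build_diagnostic_report` is hypothetical. It is not the exporter's actual output.

```python
import base64
import json
from datetime import datetime, timezone
from typing import Optional


def build_diagnostic_report(report_text: str, impression: str,
                            patient_id: Optional[str] = None) -> dict:
    """Assemble a minimal FHIR R4 DiagnosticReport as a plain dict (sketch only)."""
    resource = {
        "resourceType": "DiagnosticReport",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                # Placeholder: generic imaging code, not matched to a modality.
                "code": "18748-4",
                "display": "Diagnostic imaging study",
            }]
        },
        # FHIR "instant" values need a timezone, so use an aware UTC timestamp.
        "issued": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "conclusion": impression,
        "presentedForm": [{
            "contentType": "text/plain",
            "data": base64.b64encode(report_text.encode("utf-8")).decode("ascii"),
        }],
    }
    if patient_id:
        resource["subject"] = {"reference": f"Patient/{patient_id}"}
    return resource


example = build_diagnostic_report(
    "IMPRESSION: Pulmonary embolism.", "Pulmonary embolism.", patient_id="pt-001"
)
print(json.dumps(example, indent=2))
```

Keeping the full text in presentedForm means a consumer that ignores the structured fields can still display the original report.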
Full Pipeline Example
import json
from radreport_parser import ReportParser, CriticalFindingsDetector, FHIRExporter
parser = ReportParser()
detector = CriticalFindingsDetector()
exporter = FHIRExporter()
def process_report(text: str, modality: str, patient_id: str) -> dict:
    report = parser.parse(text, modality=modality)
    report = detector.detect(report)
    active_criticals = [cf for cf in report.critical_findings if not cf.negated]
    if active_criticals:
        print(f"WARNING: {len(active_criticals)} critical finding(s) detected")
    return exporter.export(report, patient_id=patient_id)
fhir_json = process_report(report_text, modality="CT", patient_id="pt-001")
print(json.dumps(fhir_json, indent=2))
See full_pipeline.py for a runnable end-to-end example.
Design Principles
No dependencies. The library installs with no third-party packages. This matters in hospital environments where every dependency goes through security review.
Rule-based, not ML-based. Every decision the library makes is traceable to a specific rule. No model weights, no GPU, no probabilistic outputs. Clinical teams can audit exactly why a finding was flagged.
Negation-aware. A library that can't distinguish "no pneumothorax" from "pneumothorax" is dangerous in clinical contexts. Negation detection is built into the core.
FHIR-first output. Most modern EMRs support FHIR. The export format is designed to drop into existing FHIR integrations without further transformation.
Running Tests
pip install "radreport-parser[dev]"
pytest tests/ -v
Roadmap
- CLI tool for single-file and batch processing (radreport command)
- parse_batch() API for processing lists of reports
- to_json() convenience method on ParsedReport
- Template matching for common report types (Chest XR, CT Abdomen, MRI Brain)
- Structured output for follow-up recommendations
- Additional FHIR resource types (ImagingStudy, Condition)
- CSV export mode for research/analytics workflows
Disclaimer
This library is a developer tool for structuring report text. It is not a medical device and is not intended for direct clinical decision-making. Critical findings detection is designed to assist human review workflows, not replace radiologist judgment.
License
MIT