# parsemedicalexams

Extract and summarize medical exam reports (X-rays, MRIs, ultrasounds, and more) from PDFs using Vision AI.

Features · Quick Start · Configuration · Output Format
## Features

- **Vision-powered extraction**: uses Vision LLMs to read X-rays, MRIs, ultrasounds, endoscopies, and more directly from PDF scans
- **Self-consistency voting**: runs multiple extractions and votes on the best result for maximum reliability
- **Intelligent classification**: automatically categorizes exams (imaging, ultrasound, endoscopy, other) and standardizes naming
- **Clinical summarization**: preserves all findings, impressions, and recommendations while filtering noise
- **Markdown output with YAML frontmatter**: clean, structured files ready for Obsidian, static sites, or further processing
- **Smart caching**: persistent JSON caches avoid redundant API calls and allow manual overrides
- **Multi-era document handling**: frequency-based date voting correctly handles documents spanning multiple time periods
## Quick Start

### 1. Install

```bash
pip install -e .
```

Requires Poppler for PDF processing:

- macOS: `brew install poppler`
- Ubuntu: `apt-get install poppler-utils`

### 2. Configure

```bash
cp .env.example .env
```

Edit `.env` with your settings:

```bash
OPENROUTER_API_KEY=your_api_key_here
INPUT_PATH=/path/to/your/exam/pdfs
OUTPUT_PATH=/path/to/output
```

### 3. Run

```bash
python main.py
```
## How It Works

```
┌───────────┐    ┌───────────────┐    ┌────────────────┐    ┌──────────────┐    ┌──────────┐
│ PDF Input │───▶│ Preprocessing │───▶│ Vision LLM ×N  │───▶│ Standardize  │───▶│ Markdown │
│           │    │ (grayscale,   │    │ + voting       │    │ + classify   │    │ Output   │
│           │    │  resize)      │    │                │    │              │    │          │
└───────────┘    └───────────────┘    └────────────────┘    └──────────────┘    └──────────┘
```
1. **PDF → Images**: converts each page to grayscale, resizes it, and enhances contrast
2. **Document classification**: determines whether the document is a medical exam before processing
3. **Vision LLM transcription**: transcribes each page verbatim using function calling (runs N times for reliability)
4. **Self-consistency voting**: if transcriptions differ, an LLM votes on the best result
5. **Standardization**: classifies the exam type and standardizes the name via LLM, with caching
6. **Summarization**: generates document-level clinical summaries preserving all findings
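The self-consistency step can be sketched as a majority vote over the N transcriptions of a page. This is a minimal illustration, not the project's actual code: the function name is hypothetical, and the real pipeline falls back to an LLM arbiter when no transcription wins a clear majority.

```python
from collections import Counter


def vote_on_transcriptions(transcriptions):
    """Pick the most frequent transcription among N extraction runs.

    `transcriptions` holds the N outputs of the vision model for one page.
    In the real pipeline, a tie or disagreement would be resolved by an
    LLM vote rather than by simply taking the most common string.
    """
    counts = Counter(t.strip() for t in transcriptions)
    best, votes = counts.most_common(1)[0]
    if votes > len(transcriptions) // 2:
        return best  # clear majority: no extra LLM call needed
    # No majority: the real pipeline would ask an LLM to arbitrate here.
    return best
```

With `N_EXTRACTIONS=3`, two identical runs out of three are enough to accept a transcription without any additional API call.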
## Configuration

### Environment Variables

| Variable | Description | Default |
|---|---|---|
| `OPENROUTER_API_KEY` | Your OpenRouter API key (get one here) | required |
| `INPUT_PATH` | Directory containing exam PDFs | required |
| `OUTPUT_PATH` | Where to write output files | required |
| `EXTRACT_MODEL_ID` | Vision model for extraction | `google/gemini-2.5-flash` |
| `SUMMARIZE_MODEL_ID` | Model for summarization | `google/gemini-2.5-flash` |
| `SELF_CONSISTENCY_MODEL_ID` | Model for voting | `google/gemini-2.5-flash` |
| `N_EXTRACTIONS` | Number of extraction runs for voting | `3` |
| `MAX_WORKERS` | Parallel workers for PDF processing | `1` |
| `INPUT_FILE_REGEX` | Regex pattern for input files | `.*\.pdf` |
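The table above maps onto a config object roughly as follows. This is a hedged sketch (the real field names live in `config.py`): required variables fail fast, everything else falls back to the documented default.

```python
import os

# Default model shared by the three *_MODEL_ID variables, per the table above.
DEFAULT_MODEL = "google/gemini-2.5-flash"


def load_config():
    """Read the documented environment variables into a plain dict.

    Hypothetical helper: the project actually builds an ExtractionConfig,
    but the required/optional split and defaults are the same.
    """
    required = ["OPENROUTER_API_KEY", "INPUT_PATH", "OUTPUT_PATH"]
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise SystemExit(f"Missing required variables: {', '.join(missing)}")
    return {
        "api_key": os.environ["OPENROUTER_API_KEY"],
        "input_path": os.environ["INPUT_PATH"],
        "output_path": os.environ["OUTPUT_PATH"],
        "extract_model": os.environ.get("EXTRACT_MODEL_ID", DEFAULT_MODEL),
        "summarize_model": os.environ.get("SUMMARIZE_MODEL_ID", DEFAULT_MODEL),
        "n_extractions": int(os.environ.get("N_EXTRACTIONS", "3")),
        "max_workers": int(os.environ.get("MAX_WORKERS", "1")),
        "input_file_regex": os.environ.get("INPUT_FILE_REGEX", r".*\.pdf"),
    }
```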
### Using Profiles

Profiles let you save different input/output configurations for different use cases:

```bash
# Create a profile from the template
cp profiles/_template.yaml profiles/myprofile.yaml

# Run with a profile
python main.py --profile myprofile

# List available profiles
python main.py --list-profiles
```

Profile files (YAML or JSON) support path overrides and model configuration:

```yaml
name: myprofile
input_path: /path/to/input
output_path: /path/to/output
input_file_regex: ".*\\.pdf"
model: google/gemini-2.5-flash  # Optional override
workers: 1                      # Optional override
```
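Conceptually, a profile is just an overlay on the `.env`-derived base config: keys present in the profile win, everything else falls through. A minimal sketch of that merge, using a JSON profile to stay stdlib-only (the helper name and the exact merge logic are assumptions; the real merge lives in `config.py`):

```python
import json
from pathlib import Path

# Keys a profile may override, per the YAML example above.
PROFILE_KEYS = ("input_path", "output_path", "input_file_regex", "model", "workers")


def apply_profile(base, profile_path):
    """Overlay a profile's keys on top of the base config dict.

    Only keys actually present in the profile file override the base,
    so a profile can set just `input_path` and inherit everything else.
    """
    profile = json.loads(Path(profile_path).read_text())
    merged = dict(base)
    for key in PROFILE_KEYS:
        if key in profile:
            merged[key] = profile[key]
    return merged
```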
### CLI Options

| Option | Description |
|---|---|
| `--profile`, `-p` | Profile name to use |
| `--list-profiles` | List available profiles |
| `--regenerate` | Regenerate markdown files from existing JSON data |
| `--reprocess-all` | Force reprocessing of all documents |
| `--document`, `-d` | Process only this document (filename or stem) |
| `--page` | Process only this page number (requires `--document`) |
| `--model`, `-m` | Override the model ID |
| `--workers`, `-w` | Override the worker count |
| `--pattern` | Override the input file regex |
Examples:

```bash
# Process all new PDFs
python main.py --profile tsilva

# Regenerate summaries from existing transcription files
python main.py --profile tsilva --regenerate

# Force reprocessing of all documents
python main.py --profile tsilva --reprocess-all

# Reprocess a specific document
python main.py -p tsilva -d exam_2024.pdf

# Reprocess a specific page within a document
python main.py -p tsilva -d exam_2024.pdf --page 2
```
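The options above map naturally onto a standard `argparse` parser. This is a hypothetical mirror of the CLI table (the real parser lives in `main.py` and may differ in details):

```python
import argparse


def build_parser():
    """Build an argparse parser matching the documented CLI options."""
    p = argparse.ArgumentParser(prog="parsemedicalexams")
    p.add_argument("--profile", "-p", help="Profile name to use")
    p.add_argument("--list-profiles", action="store_true", help="List available profiles")
    p.add_argument("--regenerate", action="store_true", help="Regenerate markdown from existing JSON")
    p.add_argument("--reprocess-all", action="store_true", help="Force reprocessing of all documents")
    p.add_argument("--document", "-d", help="Process only this document (filename or stem)")
    p.add_argument("--page", type=int, help="Process only this page (requires --document)")
    p.add_argument("--model", "-m", help="Override the model ID")
    p.add_argument("--workers", "-w", type=int, help="Override the worker count")
    p.add_argument("--pattern", help="Override the input file regex")
    return p
```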
## Output Format

The parser generates structured markdown files with YAML frontmatter:

```
output/
└── {document}/
    ├── {document}.pdf         # Source PDF copy
    ├── {document}.001.jpg     # Page 1 image
    ├── {document}.001.md      # Page 1 transcription + metadata
    ├── {document}.002.jpg     # Page 2 image
    ├── {document}.002.md      # Page 2 transcription + metadata
    └── {document}.summary.md  # Document-level summary
```
### Transcription File Structure

Each `.md` file contains YAML frontmatter with metadata, followed by the verbatim transcription:

```markdown
---
date: 2024-01-15
title: "Chest X-Ray PA and Lateral"
category: imaging
exam_name_raw: "RX TORAX PA Y LAT"
doctor: "Dr. Smith"
facility: "Hospital Central"
confidence: 0.95
page: 1
source: exam_2024.pdf
---

[Full verbatim transcription text here...]
```
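Because the frontmatter in these files is flat `key: value` pairs, downstream tooling can split it off without a YAML library. A minimal sketch (the function is hypothetical; a real consumer would likely use `python-frontmatter` or PyYAML):

```python
def split_frontmatter(text):
    """Split '---\\n<header>\\n---\\n<body>' into (metadata dict, body).

    Only handles the flat key: value frontmatter shown above; nested
    YAML would need a real parser.
    """
    _, header, body = text.split("---\n", 2)
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        if key.strip():
            meta[key.strip()] = value.strip().strip('"')
    return meta, body.strip()
```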
### Metadata Fields

| Field | Description |
|---|---|
| `date` | Exam date (`YYYY-MM-DD`) |
| `title` | Standardized exam name (English) |
| `category` | Exam type: `imaging`, `ultrasound`, `endoscopy`, `other` |
| `exam_name_raw` | Exam name exactly as written in the document |
| `doctor` | Physician name (if found) |
| `facility` | Healthcare facility name |
| `department` | Department within the facility |
| `confidence` | Self-consistency confidence score (0.0–1.0) |
| `page` | Page number in the source PDF |
| `source` | Source PDF filename |
## Architecture

```
parsemedicalexams/
├── main.py             # Pipeline orchestration, CLI handling
├── extraction.py       # Pydantic models, Vision LLM extraction, voting
├── standardization.py  # Exam type classification with JSON cache
├── summarization.py    # Document-level clinical summarization
├── config.py           # ExtractionConfig (.env) + ProfileConfig (profiles/)
├── utils.py            # Image preprocessing, logging, JSON utilities
├── prompts/            # Externalized LLM prompts as markdown
├── profiles/           # User-specific path configurations
└── config/cache/       # Persistent LLM response caches (user-editable)
```
### Key Design Patterns

- **Two-phase processing**: classify the document first, then transcribe all pages
- **Two-column naming**: `*_raw` (exact from document) + `*_standardized` (LLM-mapped)
- **Persistent caching**: LLM standardization results are cached in `config/cache/*.json`
- **Editable caches**: manually override cached values to fix misclassifications
- **Profile inheritance**: profiles can inherit from `.env` with overrides
- **Frequency-based date voting**: handles multi-era documents (e.g., a 2024 cover letter + 1997 records)
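Frequency-based date voting can be illustrated with a tally over the per-page dates: the era that most pages agree on wins, so a single recent cover page cannot drag old records into the wrong year. A minimal sketch (the function name is hypothetical):

```python
from collections import Counter


def vote_on_year(dates):
    """Return the dominant year among per-page ISO dates (YYYY-MM-DD).

    A document that is mostly 1997 records with one 2024 cover letter
    should resolve to 1997, since most pages vote for that era.
    """
    years = Counter(d[:4] for d in dates if d)
    return years.most_common(1)[0][0]
```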
## Requirements

- Python 3.8+
- Poppler for PDF processing
- OpenRouter API key for Vision LLM access

## License

MIT