Standalone Visual-RAG PDF Parser — text extraction + Vision-LLM figure descriptions → JSONL
Reason this release was yanked:
Starting with 2.0.0, visual-parser is distributed under Apache License 2.0.
visual-parser (Standalone Visual-RAG PDF Ingestion)
visual-parser is a standalone document-ingestion tool that converts PDFs into a multi-modal JSONL knowledge base (text chunks + figure descriptions + metadata). The intended workflow is:
- Run visual-parser on curated PDFs to generate JSONL KB files.
- Run RADIANT-LLM Visual-RAG for QA over the generated KB.
Outputs (JSONL KB)
By default, the pipeline writes:
- 01_chunks_kb.jsonl: chunked text extracted from PDFs (Nougat by default).
- 02_visuals_kb.jsonl: figure/page visual descriptions (Vision LLM).
- 03_metadata_kb.jsonl: document metadata rows (title/author/etc.).
- 04_processed_pdfs.txt: a tracker so re-runs only process new PDFs (unless --rebuild is passed).
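The JSONL outputs listed above can be inspected with a few lines of Python. This is a minimal sketch, not part of visual-parser: `iter_jsonl` and `summarize_kb` are hypothetical helper names, and the code assumes only that each file holds one JSON object per line. It makes no assumption about the field names inside each record.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def summarize_kb(output_dir: Path) -> dict[str, int]:
    """Count records in each JSONL KB file that exists in output_dir."""
    names = ["01_chunks_kb.jsonl", "02_visuals_kb.jsonl", "03_metadata_kb.jsonl"]
    counts = {}
    for name in names:
        path = output_dir / name
        if path.exists():  # skip files the run did not produce
            counts[name] = sum(1 for _ in iter_jsonl(path))
    return counts
```

Calling `summarize_kb(Path("/path/to/out"))` after a run gives a quick check that each stage produced records.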
API keys (.env)
Provide at least one provider:
- OPENAI_API_KEY (OpenAI)
- GEMINI_API_KEY (Gemini)
Optional:
- HF_TOKEN (if you use gated Hugging Face models)
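Before launching a long run, it can help to confirm that at least one provider key is actually set. The sketch below is illustrative only: `available_providers` is a hypothetical helper, and the mapping from the provider names gpt and gemini to the two key variables is an assumption inferred from the variable names above, not taken from the tool's source.

```python
import os
from typing import Mapping

# Assumed mapping from --vision-provider values to the key each one needs.
PROVIDER_KEYS = {"gpt": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


def available_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose API key is present and non-blank in env."""
    return [
        provider
        for provider, key in PROVIDER_KEYS.items()
        if env.get(key, "").strip()
    ]


if __name__ == "__main__":
    found = available_providers()
    if not found:
        raise SystemExit("Set OPENAI_API_KEY or GEMINI_API_KEY before running.")
    print("Usable providers:", ", ".join(found))
```

Note that this checks the current process environment. Variables passed to the container with --env-file are read by Docker from the file, so load the .env file into the environment first if you want to check the same values.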
Run with Docker (Docker Hub)
Prebuilt images are on zev94/radiant-llm under the visual-parser tags:
| Tag | Description |
|---|---|
| visual-parser-1.0 | Pinned release |
| visual-parser-latest | Latest visual-parser build |
1) Install Docker
- Docker Desktop (Windows/macOS) or Docker Engine (Linux)
2) Pull the image
docker pull zev94/radiant-llm:visual-parser-1.0
3) Run (input + output on the same mounted folder)
Windows PowerShell:
docker run --rm --env-file .env `
-v "C:\path\to\pdfs:/data" `
zev94/radiant-llm:visual-parser-1.0 `
--input-dir /data --output-dir /data
Linux / WSL:
docker run --rm --env-file .env \
-v "/path/to/pdfs:/data" \
zev94/radiant-llm:visual-parser-1.0 \
--input-dir /data --output-dir /data
4) Run (separate output directory)
Windows PowerShell:
docker run --rm --env-file .env `
-v "C:\path\to\pdfs:/data" `
-v "C:\path\to\out:/out" `
zev94/radiant-llm:visual-parser-1.0 `
--input-dir /data --output-dir /out
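The shell examples differ only in line-continuation characters and host paths, so a script that must run on several platforms can build the argument list in Python instead. This is a sketch under stated assumptions: `docker_run_args` is a hypothetical helper, it uses only the image tag, mounts, and flags shown in the commands above, and it assumes Docker is on the PATH.

```python
from pathlib import Path
from typing import Optional

IMAGE = "zev94/radiant-llm:visual-parser-1.0"


def docker_run_args(
    input_dir: str,
    output_dir: Optional[str] = None,
    env_file: str = ".env",
    image: str = IMAGE,
) -> list[str]:
    """Build the docker run argument list for one or two mounted folders."""
    in_path = Path(input_dir).resolve()
    args = ["docker", "run", "--rm", "--env-file", env_file,
            "-v", f"{in_path}:/data"]
    container_out = "/data"  # default: write next to the input PDFs
    if output_dir is not None:
        out_path = Path(output_dir).resolve()
        if out_path != in_path:
            # Separate output folder gets its own mount at /out.
            args += ["-v", f"{out_path}:/out"]
            container_out = "/out"
    args += [image, "--input-dir", "/data", "--output-dir", container_out]
    return args
```

The resulting list can be passed to `subprocess.run(docker_run_args("pdfs", "out"), check=True)`. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.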
Offline install (legacy .tar)
docker load -i .\visual-parser_0.1.0.tar
docker images # use the tag printed by Docker
Model overrides (optional)
Default vision model is GPT-5.5 when using --vision-provider gpt. Override on the command line:
docker run --rm --env-file .env -v "C:\path\to\pdfs:/data" `
zev94/radiant-llm:visual-parser-1.0 `
--input-dir /data --output-dir /data --vision-model gpt-5.4
Common configuration flags
After pulling the image, run:
docker run --rm zev94/radiant-llm:visual-parser-1.0 --help
For copy-paste Docker examples (vision presets, text modes, workers, rebuild), see docker-usage-examples.md.
Paths:
- --input-dir / -i (required)
- --output-dir / -o (default: same as input)
Text extraction:
- --text-mode nougat|lightweight (default: nougat)
- --nougat-model facebook/nougat-small
- --chunk-size 500
- --chunk-overlap 100
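To see how a chunk size of 500 and an overlap of 100 interact, the following sketch splits a string with a sliding window. It is not the tool's implementation: `chunk_text` is a hypothetical function, and it measures size in characters, which is an assumption. The source does not say whether visual-parser counts characters, words, or tokens.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # distance between window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window starts 400 characters after the previous one, so a 1000-character input yields three chunks starting at offsets 0, 400, and 800. A larger overlap keeps more context across chunk boundaries at the cost of more chunks to store.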
Vision LLM:
- --vision-provider gpt|gemini (default: gpt)
- --vision-model gpt-5.2 (or gpt-4o, gemini-2.5-flash, etc.)
- --vision-detail low|high|auto
- --reasoning-effort none|low|medium|high|xhigh
- --metadata-pages 2
Performance / misc:
- --max-workers 4
- --rebuild (reprocess everything; ignore 04_processed_pdfs.txt)
- --log-level DEBUG|INFO|WARNING|ERROR
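The interaction between --rebuild and the 04_processed_pdfs.txt tracker can be pictured as a simple filter over the input folder. The sketch below is an illustration of that idea only: `pdfs_to_process` is a hypothetical function, and it assumes the tracker stores one PDF file name per line, which the source does not confirm.

```python
from pathlib import Path

TRACKER_NAME = "04_processed_pdfs.txt"


def pdfs_to_process(input_dir: Path, output_dir: Path, rebuild: bool = False) -> list[Path]:
    """Return PDFs that still need processing, honoring a rebuild flag."""
    pdfs = sorted(p for p in input_dir.iterdir() if p.suffix.lower() == ".pdf")
    if rebuild:
        return pdfs  # ignore the tracker and reprocess everything

    tracker = output_dir / TRACKER_NAME
    done: set[str] = set()
    if tracker.exists():
        # Assumed format: one already-processed file name per line.
        lines = tracker.read_text(encoding="utf-8").splitlines()
        done = {line.strip() for line in lines if line.strip()}
    return [p for p in pdfs if p.name not in done]
```

Under this model, deleting the tracker file has the same effect as passing --rebuild once, while leaving it in place makes repeated runs over a growing folder incremental.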
Citation
If you use RADIANT-LLM or the accompanying evaluation materials, please cite the preprint:
@article{ndum2026radiant,
title={RADIANT-LLM: an Agentic Retrieval Augmented Generation Framework for Reliable Decision Support in Safety-Critical Nuclear Engineering},
author={Ndum, Zavier Ndum and Tao, Jian and Ford, John and Yim, Mansung and Liu, Yang},
journal={arXiv preprint arXiv:2604.22755},
year={2026}
}
Preprint: https://arxiv.org/abs/2604.22755
License
This repository is currently proprietary and not licensed for public use, redistribution, or modification. Licensing terms will be updated after institutional review.
File details
Details for the file visual_parser-1.0.1.tar.gz.
File metadata
- Download URL: visual_parser-1.0.1.tar.gz
- Upload date:
- Size: 30.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f3d8afc1a2b6cd30482b57c4cc45e1e191b4105cbbfb451071858763d2d17c8 |
| MD5 | 9fa7d3e76d65b4167c33dd4f3a004ff3 |
| BLAKE2b-256 | 903c25b351a91d482e64860386b227edf52e23ffb7745966b076feac0e356764 |
File details
Details for the file visual_parser-1.0.1-py3-none-any.whl.
File metadata
- Download URL: visual_parser-1.0.1-py3-none-any.whl
- Upload date:
- Size: 35.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ad69115d708b4d907cf3828e00d343fb5ac8b50c9be5dcd1aba55bd03624e07e |
| MD5 | 5e72030ab59cd1965695badbabc4bce0 |
| BLAKE2b-256 | 6a9a459b1544b035c3286b1441cc7433266756dd95ba350c1c777c69defaa881 |