3 projects
pdfmux
Self-healing PDF extraction for RAG. Scores each page for confidence and automatically re-extracts bad pages. Includes an MCP server and LangChain/LlamaIndex loaders. A LlamaParse alternative, ranked #2 on opendataloader-bench.
llama-index-readers-pdfmux
LlamaIndex reader for pdfmux: self-healing PDF extraction for RAG pipelines
langchain-pdfmux
LangChain document loader for pdfmux: self-healing PDF extraction