8 projects
datalab-python-sdk
SDK for the Datalab document intelligence API
chandra-ocr
OCR model that converts documents to markdown, HTML, or JSON.
marker-pdf
Convert documents to markdown with high speed and accuracy.
surya-ocr
OCR, layout, reading order, and table recognition in 90+ languages
pdftext
Extract structured text from PDFs quickly
tabled-pdf
Detect and recognize tables in PDFs and images.
texify
OCR for LaTeX images
streamlit-drawable-canvas-jsretry
A Streamlit custom component for a free drawing canvas using Fabric.js. A fork that enables retrying for background images.