An SDK for intelligent document processing using state-of-the-art vision LLMs
Docuglean OCR - Python SDK
A unified Python SDK for intelligent document processing using state-of-the-art AI models.
Features
- 🚀 Easy to Use: Simple, intuitive API with detailed documentation
- 🔍 OCR Capabilities: Extract text from images and scanned documents
- 📊 Structured Data Extraction: Use Pydantic models for type-safe data extraction
- 📄 Multimodal Support: Process PDFs and images with ease
- 🤖 Multiple AI Providers: Support for OpenAI, Mistral, Google Gemini, and Hugging Face
- 🔒 Type Safety: Full Python type hints with Pydantic validation
Installation
pip install docuglean-ocr
Quick Start
OCR Processing
import asyncio

from docuglean import ocr


async def main() -> None:
    # ocr is awaited, so the calls must run inside an async function.

    # Mistral OCR
    result = await ocr(
        file_path="./document.pdf",
        provider="mistral",
        model="mistral-ocr-latest",
        api_key="your-api-key",
    )

    # Google Gemini OCR
    result = await ocr(
        file_path="./document.pdf",
        provider="gemini",
        model="gemini-2.5-flash",
        api_key="your-gemini-api-key",
        prompt="Extract all text from this document",
    )

    # Hugging Face OCR (no API key needed)
    result = await ocr(
        # file_path supports URLs, local files, and base64
        file_path="https://example.com/image.jpg",
        provider="huggingface",
        model="Qwen/Qwen2.5-VL-3B-Instruct",
        prompt="Extract all text from this image",
    )


asyncio.run(main())
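When several documents need OCR, the awaitable call shown above can be run concurrently. The following is a minimal sketch, not part of the SDK: `ocr_many` and `max_concurrency` are hypothetical names, and the only assumption about `ocr` is the one visible in the example above, namely that it is awaitable and accepts `file_path` plus other keyword arguments. Whether a given provider's rate limits tolerate concurrent requests is not stated here, so the limit is kept small by default.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence


async def ocr_many(
    ocr_fn: Callable[..., Awaitable[Any]],
    file_paths: Sequence[str],
    max_concurrency: int = 4,
    **ocr_kwargs: Any,
) -> list[Any]:
    """Call ocr_fn once per file with bounded concurrency.

    Hypothetical helper: ocr_fn is expected to behave like docuglean's ocr,
    i.e. an awaitable callable that takes file_path and keyword arguments.
    Results are returned in the same order as file_paths.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(path: str) -> Any:
        # The semaphore caps how many calls are in flight at once.
        async with semaphore:
            return await ocr_fn(file_path=path, **ocr_kwargs)

    return await asyncio.gather(*(run_one(path) for path in file_paths))
```

Inside an async function this could be called as `await ocr_many(ocr, ["./a.pdf", "./b.pdf"], provider="mistral", model="mistral-ocr-latest", api_key="your-api-key")`. Passing the OCR function in as an argument keeps the helper independent of any one provider.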
Structured Data Extraction
import asyncio
from typing import List

from docuglean import extract
from pydantic import BaseModel


class ReceiptItem(BaseModel):
    name: str
    price: float


class Receipt(BaseModel):
    date: str
    total: float
    items: List[ReceiptItem]


async def main() -> None:
    # Extract structured data with OpenAI
    receipt = await extract(
        file_path="./receipt.pdf",
        provider="openai",
        api_key="your-api-key",
        response_format=Receipt,
        prompt="Extract receipt information",
    )

    # Extract structured data with Gemini
    receipt = await extract(
        file_path="./receipt.pdf",
        provider="gemini",
        api_key="your-gemini-api-key",
        response_format=Receipt,
        prompt="Extract receipt information including date, total, and all items",
    )


asyncio.run(main())
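Type validation confirms that the extracted fields have the right shape, but not that the values agree with each other. As an illustration, the sketch below checks whether the item prices add up to the stated total. It assumes that the object returned by `extract` exposes the `total` and `items` fields of the `Receipt` model above, which this document does not state explicitly. `items_match_total` and `tolerance` are hypothetical names, and the function works on any objects that have a `price` attribute.

```python
from math import isclose
from typing import Any, Iterable


def items_match_total(
    items: Iterable[Any], total: float, tolerance: float = 0.01
) -> bool:
    """Return True if the item prices sum to total within tolerance.

    Hypothetical helper: each item only needs a numeric price attribute,
    as ReceiptItem has in the example above.
    """
    return isclose(sum(item.price for item in items), total, abs_tol=tolerance)
```

With the `receipt` from the example, the call would be `items_match_total(receipt.items, receipt.total)`. A mismatch does not necessarily mean the extraction failed, since real receipts often include tax, tips, or discounts that are not listed as items.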
Development
Setup
# Install dependencies with uv
uv sync
Testing
# Run all tests
uv run pytest tests/ -v
# Run specific test files
uv run pytest tests/test_basic.py -v # Basic tests only
uv run pytest tests/test_ocr.py tests/test_extract.py -v # Mistral tests (requires MISTRAL_API_KEY)
uv run pytest tests/test_openai.py -v # OpenAI tests (requires OPENAI_API_KEY)
# Run with output (shows print statements)
uv run pytest tests/ -v -s
# Run specific test function
uv run pytest tests/test_openai.py::test_openai_extract_unstructured_pdf -v -s
# Set API keys to run the tests against the real provider APIs
export MISTRAL_API_KEY=your_mistral_key_here
export OPENAI_API_KEY=your_openai_key_here
export GEMINI_API_KEY=your_gemini_key_here
uv run pytest tests/ -v -s
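The same environment variables can be reused in application code so that keys are not hardcoded as in the Quick Start. The sketch below is an assumption-laden illustration: `api_key_for` and `ENV_VARS` are hypothetical names, the variable names are the ones exported above, and the provider strings are the ones used in the Quick Start. Whether the SDK reads these variables on its own is not stated in this document, so the key is looked up explicitly.

```python
import os
from typing import Mapping, Optional

# Provider strings come from the Quick Start; variable names come from the
# Testing section. The mapping itself is an illustration, not an SDK feature.
ENV_VARS = {
    "mistral": "MISTRAL_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def api_key_for(provider: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Look up the API key for a provider, failing loudly if it is missing."""
    env = os.environ if environ is None else environ
    if provider not in ENV_VARS:
        raise ValueError(f"No environment variable known for provider {provider!r}")
    key = env.get(ENV_VARS[provider])
    if not key:
        raise RuntimeError(f"{ENV_VARS[provider]} is not set")
    return key
```

The result can then be passed as `api_key=api_key_for("mistral")`. The optional `environ` argument exists only so the lookup can be exercised with a plain dictionary instead of the real environment.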
Code Quality
# Run linting
uv run ruff check src/ tests/
# Fix linting issues automatically
uv run ruff check src/ tests/ --fix
# Format code
uv run ruff format src/ tests/
License
Apache 2.0 - see the LICENSE file for details.