Synchronous OCR using Gemini Vision API - A rewrite of pyzerox without async/litellm

Zerox Sync

A synchronous Python library for OCR and document extraction using Google's Gemini Vision API. This is a rewrite of pyzerox that removes async wrappers and replaces litellm with direct Gemini API integration.

Features

Synchronous API: No async/await complexity, simple function calls
Direct Gemini Integration: Uses Google's Gemini API directly without litellm dependency
PDF to Markdown: Convert PDFs to structured markdown using vision models
Concurrent Processing: Process multiple pages in parallel using ThreadPoolExecutor
Selective Page Processing: Extract specific pages from PDFs
Format Consistency: Maintain formatting across pages
Simple Setup: Just set GOOGLE_API_KEY and go

Installation

Using uv (Recommended)

uv is a fast Python package installer:

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install zerox-sync
uv pip install zerox-sync

Using pip

pip install zerox-sync

System Dependencies

You'll need poppler installed for PDF processing:

macOS:

brew install poppler

Ubuntu/Debian:

sudo apt-get install poppler-utils

Windows: Download and install from poppler releases

Quick Start

from zerox_sync import zerox
import os

# Set your Gemini API key
os.environ["GOOGLE_API_KEY"] = "your-api-key-here"

# Process a PDF
result = zerox(
    file_path="path/to/document.pdf",
    model="gemini-3-pro",
)

# Access the results
for page in result.pages:
    print(f"Page {page.page}:")
    print(page.content)
    print(f"Length: {page.content_length} chars\n")

print(f"Total time: {result.completion_time}ms")
print(f"Input tokens: {result.input_tokens}")
print(f"Output tokens: {result.output_tokens}")
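If you want the whole document as one string rather than page by page, a small helper along these lines may be enough. This is a minimal sketch: `join_pages` and the separator are hypothetical names chosen for illustration, and the stand-in objects only mimic the `page` and `content` attributes shown above.

```python
from types import SimpleNamespace


def join_pages(result, separator="\n\n---\n\n"):
    """Concatenate page content in page order into one markdown string."""
    ordered = sorted(result.pages, key=lambda p: p.page)
    return separator.join(p.content for p in ordered)


# Stand-in for a real result; only the attribute names match the docs above.
fake_result = SimpleNamespace(
    pages=[
        SimpleNamespace(page=2, content="## Section", content_length=10),
        SimpleNamespace(page=1, content="# Title", content_length=7),
    ]
)

combined = join_pages(fake_result)
print(combined)
```

Sorting by `page` is a defensive choice in case pages are ever returned out of order; it is not a statement about how the library orders its output.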

API Reference

`zerox()`

Main function to perform OCR on a PDF document.

from typing import List, Optional, Union

# Signature only; ZeroxOutput is defined by the library and the body is omitted.
def zerox(
    cleanup: bool = True,
    concurrency: int = 10,
    file_path: str = "",
    image_density: int = 300,
    image_height: tuple = (None, 1056),
    maintain_format: bool = False,
    model: str = "gemini-3-pro",
    output_dir: Optional[str] = None,
    temp_dir: Optional[str] = None,
    custom_system_prompt: Optional[str] = None,
    select_pages: Optional[Union[int, List[int]]] = None,
    **kwargs
) -> ZeroxOutput:
    ...

Parameters:

cleanup (bool): Whether to clean up temporary files after processing (default: True)
concurrency (int): Number of concurrent threads for page processing (default: 10)
file_path (str): Path or URL to the PDF file
image_density (int): DPI for PDF to image conversion (default: 300)
image_height (tuple): Image dimensions as (width, height) (default: (None, 1056))
maintain_format (bool): Maintain consistent formatting across pages (default: False)
model (str): Gemini model to use (default: "gemini-3-pro")
output_dir (Optional[str]): Directory to save markdown output (default: None)
temp_dir (Optional[str]): Directory for temporary files (default: system temp)
custom_system_prompt (Optional[str]): Override default system prompt (default: None)
select_pages (Optional[Union[int, List[int]]]): Specific pages to process (default: None)
**kwargs: Additional arguments passed to Gemini API
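To get a feel for what `image_density` and `image_height` mean in pixels, the arithmetic can be sketched as below. This assumes that a `None` entry in the size tuple means "derive this dimension from the aspect ratio"; the README does not spell that out, and both helper names are hypothetical.

```python
from typing import Optional, Tuple


def rasterized_size(width_in: float, height_in: float, dpi: int) -> Tuple[int, int]:
    """Pixel size of a page of the given physical size rendered at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


def fit_to_box(
    src: Tuple[int, int], target: Tuple[Optional[int], Optional[int]]
) -> Tuple[int, int]:
    """Resolve a (width, height) target where None keeps the aspect ratio (assumption)."""
    src_w, src_h = src
    tgt_w, tgt_h = target
    if tgt_w is None and tgt_h is None:
        return src
    if tgt_w is None:
        return round(src_w * tgt_h / src_h), tgt_h
    if tgt_h is None:
        return tgt_w, round(src_h * tgt_w / src_w)
    return tgt_w, tgt_h


# US Letter (8.5 x 11 in) at the default 300 DPI, then the default (None, 1056).
full = rasterized_size(8.5, 11, 300)
scaled = fit_to_box(full, (None, 1056))
print(full, scaled)  # (2550, 3300) (816, 1056)
```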

Returns:

ZeroxOutput object with:

completion_time (float): Processing time in milliseconds
file_name (str): Processed file name
input_tokens (int): Number of input tokens used
output_tokens (int): Number of output tokens generated
pages (List[Page]): List of Page objects containing:
- content (str): Markdown content
- page (int): Page number
- content_length (int): Content length in characters
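The return shape described above can be mirrored with plain dataclasses, which is handy for writing tests against code that consumes a result. The classes below are illustrative stand-ins built only from the field list above, not the library's own definitions, and `summarize` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    content: str
    page: int
    content_length: int


@dataclass
class ZeroxOutput:
    completion_time: float  # milliseconds, per the field list above
    file_name: str
    input_tokens: int
    output_tokens: int
    pages: List[Page] = field(default_factory=list)


def summarize(output: ZeroxOutput) -> dict:
    """Collapse a result into a few headline numbers."""
    return {
        "pages": len(output.pages),
        "characters": sum(p.content_length for p in output.pages),
        "seconds": output.completion_time / 1000,
        "total_tokens": output.input_tokens + output.output_tokens,
    }


example = ZeroxOutput(
    completion_time=2500.0,
    file_name="document",
    input_tokens=1200,
    output_tokens=300,
    pages=[Page("# Title", 1, 7), Page("Body text", 2, 9)],
)
summary = summarize(example)
print(summary)
```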

Advanced Usage

Process Specific Pages

result = zerox(
    file_path="document.pdf",
    select_pages=[1, 3, 5],  # Only process pages 1, 3, and 5
)
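Since `select_pages` accepts either a single int or a list, callers sometimes want to validate the value before spending API calls. The sketch below assumes page numbers are 1-indexed, as the comment above suggests; `normalize_pages` is a hypothetical helper, and it raises `ValueError` where the library documents its own `PageNumberOutOfBoundError`.

```python
from typing import List, Optional, Union


def normalize_pages(
    select_pages: Optional[Union[int, List[int]]], page_count: int
) -> List[int]:
    """Return a sorted, de-duplicated list of 1-indexed page numbers."""
    if select_pages is None:
        return list(range(1, page_count + 1))
    pages = [select_pages] if isinstance(select_pages, int) else list(select_pages)
    out_of_range = [p for p in pages if p < 1 or p > page_count]
    if out_of_range:
        raise ValueError(f"pages out of range 1..{page_count}: {out_of_range}")
    return sorted(set(pages))


print(normalize_pages([5, 1, 3, 3], page_count=5))  # [1, 3, 5]
```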

Maintain Format Consistency

result = zerox(
    file_path="document.pdf",
    maintain_format=True,  # Process pages sequentially to maintain formatting
)
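One way to picture why this mode is sequential: each page needs the previous page's output before it can start. The loop below is a rough sketch of that idea, not the library's code. `transcribe` is a hypothetical callable standing in for a model call that takes an image and the prior page's markdown.

```python
from typing import Callable, List, Optional


def process_sequentially(
    images: List[str], transcribe: Callable[[str, Optional[str]], str]
) -> List[str]:
    """Process pages in order, handing each call the previous page's markdown."""
    results: List[str] = []
    previous: Optional[str] = None
    for image in images:
        text = transcribe(image, previous)
        results.append(text)
        previous = text
    return results


def fake_transcribe(image: str, previous: Optional[str]) -> str:
    # Stand-in for a model call; a real one would use `previous` as context.
    return f"md:{image}"


print(process_sequentially(["p1", "p2"], fake_transcribe))  # ['md:p1', 'md:p2']
```

Because each step depends on the one before it, this pattern cannot be spread across threads, which is the usual trade-off between consistency and throughput.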

Save to File

result = zerox(
    file_path="document.pdf",
    output_dir="./output",  # Markdown saved to ./output/{filename}.md
)
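If you prefer to write the markdown yourself, for example to control the file name, something like the following works with the standard library alone. It assumes `{filename}` means the input name without its extension, which the comment above does not state; both helper names are hypothetical.

```python
import tempfile
from pathlib import Path


def markdown_path(file_path: str, output_dir: str) -> Path:
    """Build <output_dir>/<input stem>.md (assumed naming scheme)."""
    return Path(output_dir) / f"{Path(file_path).stem}.md"


def write_markdown(file_path: str, output_dir: str, markdown: str) -> Path:
    """Create the output directory if needed and write the markdown as UTF-8."""
    target = markdown_path(file_path, output_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown, encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as tmp:
    saved = write_markdown("reports/document.pdf", tmp, "# Title\n")
    print(saved.name)  # document.md
```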

Custom System Prompt

result = zerox(
    file_path="document.pdf",
    custom_system_prompt="Extract only tables from this document in markdown format.",
)

Process from URL

result = zerox(
    file_path="https://example.com/document.pdf",
)
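Because `file_path` accepts both local paths and URLs, calling code may want to tell the two apart, for instance to skip an existence check for remote files. A minimal sketch using only the standard library follows; `is_url` is a hypothetical helper, and limiting it to http and https is an assumption.

```python
from urllib.parse import urlparse


def is_url(file_path: str) -> bool:
    """True for http(s) URLs with a host; False for local paths."""
    parsed = urlparse(file_path)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(is_url("https://example.com/document.pdf"))  # True
print(is_url("path/to/document.pdf"))              # False
```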

Adjust Concurrency

result = zerox(
    file_path="document.pdf",
    concurrency=5,  # Process 5 pages concurrently (default: 10)
)
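The general pattern behind a `concurrency` setting like this one can be sketched with `ThreadPoolExecutor` from the standard library. This is an illustration of the technique, not the library's internals; `process_pages` and `transcribe` are hypothetical names, and `str.upper` stands in for a per-page model call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def process_pages(
    images: List[str], transcribe: Callable[[str], str], concurrency: int = 10
) -> List[str]:
    """Run `transcribe` on every page with at most `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Executor.map yields results in input order, even if calls finish out of order.
        return list(pool.map(transcribe, images))


print(process_pages(["a", "b", "c"], str.upper, concurrency=2))  # ['A', 'B', 'C']
```

Threads suit this workload because each call spends most of its time waiting on the network. A lower `concurrency` can help if you run into API rate limits.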

Available Models

Zerox Sync supports various Gemini models:

gemini-3-pro (default): Most intelligent model
gemini-3-flash-preview: Fast with frontier-class performance
gemini-2.5-pro: Powerful reasoning model
gemini-2.5-flash: Balanced model with 1M token context
gemini-2.5-flash-lite: Fastest and most cost-efficient

Environment Variables

GOOGLE_API_KEY: Your Google AI Studio API key (required)
- Get your key from: https://aistudio.google.com/apikey
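Rather than hard-coding the key in source, you may prefer to read it from the environment and fail early with a clear message. The helper below is a hypothetical sketch; it raises `RuntimeError`, whereas the library documents its own `MissingEnvironmentVariables` exception.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GOOGLE_API_KEY from the environment or raise if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; create a key at https://aistudio.google.com/apikey"
        )
    return key


# Passing a mapping keeps the example independent of the real environment.
print(require_api_key({"GOOGLE_API_KEY": "your-api-key-here"}))
```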

Differences from pyzerox

Synchronous: No async/await - uses standard function calls
Gemini Direct: Direct Gemini API integration instead of litellm
Simple Dependencies: Fewer dependencies, no aiofiles/aiohttp/aioshutil
ThreadPoolExecutor: Uses standard library threading instead of asyncio
Requests: Uses requests library for HTTP instead of aiohttp

Error Handling

from zerox_sync import zerox
from zerox_sync.errors import (
    FileUnavailable,
    MissingEnvironmentVariables,
    ResourceUnreachableException,
    PageNumberOutOfBoundError,
)

try:
    result = zerox(file_path="document.pdf")
except MissingEnvironmentVariables:
    print("Please set GOOGLE_API_KEY environment variable")
except FileUnavailable:
    print("File not found or invalid path")
except ResourceUnreachableException:
    print("Could not download file from URL")
except PageNumberOutOfBoundError:
    print("Invalid page numbers specified")
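Some failures, such as a URL that cannot be reached, may be transient, so a retry with exponential backoff can be worth wrapping around the call. The sketch below is generic: `with_retries` is a hypothetical helper and `TransientError` is a stand-in class. With the library you would presumably pass `retry_on=(ResourceUnreachableException,)`, though whether that error is worth retrying depends on your situation.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable error such as a failed download."""


def with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (TransientError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on `retry_on` with delays of base_delay * 2**n."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


calls = {"n": 0}


def flaky() -> str:
    # Fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("temporary failure")
    return "ok"


print(with_retries(flaky, sleep=lambda seconds: None))  # ok
```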

Development

Setup

# Clone the repository
git clone https://github.com/yourusername/zerox-sync.git
cd zerox-sync

# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install with dev dependencies
uv pip install -e ".[dev]"

Running Tests

pytest

Code Formatting

# Format code
black zerox_sync tests

# Lint
ruff check zerox_sync tests

License

MIT License - see LICENSE file for details

Credits

This project is a synchronous rewrite of pyzerox by the getomni-ai team. The original project is an excellent async implementation with litellm support.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Release history

0.2.0: Dec 28, 2025
0.1.5: Dec 28, 2025
0.1.4: Dec 28, 2025
0.1.3: Dec 28, 2025
0.1.1: Dec 26, 2025
0.1.0: Dec 26, 2025 (this version)

Download files

Source Distribution

zerox_sync-0.1.0.tar.gz (23.3 kB), uploaded Dec 26, 2025, Source

Built Distribution

zerox_sync-0.1.0-py3-none-any.whl (5.0 kB), uploaded Dec 26, 2025, Python 3

File details

Details for the file zerox_sync-0.1.0.tar.gz.

File metadata

Download URL: zerox_sync-0.1.0.tar.gz
Upload date: Dec 26, 2025
Size: 23.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for zerox_sync-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`220f110be389952411616701cf3b4f7921cbd9b2418cfb09050569ef0063ce0b`
MD5	`eb69334922a98bc9388dc72c6bba8e92`
BLAKE2b-256	`28fdf08ba8afab165ffbb3f46ad9d0aab365a65a200cc4e60b5982cd2b2ff02f`
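To check a downloaded archive against the SHA256 digest listed above, the standard library's `hashlib` is sufficient. This is a minimal sketch; `sha256_of` and `matches` are hypothetical helper names, and the archive is assumed to sit in the current directory.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hex SHA256 of a file, read in chunks so large files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with an expected hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


EXPECTED = "220f110be389952411616701cf3b4f7921cbd9b2418cfb09050569ef0063ce0b"
archive = "zerox_sync-0.1.0.tar.gz"
if os.path.exists(archive):
    print("match" if matches(archive, EXPECTED) else "MISMATCH")
```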

File details

Details for the file zerox_sync-0.1.0-py3-none-any.whl.

File metadata

Download URL: zerox_sync-0.1.0-py3-none-any.whl
Upload date: Dec 26, 2025
Size: 5.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for zerox_sync-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c93ccf8e747c413d3c7ecb2a5ff5ee8882881997761a76afd045a89d56d01ae6`
MD5	`ea46e3c9a628103cbaa91e0ecd877d20`
BLAKE2b-256	`9f5f976b1dd3854ebee8fec9f6115de99f339e0410875366c534d9c16c4dced8`
