Python SDK for TurboOCR — fast GPU OCR server (HTTP + gRPC)

turboocr

Typed Python client for the TurboOCR server. Sync + async, HTTP + gRPC, layout-aware Markdown rendering, searchable-PDF generation.

Install

pip install turboocr             # HTTP client + CLI + searchable-PDF
pip install 'turboocr[grpc]'     # add the gRPC transport
pip install 'turboocr[all]'      # everything optional (currently == [grpc])

Requires Python 3.12+.

Quickstart

Start a TurboOCR server (the C++/CUDA OCR engine — this repo is just the Python client):

docker run --gpus all -p 8000:8000 -p 50051:50051 \
  -v trt-cache:/home/ocr/.cache/turbo-ocr \
  -e OCR_LANG=latin \
  ghcr.io/aiptimizer/turboocr:v2.2.3

OCR_LANG=latin (the default) covers English, French, German, Spanish, … Other supported values are chinese, greek, eslav, arabic, korean, and thai; all of them are baked into the image. See the TurboOCR repo for build-from-source instructions, benchmarks, and the full set of server env vars.

Then recognise an image and turn a PDF into Markdown:

from turboocr import Client, render_to_markdown

with Client(base_url="http://localhost:8000") as client:
    # Image OCR
    img = client.recognize_image("page.png", layout=True, include_blocks=True)
    print(f"{len(img.results)} text items, {len(img.blocks)} blocks")
    print(img.text)

    # PDF → Markdown
    pdf = client.recognize_pdf("paper.pdf", dpi=150, include_blocks=True)
    print(render_to_markdown(pdf).markdown)

    # Searchable PDF (invisible text overlay); returns the PDF as bytes
    overlay = client.make_searchable_pdf("scan.pdf", dpi=200)
    with open("scan.searchable.pdf", "wb") as f:
        f.write(overlay)

That's the 80% case. Full runnable examples for async, gRPC, batch, retries, custom httpx.Client, hooks, Markdown styling, folder pipelines, and more live in examples/ — every script runs end-to-end against the bundled ACME invoice fixture.

What you get

  • Sync + async, HTTP + gRPC. Four clients (Client, AsyncClient, GrpcClient, AsyncGrpcClient) with identical method surfaces.
  • Typed, immutable responses (pydantic v2). IDE autocomplete, and if a newer server adds a field your SDK doesn't know about, parsing still succeeds — the extra lands on .model_extra instead of crashing.
  • Layout-aware Markdown. render_to_markdown(...) walks the reading order and maps each layout class (doc_title, display_formula, table, …) to a Markdown construct. Pluggable via MarkdownStyle.
  • Searchable PDFs. make_searchable_pdf(...) overlays an invisible text layer aligned to the page geometry. Auto-discovers a Unicode font for non-Latin scripts, or pass font_path=.
  • Production-friendly. Configurable retry policy (HTTP status + gRPC status + Retry-After), per-request timeouts, custom httpx.Client, on_request / on_response event hooks, uuid7 X-Request-ID per call.
  • Precise exception hierarchy. Maps the server's error_code to typed exceptions — see Errors.
  • turbo-ocr CLI included in the default install.
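
The layout-aware Markdown bullet above describes a lookup from layout class to Markdown construct. The sketch below illustrates that idea only; it is not the SDK's implementation. The Block dataclass, the FORMATTERS table, and the "text" class name are hypothetical stand-ins, and only doc_title and display_formula are class names taken from this README.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical stand-in for a layout block. The real SDK returns pydantic
# models whose exact fields are not shown in this README.
@dataclass(frozen=True)
class Block:
    layout_class: str  # e.g. "doc_title", "display_formula"
    text: str


# Illustrative mapping from layout class to a Markdown formatter.
FORMATTERS: Dict[str, Callable[[str], str]] = {
    "doc_title": lambda t: f"# {t}",
    "display_formula": lambda t: f"$$\n{t}\n$$",
}


def render_blocks(blocks: List[Block]) -> str:
    """Render blocks (assumed to be in reading order) as Markdown."""
    parts = []
    for block in blocks:
        # Unknown classes fall back to plain paragraph text.
        fmt = FORMATTERS.get(block.layout_class, lambda t: t)
        parts.append(fmt(block.text))
    return "\n\n".join(parts)


demo = [
    Block("doc_title", "ACME Invoice"),
    Block("text", "Total due: 42.00 EUR"),
]
print(render_blocks(demo))
```

In the SDK itself, the equivalent customisation point is MarkdownStyle; its exact interface is documented in docs/ rather than here.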

Today's server does plain OCR + layout classification. Table-structure and LaTeX-formula source are not yet emitted; the SDK exposes page.tables / page.formulas as a forward-compatible surface that populates automatically when those server features ship.

Configuration

from turboocr import Client, RetryPolicy

client = Client(
    base_url="http://localhost:8000",   # or TURBO_OCR_BASE_URL env
    api_key="sk-...",                   # or TURBO_OCR_API_KEY env
    auth_scheme="bearer",               # "bearer" | "x-api-key"
    timeout=30.0,
    default_headers={"X-Tenant": "acme"},
    retry=RetryPolicy(attempts=5, backoff=0.5),
)
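
The comments in the snippet above say that base_url and api_key can also come from the environment. A minimal sketch of one plausible resolution rule follows, assuming an explicit argument takes precedence over the environment variable; the README does not state the precedence, and resolve_setting is a hypothetical helper, not part of the SDK.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env_name: str,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit value if given, else the environment value, else None."""
    env = os.environ if environ is None else environ
    if explicit is not None:
        return explicit  # assumption: explicit arguments win over the environment
    return env.get(env_name)


# A fake environment keeps the example independent of the real process env.
fake_env = {"TURBO_OCR_BASE_URL": "http://ocr.internal:8000"}

print(resolve_setting(None, "TURBO_OCR_BASE_URL", fake_env))
print(resolve_setting("http://localhost:8000", "TURBO_OCR_BASE_URL", fake_env))
```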

Pass http_client=httpx.Client(...) for custom TLS, connection limits, or proxies — see examples/08_custom_httpx_client.py.

Retry defaults: HTTP {429, 502, 503, 504}, gRPC {UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED}, 3 attempts, exponential backoff + jitter, Retry-After honoured. Tune via RetryPolicy(...) — see examples/07_retry_and_timeout.py.
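
To make the retry defaults concrete, here is a stdlib-only sketch of exponential backoff with jitter that honours a Retry-After hint. It is an illustration under assumptions, not the SDK's code: retry_delay and should_retry are hypothetical names, the "full jitter" formula is one common choice, and treating attempts as the total number of tries is a guess about RetryPolicy semantics.

```python
import random
from typing import Optional

RETRYABLE_HTTP = {429, 502, 503, 504}  # HTTP defaults quoted in this README


def retry_delay(
    attempt: int,
    backoff: float = 0.5,
    retry_after: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    if retry_after is not None:
        return retry_after  # a server-provided hint overrides the computed delay
    rng = rng or random.Random()
    ceiling = backoff * 2 ** (attempt - 1)  # 0.5, 1.0, 2.0, ... for backoff=0.5
    return rng.uniform(0.0, ceiling)  # full jitter: anywhere in [0, ceiling]


def should_retry(status: int, attempt: int, attempts: int = 3) -> bool:
    """Retry only retryable statuses, and only while tries remain."""
    return status in RETRYABLE_HTTP and attempt < attempts


for n in (1, 2, 3):
    print(n, round(retry_delay(n, rng=random.Random(n)), 3))
```

Jitter spreads retries from many clients over time, so a server that has just recovered is not hit by every client at the same instant.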

Errors

TurboOcrError
├── APIConnectionError       # transport-level
│   ├── Timeout
│   ├── NetworkError
│   └── ProtocolError
├── InvalidParameter         # 4xx: bad params / headers / dims
├── EmptyBody                # 4xx: empty body / batch / PDF
├── LayoutDisabled           # asked for layout when server has it off
├── ImageDecodeError         # bad bytes / bad base64
├── DimensionsTooLarge       # image / PDF over server limits
├── PoolExhausted            # "Server at capacity"
├── PdfRenderError           # PDF rasterization failed
└── ServerError              # 5xx, no specific code

Server-side exceptions carry .code, .status_code, and .payload. Transport exceptions inherit from APIConnectionError.
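
The sketch below shows how an error_code could be mapped to a typed exception carrying .code, .status_code, and .payload. The classes are local stand-ins that mirror part of the tree above so the example runs without the SDK. The error_code strings, the payload keys, and error_from_response are hypothetical; the server's real codes are not listed in this README.

```python
from typing import Any, Dict, Optional


class TurboOcrError(Exception):
    """Local stand-in for the root of the hierarchy shown above."""

    def __init__(
        self,
        message: str,
        *,
        code: Optional[str] = None,
        status_code: Optional[int] = None,
        payload: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.code = code
        self.status_code = status_code
        self.payload = payload


class LayoutDisabled(TurboOcrError):
    """Layout requested while the server has it switched off."""


class PoolExhausted(TurboOcrError):
    """Server at capacity."""


class ServerError(TurboOcrError):
    """5xx response without a more specific code."""


# Hypothetical error_code strings, chosen for illustration only.
CODE_TO_EXC = {
    "LAYOUT_DISABLED": LayoutDisabled,
    "POOL_EXHAUSTED": PoolExhausted,
}


def error_from_response(status_code: int, payload: Dict[str, Any]) -> TurboOcrError:
    """Pick the most specific exception type for an error response."""
    code = payload.get("error_code")
    fallback = ServerError if status_code >= 500 else TurboOcrError
    exc_type = CODE_TO_EXC.get(code, fallback)
    return exc_type(
        payload.get("message", "request failed"),
        code=code,
        status_code=status_code,
        payload=payload,
    )


try:
    raise error_from_response(503, {"error_code": "POOL_EXHAUSTED", "message": "Server at capacity"})
except PoolExhausted as exc:
    print(type(exc).__name__, exc.status_code, exc.code)
```

Because every class derives from TurboOcrError, callers can catch a specific failure first and fall back to the base class for everything else.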

Symptom                          | Cause                                  | Fix
NetworkError: Connection refused | server not running                     | start the Docker container (see Quickstart)
DimensionsTooLarge               | image > MAX_IMAGE_DIM (default 16384)  | downscale, or raise the server limit
LayoutDisabled                   | server started with DISABLE_LAYOUT=1   | restart without that env var
UnicodeFontRequired              | non-Latin text, no Unicode font found  | pass font_path= or set TURBO_OCR_FONT
PoolExhausted                    | server queue full                      | retry with backoff, or scale PIPELINE_POOL_SIZE
Timeout                          | per-request timeout hit                | pass timeout=N, or raise RetryPolicy.attempts

CLI

turbo-ocr ocr page.png --output markdown
turbo-ocr pdf doc.pdf --dpi 150 --output json
turbo-ocr searchable-pdf doc.pdf -o out.pdf --font-path /path/to/font.ttf
turbo-ocr health --ready

--output accepts json | blocks | text | markdown. Reads TURBO_OCR_BASE_URL, TURBO_OCR_API_KEY, TURBO_OCR_FONT from the environment. Run turbo-ocr --help for the full surface.

Logging

import logging
logging.getLogger("turboocr").setLevel(logging.DEBUG)

Emits method path -> status (Xms) [req=<short-id>] per HTTP request. Retry warnings go to turboocr.retry / turboocr.grpc.retry. Searchable-PDF font resolution logs to turboocr.searchable_pdf. Every HTTP request sends a uuid7 X-Request-ID header (gRPC uses x-request-id metadata).
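
Because the logger names above form a hierarchy, retry warnings can be routed separately from general debug output using only the standard library. The sketch below captures them in an in-memory buffer and emits a simulated record so it runs without a server; the message text is made up, and a file or alerting handler would replace the buffer in practice.

```python
import io
import logging

# Verbose output for the SDK as a whole; child loggers inherit this level.
sdk_log = logging.getLogger("turboocr")
sdk_log.setLevel(logging.DEBUG)

# Route retry warnings to their own handler.
retry_buffer = io.StringIO()
retry_handler = logging.StreamHandler(retry_buffer)
retry_handler.setLevel(logging.WARNING)
retry_handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
logging.getLogger("turboocr.retry").addHandler(retry_handler)

# Simulated record; the real SDK would emit these during retries.
logging.getLogger("turboocr.retry").warning("retrying after 503 (attempt 2)")
print(retry_buffer.getvalue().strip())
```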

Learn more

  • examples/ — 13 runnable scripts (each runs against the bundled ACME invoice fixture, no server config needed beyond TURBO_OCR_BASE_URL)
  • docs/ — full docs source (MkDocs + mkdocstrings, deployed at https://aiptimizer.github.io/turboocr-python/). Preview locally with uv run --extra docs mkdocs serve -f docs/mkdocs.yml
  • Server compatibility: SERVER_API_VERSION_MIN / SERVER_API_VERSION_MAX_EXCLUSIVE document the supported server range; extra="allow" on response models means additive server changes don't break parsing
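
The MIN / MAX_EXCLUSIVE naming in the server compatibility bullet suggests a half-open version range. A minimal sketch of such a check follows, assuming the bounds are comparable version tuples. The values below, the tuple representation, and the helper names are hypothetical; consult the SDK for the real constants.

```python
from typing import Tuple

# Hypothetical bounds for illustration; not the SDK's actual values.
SERVER_API_VERSION_MIN = (2, 0, 0)
SERVER_API_VERSION_MAX_EXCLUSIVE = (3, 0, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'v2.2.3' or '2.2.3' into (2, 2, 3)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_supported(server_version: str) -> bool:
    """True when MIN <= version < MAX_EXCLUSIVE (a half-open range)."""
    version = parse_version(server_version)
    return SERVER_API_VERSION_MIN <= version < SERVER_API_VERSION_MAX_EXCLUSIVE


for candidate in ("1.9.9", "v2.2.3", "3.0.0"):
    print(candidate, is_supported(candidate))
```

An exclusive upper bound lets one release line state "any 2.x server" without having to know the last 2.x version in advance.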

Testing

pytest -q                                                # offline (respx)
TURBO_OCR_BASE_URL=http://localhost:8000 pytest tests/integration -v
python examples/03_searchable_pdf.py                         # smoke test

License

MIT. See LICENSE.
