Python SDK for TurboOCR — fast GPU OCR server (HTTP + gRPC)

turboocr

Typed Python client for the TurboOCR server. Sync + async, HTTP + gRPC, layout-aware Markdown rendering, searchable-PDF generation.

Install

pip install turboocr             # HTTP client + CLI + searchable-PDF
pip install 'turboocr[grpc]'     # add the gRPC transport
pip install 'turboocr[all]'      # everything optional (currently == [grpc])

Requires Python 3.12+.

Quickstart

Start a TurboOCR server (the C++/CUDA OCR engine — this repo is just the Python client):

docker run --gpus all -p 8000:8000 -p 50051:50051 \
  -v trt-cache:/home/ocr/.cache/turbo-ocr \
  -e OCR_LANG=latin \
  ghcr.io/aiptimizer/turboocr:v2.2.3

OCR_LANG=latin (default) covers English, French, German, Spanish, …. Swap for chinese, greek, eslav, arabic, korean, or thai — all are baked in. See the TurboOCR repo for build-from-source, benchmarks, and the full set of server env vars.

Then recognise an image and turn a PDF into Markdown:

from turboocr import Client, render_to_markdown

with Client(base_url="http://localhost:8000") as client:
    # Image OCR
    img = client.recognize_image("page.png", layout=True, include_blocks=True)
    print(f"{len(img.results)} text items, {len(img.blocks)} blocks")
    print(img.text)

    # PDF → Markdown
    pdf = client.recognize_pdf("paper.pdf", dpi=150, include_blocks=True)
    print(render_to_markdown(pdf).markdown)

    # Searchable PDF (invisible text overlay)
    overlay = client.make_searchable_pdf("scan.pdf", dpi=200)
    # Write inside a context manager so the file handle is closed promptly
    with open("scan.searchable.pdf", "wb") as f:
        f.write(overlay)

That's the 80% case. Full runnable examples for async, gRPC, batch, retries, custom httpx.Client, hooks, Markdown styling, folder pipelines, and more live in examples/ — every script runs end-to-end against the bundled ACME invoice fixture.

What you get

  • Sync + async, HTTP + gRPC. Four clients (Client, AsyncClient, GrpcClient, AsyncGrpcClient) with identical method surfaces.
  • Typed, immutable responses (pydantic v2). IDE autocomplete, and if a newer server adds a field your SDK doesn't know about, parsing still succeeds — the extra lands on .model_extra instead of crashing.
  • Layout-aware Markdown. render_to_markdown(...) walks the reading order and maps each layout class (doc_title, display_formula, table, …) to a Markdown construct. Pluggable via MarkdownStyle.
  • Searchable PDFs. make_searchable_pdf(...) overlays an invisible text layer aligned to the page geometry. Auto-discovers a Unicode font for non-Latin scripts, or pass font_path=.
  • Production-friendly. Configurable retry policy (HTTP status + gRPC status + Retry-After), per-request timeouts, custom httpx.Client, on_request / on_response event hooks, uuid7 X-Request-ID per call.
  • Precise exception hierarchy. Maps the server's error_code to typed exceptions — see Errors.
  • turbo-ocr CLI included in the default install.

Today's server does plain OCR + layout classification. Table-structure and LaTeX-formula source are not yet emitted; the SDK exposes page.tables / page.formulas as a forward-compatible surface that populates automatically when those server features ship.
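The forward-compatible parsing described in the list above can be pictured with a small standard-library sketch. The SDK itself uses pydantic v2 for this; the frozen dataclass below only mimics the observable behaviour (unknown keys are kept rather than rejected). `TextItem`, its fields, `parse_text_item`, and the `rotation` key are all hypothetical names chosen for illustration, not part of the SDK.

```python
from dataclasses import dataclass, field, fields
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class TextItem:
    """Hypothetical response item; field names are illustrative only."""

    text: str
    confidence: float
    # Keys this client version does not know about land here instead of raising.
    extra: Mapping[str, Any] = field(default_factory=dict)


def parse_text_item(payload: Mapping[str, Any]) -> TextItem:
    """Split a payload into declared fields and preserved extras."""
    known = {f.name for f in fields(TextItem)} - {"extra"}
    declared = {k: v for k, v in payload.items() if k in known}
    unknown = {k: v for k, v in payload.items() if k not in known}
    # MappingProxyType gives a read-only view, matching the immutable response.
    return TextItem(**declared, extra=MappingProxyType(unknown))


# "rotation" stands in for a field added by a newer server.
item = parse_text_item({"text": "Invoice", "confidence": 0.98, "rotation": 90})
print(item.text, dict(item.extra))
```

With the real SDK, the equivalent place to look for such fields is `.model_extra` on the response object.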

Configuration

from turboocr import Client, RetryPolicy

client = Client(
    base_url="http://localhost:8000",   # or TURBO_OCR_BASE_URL env
    api_key="sk-...",                   # or TURBO_OCR_API_KEY env
    auth_scheme="bearer",               # "bearer" | "x-api-key"
    timeout=30.0,
    default_headers={"X-Tenant": "acme"},
    retry=RetryPolicy(attempts=5, backoff=0.5),
)

Pass http_client=httpx.Client(...) for custom TLS, connection limits, or proxies — see examples/08_custom_httpx_client.py.

Retry defaults: HTTP {429, 502, 503, 504}, gRPC {UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED}, 3 attempts, exponential backoff + jitter, Retry-After honoured. Tune via RetryPolicy(...) — see examples/07_retry_and_timeout.py.
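To make the retry defaults concrete, here is a minimal sketch of the decision a retry loop makes after each failed HTTP attempt. It is not the SDK's implementation: `should_retry` and `next_delay` are hypothetical helpers, "3 attempts" is assumed to mean three tries in total, and the jitter is assumed to be "full jitter" (uniform between zero and the exponential ceiling).

```python
import random
from typing import Optional

# Default retryable HTTP statuses as listed above.
RETRYABLE_HTTP = frozenset({429, 502, 503, 504})


def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """Return True if another try is allowed; `attempt` is 1-based."""
    return status in RETRYABLE_HTTP and attempt < max_attempts


def next_delay(
    attempt: int,
    backoff: float = 0.5,
    retry_after: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to sleep before the next try."""
    # A server-supplied Retry-After takes precedence over computed backoff.
    if retry_after is not None:
        return max(0.0, retry_after)
    rng = rng or random.Random()
    # Exponential ceiling: backoff, 2*backoff, 4*backoff, ...
    ceiling = backoff * 2 ** (attempt - 1)
    # Assumed "full jitter": pick uniformly below the ceiling.
    return rng.uniform(0.0, ceiling)


print(should_retry(503, attempt=1))        # another try is allowed
print(should_retry(404, attempt=1))        # 404 is not in the retryable set
print(next_delay(2, retry_after=7.0))      # Retry-After wins
```

Jitter spreads out retries from many clients so they do not hit a recovering server at the same instant.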

Errors

TurboOcrError
├── APIConnectionError       # transport-level
│   ├── Timeout
│   ├── NetworkError
│   └── ProtocolError
├── InvalidParameter         # 4xx: bad params / headers / dims
├── EmptyBody                # 4xx: empty body / batch / PDF
├── LayoutDisabled           # asked for layout when server has it off
├── ImageDecodeError         # bad bytes / bad base64
├── DimensionsTooLarge       # image / PDF over server limits
├── PoolExhausted            # "Server at capacity"
├── PdfRenderError           # PDF rasterization failed
└── ServerError              # 5xx, no specific code

Server-side exceptions carry .code, .status_code, and .payload. Transport exceptions inherit from APIConnectionError.
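Because the hierarchy is a tree, the order of `except` clauses matters: list subclasses before their parents. The sketch below uses local stand-in classes that mirror part of the tree so it runs without the SDK or a server; with the SDK installed you would import the real classes instead (the exact import path is not shown here, so treat it as something to check). `triage` and its return strings are hypothetical.

```python
from typing import Callable


# Local stand-ins mirroring part of the tree above (not the SDK's classes).
class TurboOcrError(Exception): ...
class APIConnectionError(TurboOcrError): ...
class Timeout(APIConnectionError): ...
class NetworkError(APIConnectionError): ...
class PoolExhausted(TurboOcrError): ...
class InvalidParameter(TurboOcrError): ...


def triage(call: Callable[[], object]) -> str:
    """Run `call` and map any failure to a coarse next step."""
    try:
        call()
    except (Timeout, PoolExhausted):
        # Transient: the server is slow or at capacity.
        return "retry-later"
    except APIConnectionError:
        # Remaining transport failures, e.g. connection refused.
        return "check-server"
    except TurboOcrError:
        # Everything else the server rejected or failed on.
        return "fix-request"
    return "ok"


def fail_with(exc: Exception) -> Callable[[], object]:
    """Build a callable that raises `exc`, standing in for a client call."""
    def _call() -> object:
        raise exc
    return _call


print(triage(fail_with(Timeout("deadline hit"))))
print(triage(fail_with(NetworkError("Connection refused"))))
print(triage(fail_with(InvalidParameter("bad dpi"))))
```

If `except APIConnectionError` came first, it would swallow `Timeout` and the transient case would never be reached.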

| Symptom | Cause | Fix |
| --- | --- | --- |
| NetworkError: Connection refused | server not running | start the docker container (above) |
| DimensionsTooLarge | image > MAX_IMAGE_DIM (default 16384) | downscale, or raise the server limit |
| LayoutDisabled | server started with DISABLE_LAYOUT=1 | restart without that env var |
| PoolExhausted | server queue full | retry with backoff, or scale PIPELINE_POOL_SIZE |
| Timeout | per-request timeout hit | pass timeout=N, or raise RetryPolicy.attempts |
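For the DimensionsTooLarge case, the "downscale" fix comes down to a little arithmetic. The helper below is a hypothetical sketch, assuming the limit applies to each side of the image and that a side equal to the limit is accepted; it computes target dimensions only and leaves the actual resizing to whatever imaging library you use.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_dim: int = 16384) -> Tuple[int, int]:
    """Return (width, height) scaled down so the longest side is <= max_dim.

    The aspect ratio is preserved; images already within the limit are
    returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_dim:
        return width, height
    # Integer arithmetic avoids floating-point rounding at the boundary.
    new_width = max(1, width * max_dim // longest)
    new_height = max(1, height * max_dim // longest)
    return new_width, new_height


print(fit_within(20000, 10000))
print(fit_within(800, 600))
```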

CLI

turbo-ocr ocr page.png --output markdown
turbo-ocr pdf doc.pdf --dpi 150 --output json
turbo-ocr searchable-pdf doc.pdf -o out.pdf --font-path /path/to/font.ttf
turbo-ocr health --ready

--output accepts json | blocks | text | markdown. Reads TURBO_OCR_BASE_URL and TURBO_OCR_API_KEY from the environment. Run turbo-ocr --help for the full surface.
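When driving the CLI from a script, it can help to validate the format and set the environment before spawning the process. The sketch below only builds the argument list and environment; `build_ocr_command` is a hypothetical helper, and the CLI's exit codes and output streams are not documented here, so handling of the result is left out.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

# Values accepted by --output, as listed above.
OUTPUT_FORMATS = ("json", "blocks", "text", "markdown")


def build_ocr_command(
    image_path: str,
    output: str = "json",
    base_url: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for a `turbo-ocr ocr` invocation."""
    if output not in OUTPUT_FORMATS:
        raise ValueError(f"output must be one of {OUTPUT_FORMATS}, got {output!r}")
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if environ is None else environ)
    if base_url is not None:
        env["TURBO_OCR_BASE_URL"] = base_url
    argv = ["turbo-ocr", "ocr", image_path, "--output", output]
    return argv, env


argv, env = build_ocr_command(
    "page.png", output="markdown", base_url="http://localhost:8000"
)
print(" ".join(argv))
```

The pair can then be handed to `subprocess.run(argv, env=env, capture_output=True, text=True)`.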

Logging

import logging
logging.getLogger("turboocr").setLevel(logging.DEBUG)

Emits method path -> status (Xms) [req=<short-id>] per HTTP request. Retry warnings go to turboocr.retry / turboocr.grpc.retry. Searchable-PDF font resolution logs to turboocr.searchable_pdf. Every HTTP request sends a uuid7 X-Request-ID header (gRPC uses x-request-id metadata).
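Since the logger names above form a hierarchy, standard `logging` configuration can set different verbosity per area. The sketch below keeps the SDK quiet overall while enabling debug output for font resolution. The log records are simulated so the snippet runs without the SDK; the levels at which the SDK emits each message are an assumption, apart from retry messages being warnings.

```python
import io
import logging

# StringIO stands in for a file or stderr so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

# One handler on the parent logger receives records from all child loggers.
sdk_log = logging.getLogger("turboocr")
sdk_log.addHandler(handler)
sdk_log.setLevel(logging.WARNING)  # quiet by default

# Opt in to verbose output for one area only.
logging.getLogger("turboocr.searchable_pdf").setLevel(logging.DEBUG)

# Simulated records (the SDK would emit these itself).
logging.getLogger("turboocr").debug("simulated request line")
logging.getLogger("turboocr.retry").warning("simulated retry")
logging.getLogger("turboocr.searchable_pdf").debug("simulated font lookup")

print(stream.getvalue(), end="")
```

The first simulated record is dropped because the parent logger's level is WARNING; the other two reach the handler through propagation.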

Learn more

  • examples/ — 13 runnable scripts (each runs against the bundled ACME invoice fixture, no server config needed beyond TURBO_OCR_BASE_URL)
  • docs/ — full docs source (MkDocs + mkdocstrings, deployed at https://aiptimizer.github.io/TurboOCR-python/). Preview locally with uv run --extra docs mkdocs serve -f docs/mkdocs.yml
  • Server compatibility: SERVER_API_VERSION_MIN / SERVER_API_VERSION_MAX_EXCLUSIVE document the supported server range; extra="allow" on response models means additive server changes don't break parsing

Testing

pytest -q                                                # offline (respx)
TURBO_OCR_BASE_URL=http://localhost:8000 pytest tests/integration -v
python examples/03_searchable_pdf.py                         # smoke test

License

MIT. See LICENSE.
