Python SDK for TurboOCR — fast GPU OCR server (HTTP + gRPC)
turboocr
Typed Python client for the TurboOCR server. Sync + async, HTTP + gRPC, layout-aware Markdown rendering, searchable-PDF generation.
Install
pip install turboocr # HTTP client + CLI + searchable-PDF
pip install 'turboocr[grpc]' # add the gRPC transport
pip install 'turboocr[all]' # everything optional (currently == [grpc])
Requires Python 3.12+.
Quickstart
Start a TurboOCR server (the C++/CUDA OCR engine — this repo is just the Python client):
docker run --gpus all -p 8000:8000 -p 50051:50051 \
  -v trt-cache:/home/ocr/.cache/turbo-ocr \
  -e OCR_LANG=latin \
  ghcr.io/aiptimizer/turboocr:v2.2.3
OCR_LANG=latin (default) covers English, French, German, Spanish, …. Swap for
chinese, greek, eslav, arabic, korean, or thai — all are baked in.
See the TurboOCR repo for build-from-source,
benchmarks, and the full set of server env vars.
Then recognise an image and turn a PDF into Markdown:
from turboocr import Client, render_to_markdown

with Client(base_url="http://localhost:8000") as client:
    # Image OCR
    img = client.recognize_image("page.png", layout=True, include_blocks=True)
    print(f"{len(img.results)} text items, {len(img.blocks)} blocks")
    print(img.text)

    # PDF → Markdown
    pdf = client.recognize_pdf("paper.pdf", dpi=150, include_blocks=True)
    print(render_to_markdown(pdf).markdown)

    # Searchable PDF (invisible text overlay)
    overlay = client.make_searchable_pdf("scan.pdf", dpi=200)
    with open("scan.searchable.pdf", "wb") as f:  # close the file deterministically
        f.write(overlay)
That's the 80% case. Full runnable examples for async, gRPC, batch, retries,
custom httpx.Client, hooks, Markdown styling, folder pipelines, and more live
in examples/ — every script runs end-to-end against the bundled
ACME invoice fixture.
What you get
- Sync + async, HTTP + gRPC. Four clients (Client, AsyncClient, GrpcClient, AsyncGrpcClient) with identical method surfaces.
- Typed, immutable responses (pydantic v2). IDE autocomplete, and if a newer server adds a field your SDK doesn't know about, parsing still succeeds: the extra lands on .model_extra instead of crashing.
- Layout-aware Markdown. render_to_markdown(...) walks the reading order and maps each layout class (doc_title, display_formula, table, …) to a Markdown construct. Pluggable via MarkdownStyle.
- Searchable PDFs. make_searchable_pdf(...) overlays an invisible text layer aligned to the page geometry. Auto-discovers a Unicode font for non-Latin scripts, or pass font_path=.
- Production-friendly. Configurable retry policy (HTTP status + gRPC status + Retry-After), per-request timeouts, custom httpx.Client, on_request / on_response event hooks, uuid7 X-Request-ID per call.
- Precise exception hierarchy. Maps the server's error_code to typed exceptions; see Errors.
- turbo-ocr CLI included in the default install.
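The layout-aware Markdown idea above (walk blocks in reading order, map each layout class to a Markdown construct) can be illustrated with a small standalone sketch. This is not the SDK's implementation: the `Block` dataclass, the `FORMATTERS` table, the `order` field, and the `"text"` class name are all hypothetical stand-ins; only the `doc_title` and `display_formula` class names come from the list above.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a layout block; NOT the SDK's real response model.
@dataclass(frozen=True)
class Block:
    layout_class: str
    text: str
    order: int  # position in reading order (assumed field)


# Hypothetical mapping from layout class to a Markdown formatter.
FORMATTERS = {
    "doc_title": lambda t: f"# {t}",
    "display_formula": lambda t: f"$$\n{t}\n$$",
}


def render(blocks):
    """Sort blocks by reading order and format each by its layout class."""
    ordered = sorted(blocks, key=lambda b: b.order)
    # Unknown classes fall back to plain text.
    parts = [FORMATTERS.get(b.layout_class, lambda t: t)(b.text) for b in ordered]
    return "\n\n".join(parts)


blocks = [
    Block("text", "Total due: 42 EUR", order=1),
    Block("doc_title", "Invoice", order=0),
]
markdown = render(blocks)
print(markdown)
```

A lookup table with a plain-text fallback keeps the renderer tolerant of layout classes it has never seen, which is presumably the kind of behaviour a pluggable style object would let you override.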
Today's server does plain OCR + layout classification. Table-structure and
LaTeX-formula source are not yet emitted; the SDK exposes page.tables /
page.formulas as a forward-compatible surface that populates automatically
when those server features ship.
Configuration
from turboocr import Client, RetryPolicy

client = Client(
    base_url="http://localhost:8000",      # or TURBO_OCR_BASE_URL env
    api_key="sk-...",                      # or TURBO_OCR_API_KEY env
    auth_scheme="bearer",                  # "bearer" | "x-api-key"
    timeout=30.0,
    default_headers={"X-Tenant": "acme"},
    retry=RetryPolicy(attempts=5, backoff=0.5),
)
Pass http_client=httpx.Client(...) for custom TLS, connection limits, or
proxies — see examples/08_custom_httpx_client.py.
Retry defaults: HTTP {429, 502, 503, 504}, gRPC
{UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED}, 3 attempts, exponential
backoff + jitter, Retry-After honoured. Tune via RetryPolicy(...) — see
examples/07_retry_and_timeout.py.
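To make the retry defaults concrete, here is a minimal standalone sketch of exponential backoff with jitter that honours Retry-After. It is an illustration under assumptions, not the SDK's code: the function names, the doubling factor, and the jitter range are hypothetical; only the status set, the 3-attempt default, and the 0.5 backoff value are taken from the text above.

```python
import random

# Retryable HTTP statuses, as listed in the defaults above.
RETRYABLE_HTTP = {429, 502, 503, 504}


def should_retry(status, attempt, attempts=3):
    """True if `status` is retryable and attempts remain (attempt is 1-based)."""
    return status in RETRYABLE_HTTP and attempt < attempts


def next_delay(attempt, backoff=0.5, retry_after=None, rng=random.random):
    """Seconds to wait after failed attempt number `attempt` (1-based).

    Assumed policy: a server-provided Retry-After wins; otherwise the base
    delay doubles per attempt and is scaled by a jitter factor in [0.5, 1.0).
    """
    if retry_after is not None:
        return float(retry_after)
    base = backoff * (2 ** (attempt - 1))
    return base * (0.5 + rng() / 2)


for attempt in (1, 2, 3):
    print(attempt, should_retry(503, attempt), round(next_delay(attempt), 3))
```

Jitter spreads retries from many clients over time so they do not hit a recovering server in lockstep; passing `rng` as a parameter keeps the function deterministic under test.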
Errors
TurboOcrError
├── APIConnectionError # transport-level
│ ├── Timeout
│ ├── NetworkError
│ └── ProtocolError
├── InvalidParameter # 4xx: bad params / headers / dims
├── EmptyBody # 4xx: empty body / batch / PDF
├── LayoutDisabled # asked for layout when server has it off
├── ImageDecodeError # bad bytes / bad base64
├── DimensionsTooLarge # image / PDF over server limits
├── PoolExhausted # "Server at capacity"
├── PdfRenderError # PDF rasterization failed
└── ServerError # 5xx, no specific code
Server-side exceptions carry .code, .status_code, and .payload. Transport
exceptions inherit from APIConnectionError.
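Because the hierarchy is a tree, callers can catch at whatever granularity suits them. The sketch below shows one possible split between "worth retrying" and "give up". The classes are local stand-ins that mirror part of the tree above so the example runs without the SDK or a server; the `run_guarded` and `fail_with` helpers are hypothetical, and real code would catch the SDK's own exception classes instead.

```python
# Stand-in classes mirroring part of the tree above (NOT imported from the SDK).
class TurboOcrError(Exception):
    pass


class APIConnectionError(TurboOcrError):
    pass


class Timeout(APIConnectionError):
    pass


class PoolExhausted(TurboOcrError):
    pass


class InvalidParameter(TurboOcrError):
    pass


def run_guarded(call):
    """Run `call` and translate failures into a coarse decision."""
    try:
        return ("ok", call())
    except (APIConnectionError, PoolExhausted) as exc:
        # Transport problems and a full server queue are usually transient.
        return ("retry", type(exc).__name__)
    except TurboOcrError as exc:
        # Anything else in the tree is treated as a permanent failure here.
        return ("fail", type(exc).__name__)


def fail_with(exc):
    """Hypothetical helper: build a callable that raises `exc`."""
    def _call():
        raise exc
    return _call


outcomes = [
    run_guarded(lambda: "text"),
    run_guarded(fail_with(Timeout())),
    run_guarded(fail_with(PoolExhausted())),
    run_guarded(fail_with(InvalidParameter())),
]
print(outcomes)
```

Ordering matters: the narrower `except` clause must come before the `TurboOcrError` catch-all, otherwise the base class would swallow every case.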
| Symptom | Cause | Fix |
|---|---|---|
| NetworkError: Connection refused | server not running | start the docker container (above) |
| DimensionsTooLarge | image > MAX_IMAGE_DIM (default 16384) | downscale, or raise the server limit |
| LayoutDisabled | server started with DISABLE_LAYOUT=1 | restart without that env var |
| PoolExhausted | server queue full | retry with backoff, or scale PIPELINE_POOL_SIZE |
| Timeout | per-request timeout hit | pass timeout=N, or raise RetryPolicy.attempts |
CLI
turbo-ocr ocr page.png --output markdown
turbo-ocr pdf doc.pdf --dpi 150 --output json
turbo-ocr searchable-pdf doc.pdf -o out.pdf --font-path /path/to/font.ttf
turbo-ocr health --ready
--output accepts json | blocks | text | markdown. Reads TURBO_OCR_BASE_URL
and TURBO_OCR_API_KEY from the environment. Run turbo-ocr --help
for the full surface.
Logging
import logging
logging.getLogger("turboocr").setLevel(logging.DEBUG)
Emits one line per HTTP request in the form method path -> status (Xms) [req=<short-id>]. Retry
warnings go to turboocr.retry / turboocr.grpc.retry. Searchable-PDF font
resolution logs to turboocr.searchable_pdf. Every HTTP request sends a uuid7
X-Request-ID header (gRPC uses x-request-id metadata).
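Since the logger names above form a dotted hierarchy, the standard `logging` module lets you turn on debug output for the whole SDK and still quiet a single child. The sketch below uses only the stdlib and the logger names mentioned above; the choice of which child to quiet and the format string are just examples.

```python
import logging

# Attach a handler to the root logger so records are actually printed.
logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")

# Debug output for everything under the "turboocr" namespace...
logging.getLogger("turboocr").setLevel(logging.DEBUG)

# ...except font-resolution messages, which are limited to warnings and above.
logging.getLogger("turboocr.searchable_pdf").setLevel(logging.WARNING)

# Children without an explicit level inherit from the nearest configured parent.
retry_level = logging.getLogger("turboocr.retry").getEffectiveLevel()
grpc_retry_level = logging.getLogger("turboocr.grpc.retry").getEffectiveLevel()
pdf_level = logging.getLogger("turboocr.searchable_pdf").getEffectiveLevel()
print(retry_level, grpc_retry_level, pdf_level)
```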
Learn more
- examples/: 13 runnable scripts (each runs against the bundled ACME invoice fixture, no server config needed beyond TURBO_OCR_BASE_URL)
- docs/: full docs source (MkDocs + mkdocstrings, deployed at https://aiptimizer.github.io/TurboOCR-python/). Preview locally with uv run --extra docs mkdocs serve -f docs/mkdocs.yml
- Server compatibility: SERVER_API_VERSION_MIN / SERVER_API_VERSION_MAX_EXCLUSIVE document the supported server range; extra="allow" on response models means additive server changes don't break parsing
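A minimum plus an exclusive maximum describes a half-open range. The sketch below shows how such a check could work, assuming dotted numeric version strings; the helper names and the example bounds ("2.0.0", "3.0.0") are hypothetical and are not the SDK's actual constant values or format.

```python
def parse_version(version):
    """Turn '2.10.0' into (2, 10, 0) so comparison is numeric, not lexical."""
    return tuple(int(part) for part in version.split("."))


def is_supported(server_version, minimum, maximum_exclusive):
    """True if minimum <= server_version < maximum_exclusive."""
    return (
        parse_version(minimum)
        <= parse_version(server_version)
        < parse_version(maximum_exclusive)
    )


# Hypothetical bounds, for illustration only.
VERSION_MIN = "2.0.0"
VERSION_MAX_EXCLUSIVE = "3.0.0"

for candidate in ("2.2.3", "2.10.0", "3.0.0"):
    print(candidate, is_supported(candidate, VERSION_MIN, VERSION_MAX_EXCLUSIVE))
```

Comparing integer tuples avoids the classic string-comparison trap where "2.10.0" sorts before "2.2.3".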
Testing
pytest -q # offline (respx)
TURBO_OCR_BASE_URL=http://localhost:8000 pytest tests/integration -v
python examples/03_searchable_pdf.py # smoke test
License
MIT. See LICENSE.