markdownbridge

Python SDK for the MarkdownBridge OCR API — convert documents and images to Markdown.

Installation

pip install markdownbridge

Quick Start

from markdownbridge import MarkdownBridge

client = MarkdownBridge(api_key="ocrb_prd_xxx")

# One-liner: URL → Markdown
result = client.ocr("https://example.com/invoice.pdf")
print(result.markdown)

# One-liner: local file → Markdown
result = client.ocr("./receipt.png")
print(result.markdown)

Authentication

Pass your API key directly or set the MARKDOWNBRIDGE_API_KEY environment variable:

export MARKDOWNBRIDGE_API_KEY="ocrb_prd_xxx"

client = MarkdownBridge()  # reads from env
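The fallback described above can be pictured roughly as follows. This is a minimal sketch, not the SDK's actual code: `resolve_api_key` is a hypothetical helper, and the assumption that an explicit argument wins over the environment variable is ours rather than something the README states.

```python
import os
from typing import Optional

ENV_VAR = "MARKDOWNBRIDGE_API_KEY"  # variable name taken from the README


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else fall back to the environment."""
    key = explicit or os.environ.get(ENV_VAR)
    if not key:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise ValueError(f"No API key given and {ENV_VAR} is not set")
    return key
```

Keeping the key in the environment keeps it out of source control, which is usually the safer default for anything beyond a quick experiment.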

Client Options

client = MarkdownBridge(
    api_key="ocrb_prd_xxx",                        # or env MARKDOWNBRIDGE_API_KEY
    base_url="https://api.markdownbridge.com",      # default
    timeout=30.0,                                    # request timeout in seconds
    max_retries=3,                                   # retry 5xx errors with backoff
)
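The README does not say how `max_retries` spaces its attempts. As a rough illustration of retrying 5xx responses with backoff, here is a sketch that assumes exponential delays. `call_with_retries`, `FakeServerError`, and `base_delay` are hypothetical names and are not part of the SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FakeServerError(Exception):
    """Stand-in for a failed HTTP response; not the SDK's ServerError."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"server error {status_code}")
        self.status_code = status_code


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to max_retries times on 5xx errors.

    Delays double on each retry: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    attempt = 0
    while True:
        try:
            return fn()
        except FakeServerError as exc:
            # Only 5xx is retried; give up once the retry budget is spent.
            if not 500 <= exc.status_code < 600 or attempt >= max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

The `sleep` parameter is injectable only so the loop can be exercised without real waiting.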

API Reference

`client.ocr(source, **opts)`

The convenience method: give it a URL or a local file path, and it submits the job, polls until it finishes, and returns a ProcessingResult.

result = client.ocr(
    "https://example.com/doc.pdf",
    language="en",
    output_format="markdown",
    enhance_quality=True,
    poll_interval=2.0,     # seconds between status checks
    poll_timeout=300.0,    # max wait time
)
print(result.markdown)
print(result.page_count)

`client.process_url(file_url, **opts)`

Submit a URL for processing without waiting for completion.

proc = client.process_url("https://example.com/doc.pdf")
print(proc.process_id)  # use with get_status() / wait_for_completion()

`client.process_file(file_path, **opts)`

Upload a local file and submit it for processing.

proc = client.process_file("./invoice.pdf")
print(proc.process_id)

`client.upload_file(file_path)`

Upload a file without processing it.

upload = client.upload_file("./photo.png")
print(upload.document_id)

`client.get_status(process_id)`

Check the current status of a processing job.

status = client.get_status("uuid-here")
print(status.status)   # queued | processing | completed | failed
print(status.progress)  # 0–100
print(status.stage)     # queued | download | ocr | llm_improvement | completed | failed

`client.wait_for_completion(process_id, **opts)`

Poll until the job completes or fails.

result = client.wait_for_completion(
    "uuid-here",
    poll_interval=2.0,
    poll_timeout=300.0,
    on_status_change=lambda s: print(f"Status: {s.status} ({s.stage})"),
)
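Conceptually, this kind of polling amounts to a loop like the one below. It is a sketch under stated assumptions, not the SDK's implementation: `poll_until_done`, `FakeStatus`, and the injected `get_status`, `sleep`, and `clock` callables are hypothetical, and firing the callback only when `status` or `stage` changes is a guess at what `on_status_change` means. The sketch returns the final status object, whereas the real method presumably returns the result and raises on failure.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FakeStatus:
    """Stand-in for ProcessingStatus with only the fields the loop needs."""
    status: str  # queued | processing | completed | failed
    stage: str


def poll_until_done(
    get_status: Callable[[], FakeStatus],
    poll_interval: float = 2.0,
    poll_timeout: float = 300.0,
    on_status_change: Optional[Callable[[FakeStatus], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> FakeStatus:
    """Poll get_status until a terminal state or until poll_timeout elapses."""
    deadline = clock() + poll_timeout
    last = None
    while True:
        current = get_status()
        # Notify the caller only when something actually changed.
        if on_status_change and (current.status, current.stage) != last:
            on_status_change(current)
        last = (current.status, current.stage)
        if current.status in ("completed", "failed"):
            return current
        if clock() >= deadline:
            # Built-in TimeoutError here; the SDK defines its own class of the same name.
            raise TimeoutError(f"job still {current.status} after {poll_timeout}s")
        sleep(poll_interval)
```

Using a monotonic clock for the deadline keeps the timeout correct even if the system clock is adjusted while the job runs.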

`client.list_results(**filters)`

Fetch paginated results.

page = client.list_results(limit=20, offset=0, status="completed")
for item in page.data:
    print(item.file_name, item.status)
print(f"Total: {page.pagination.total}")

`client.iter_results(**filters)`

Auto-paginating iterator over all results.

for item in client.iter_results(status="completed"):
    print(item.file_name)
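An auto-paginating iterator of this kind can be approximated with a generator that follows `next_offset` until `has_more` is false. The sketch below is an illustration, not the SDK's code: `iter_all`, `FakePage`, `FakePagination`, and the `fetch_page(limit, offset)` callable are hypothetical, with field names borrowed from the Data Types list.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass(frozen=True)
class FakePagination:
    total: int
    limit: int
    offset: int
    has_more: bool
    next_offset: Optional[int]


@dataclass(frozen=True)
class FakePage:
    data: List[str]
    pagination: FakePagination


def iter_all(
    fetch_page: Callable[[int, int], FakePage],
    limit: int = 20,
) -> Iterator[str]:
    """Yield every item by following next_offset until has_more is false."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page.data
        # Stop when the server reports no further pages.
        if not page.pagination.has_more or page.pagination.next_offset is None:
            return
        offset = page.pagination.next_offset
```

Because it is a generator, pages are fetched lazily: breaking out of the `for` loop early means later pages are never requested.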

`client.get_result(result_id)`

Fetch a specific result by ID.

result = client.get_result("uuid-here")
print(result.result.markdown)

`client.info()`

Get API version and status.

info = client.info()
print(info.version, info.status)

Async Usage

Every method has an async equivalent via AsyncMarkdownBridge:

import asyncio
from markdownbridge import AsyncMarkdownBridge

async def main():
    async with AsyncMarkdownBridge(api_key="ocrb_prd_xxx") as client:
        result = await client.ocr("https://example.com/invoice.pdf")
        print(result.markdown)

        # Auto-paginating async iteration
        async for item in client.iter_results():
            print(item.file_name)

asyncio.run(main())
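One common reason to reach for an async client is fanning out over many documents. The sketch below bounds concurrency with `asyncio.Semaphore`. To stay runnable without network access it uses `fake_ocr` as a stand-in for `await client.ocr(url)`; `ocr_many` and `max_concurrency` are hypothetical names, and the README does not state any concurrency limits for the API.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fake_ocr(url: str) -> str:
    """Stand-in for `await client.ocr(url)`; returns placeholder Markdown."""
    await asyncio.sleep(0)  # yield control as a real network call would
    return f"# Markdown for {url}"


async def ocr_many(
    urls: List[str],
    ocr: Callable[[str], Awaitable[str]] = fake_ocr,
    max_concurrency: int = 4,
) -> List[str]:
    """Run ocr over urls with at most max_concurrency calls in flight."""
    gate = asyncio.Semaphore(max_concurrency)

    async def one(url: str) -> str:
        async with gate:
            return await ocr(url)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(one(u) for u in urls))
```

In real use, the `ocr` argument would be something like `lambda u: client.ocr(u)` inside the `async with` block shown above.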

Error Handling

All exceptions inherit from MarkdownBridgeError and include status_code, error_code, and correlation_id:

from markdownbridge import (
    MarkdownBridge,
    MarkdownBridgeError,  # base class, needed for the final except clause
    RateLimitError,
    AuthenticationError,
)

client = MarkdownBridge(api_key="ocrb_prd_xxx")

try:
    result = client.ocr("https://example.com/doc.pdf")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited — retry after {e.retry_after}s")
except MarkdownBridgeError as e:
    print(f"API error {e.status_code}: {e}")
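Since `RateLimitError` exposes `retry_after`, a caller can wait that long and try again instead of just reporting the error. The following is a sketch of that idea using a local stand-in exception. `FakeRateLimitError`, `call_respecting_rate_limit`, `max_attempts`, and `default_wait` are hypothetical, and whether `retry_after` can be `None` is an assumption.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class FakeRateLimitError(Exception):
    """Stand-in for RateLimitError; retry_after is in seconds."""

    def __init__(self, retry_after: Optional[float]) -> None:
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_respecting_rate_limit(
    fn: Callable[[], T],
    max_attempts: int = 3,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, sleeping for retry_after between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except FakeRateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Fall back to default_wait when the server gave no hint.
            sleep(exc.retry_after if exc.retry_after is not None else default_wait)
    raise ValueError("max_attempts must be at least 1")
```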

Exception Hierarchy

Exception	HTTP Status	When
`AuthenticationError`	401	Invalid or missing API key
`ValidationError`	400/422	Invalid request parameters
`NotFoundError`	404	Resource not found
`RateLimitError`	429	Too many requests
`InsufficientCreditsError`	402	Account has no credits
`ServerError`	5xx	Server-side failure
`ProcessingError`	—	OCR job failed
`FileUploadError`	—	Upload failed
`TimeoutError`	—	Polling exceeded timeout
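The HTTP rows of the table above suggest a simple status-to-exception lookup. Here is a self-contained sketch of such a mapping using locally defined stand-in classes. `BridgeError` and `error_for_status` are hypothetical, and falling back to the base class for unlisted codes is an assumption rather than documented behavior.

```python
from typing import Dict, Optional, Type


class BridgeError(Exception):
    """Stand-in for MarkdownBridgeError."""

    def __init__(self, message: str, status_code: Optional[int] = None) -> None:
        super().__init__(message)
        self.status_code = status_code


class ValidationError(BridgeError): pass
class AuthenticationError(BridgeError): pass
class InsufficientCreditsError(BridgeError): pass
class NotFoundError(BridgeError): pass
class RateLimitError(BridgeError): pass
class ServerError(BridgeError): pass


# Status codes copied from the table above.
_BY_STATUS: Dict[int, Type[BridgeError]] = {
    400: ValidationError,
    401: AuthenticationError,
    402: InsufficientCreditsError,
    404: NotFoundError,
    422: ValidationError,
    429: RateLimitError,
}


def error_for_status(status_code: int) -> Type[BridgeError]:
    """Pick the exception class for an HTTP status code."""
    if status_code in _BY_STATUS:
        return _BY_STATUS[status_code]
    if 500 <= status_code < 600:
        return ServerError
    return BridgeError  # anything unlisted falls back to the base class
```

Because every class derives from one base, an `except` clause on the base class catches all of them, which is why it should come last in a `try` block.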

Data Types

All response types are frozen dataclasses:

ProcessResponse — process_id, status, file_id, stage
ProcessingStatus — process_id, status, progress, stage, result, error
ProcessingResult — text, markdown, json, page_count, processing_time
UploadResponse — file_key, public_url, document_id
ResultItem — id, process_id, file_name, status, result
ResultsPage — data, pagination
Pagination — total, limit, offset, has_more, next_offset
ApiInfo — version, status, endpoints
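In practice, "frozen" means attribute assignment raises an error, so responses behave as read-only values. The sketch below shows this with a local mirror of `Pagination`. The field names come from the list above, but the field types and the default for `next_offset` are assumptions.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Pagination:
    """Local mirror of the documented fields; types are assumed."""
    total: int
    limit: int
    offset: int
    has_more: bool
    next_offset: Optional[int] = None


page = Pagination(total=45, limit=20, offset=0, has_more=True, next_offset=20)

try:
    page.offset = 20  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# dataclasses.replace builds a modified copy instead of mutating in place.
next_page = replace(page, offset=20, next_offset=40)
```

Frozen dataclasses with `eq=True` (the default) are also hashable when all their fields are, so values like this can be used as dictionary keys or set members.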

License

MIT

Version 0.1.0, released Mar 17, 2026.
