A Python async SDK that wraps the PaddleOCR AI Studio API into a clean, type-safe interface.

License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3

paddleocr-api-python

English

A Python async SDK that wraps the PaddleOCR AI Studio API into a clean, type-safe interface. Upload a document, await the result, and get Markdown back — without touching raw HTTP.

Features

- Async-first: built on httpx.AsyncClient and asyncio, with native context manager support.
- Full model coverage: PaddleOCR-VL-1.6 (default), PaddleOCR-VL-1.5, PaddleOCR-VL, PP-OCRv5, and PaddleOCR.
- Flexible input: submit by local file path, raw bytes, or remote URL.
- Rich job control: poll real-time state, extracted page count, start/end times, and error messages.
- Markdown export: get a clean Markdown document plus the URLs of all embedded images.
- Fine-grained options: toggle layout detection, chart/seal/table recognition, cross-page table merging, title leveling, NMS, image orientation correction, and more.

Installation

pip install paddleocr-api-python

Dependencies: aiofiles, httpx, typing-extensions, python-dotenv.

Authentication

Get an access token from https://aistudio.baidu.com/account/accessToken.

Either pass it explicitly:

from paddleocr_api import AistudioClient
client = AistudioClient(api_key="your_token_here")

Or set it via environment variable (a .env file is loaded automatically):

AISTUDIO_ACCESS_TOKEN=your_token_here
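The section above does not say which source wins when both are present. A minimal sketch of resolving the token in your own code before constructing the client, assuming an explicit value should take precedence over the environment variable. `resolve_token` is a hypothetical helper, not part of the SDK:

```python
import os
from typing import Mapping, Optional

ENV_VAR = "AISTUDIO_ACCESS_TOKEN"  # variable name taken from the section above


def resolve_token(explicit: Optional[str] = None,
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Return the token to use, preferring an explicit value over the environment.

    Hypothetical helper: the precedence shown here is an assumption,
    not documented SDK behavior.
    """
    env = os.environ if env is None else env
    token = (explicit or env.get(ENV_VAR, "")).strip()
    if not token:
        raise ValueError(f"no access token: pass one explicitly or set {ENV_VAR}")
    return token
```

The result can then be passed as `AistudioClient(api_key=resolve_token())`, which fails early with a clear message when no token is configured.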

Quick Start

import asyncio
from paddleocr_api import AistudioClient, State

async def main():
    async with AistudioClient() as client:
        job = await client.create_job(file_path="paper.pdf")

        async with job:
            while True:
                state = await job.state
                if state == State.DONE:
                    break
                if state == State.FAILED:
                    raise RuntimeError(await job.error_message)
                await asyncio.sleep(5)

            markdown = await job.markdown
            with open("output.md", "w", encoding="utf-8") as f:
                f.write(markdown.text)

asyncio.run(main())
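The loop above polls forever if the job never reaches a terminal state. A minimal sketch of the same polling pattern with an overall timeout, written against a generic async callable so it runs without the SDK. `poll_until` and the fake job are illustrative names, not SDK APIs:

```python
import asyncio
import time
from typing import Awaitable, Callable, Collection, TypeVar

S = TypeVar("S")


async def poll_until(
    get_state: Callable[[], Awaitable[S]],
    terminal: Collection[S],
    interval: float = 5.0,
    timeout: float = 600.0,
) -> S:
    """Call get_state() every `interval` seconds until it returns a terminal state.

    Raises TimeoutError if no terminal state is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = await get_state()
        if state in terminal:
            return state
        # Give up if the next sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"still {state!r} after {timeout} seconds")
        await asyncio.sleep(interval)


async def _poll_demo() -> str:
    # Stand-in for a job: reports RUNNING twice, then DONE.
    states = iter(["RUNNING", "RUNNING", "DONE"])

    async def fake_state() -> str:
        return next(states)

    return await poll_until(fake_state, terminal={"DONE", "FAILED"}, interval=0.01)


final_state = asyncio.run(_poll_demo())
```

With the SDK, `get_state` could be a small wrapper that returns `await job.state`, and `terminal` would be `{State.DONE, State.FAILED}`.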

Submitting Jobs

create_job accepts any one of three input modes:

# From a local path
await client.create_job(file_path="doc.pdf")

# From bytes already in memory
await client.create_job(file_bytes=pdf_bytes)

# From a public URL
await client.create_job(file_url="https://example.com/doc.pdf")
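When the input arrives as a single value of unknown kind, a small dispatcher can pick the matching keyword. This is a sketch only: `job_source_kwargs` is a hypothetical helper, and treating every non-URL string as a local path is an assumption.

```python
from pathlib import Path
from typing import Any, Dict, Union
from urllib.parse import urlparse


def job_source_kwargs(source: Union[str, bytes, bytearray, Path]) -> Dict[str, Any]:
    """Map one input value to the matching create_job keyword argument."""
    if isinstance(source, (bytes, bytearray)):
        return {"file_bytes": bytes(source)}
    if isinstance(source, str) and urlparse(source).scheme in ("http", "https"):
        return {"file_url": source}
    # Anything else is treated as a local path (assumption).
    return {"file_path": str(source)}
```

The call then becomes `await client.create_job(**job_source_kwargs(source))`.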

Selecting a Model

from paddleocr_api import Model

await client.create_job(
    file_path="doc.pdf",
    model=Model.PADDLE_OCR_VL_1_6,  # default
)

| Model | Notes |
| --- | --- |
| `PaddleOCR-VL-1.6` | Default. Latest vision-language model. |
| `PaddleOCR-VL-1.5` | Scheduled for retirement on 2026-06-17. |
| `PaddleOCR-VL` | Base VL model. |
| `PP-OCRv5` | Classic OCR pipeline. |
| `PaddleOCR` | Base OCR. |
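Because one model has a scheduled retirement date, long-lived code may want a fallback. A minimal sketch using the model names from the table as plain strings. `choose_model` is a hypothetical helper, and treating the retirement date itself as already retired is an assumption:

```python
from datetime import date
from typing import Dict

DEFAULT_MODEL = "PaddleOCR-VL-1.6"
# Retirement date copied from the table above.
RETIREMENT_DATES: Dict[str, date] = {"PaddleOCR-VL-1.5": date(2026, 6, 17)}


def choose_model(preferred: str, today: date) -> str:
    """Return `preferred`, or the default model once `preferred` has retired."""
    retires = RETIREMENT_DATES.get(preferred)
    if retires is not None and today >= retires:
        return DEFAULT_MODEL
    return preferred
```

Mapping the returned name to a `Model` member is left out, since only `Model.PADDLE_OCR_VL_1_6` is named in this document.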

Optional Payload

Pass an OptionalPayload dict to fine-tune recognition behavior:

from paddleocr_api import LayoutShapeMode

await client.create_job(
    file_path="doc.pdf",
    optional_payload={
        "useLayoutDetection": True,
        "useChartRecognition": True,
        "useSealRecognition": True,
        "mergeTables": True,
        "relevelTitles": True,
        "layoutShapeMode": LayoutShapeMode.AUTO,
        "repetitionPenalty": 1.0,
        "temperature": 0.0,
        "topP": 1.0,
    },
)

Key options:

| Field | Default | Purpose |
| --- | --- | --- |
| `useDocOrientationClassify` | `False` | Auto-correct 0/90/180/270° rotation. |
| `useDocUnwarping` | `False` | Flatten warped or wrinkled pages. |
| `useLayoutDetection` | `True` | Region-aware parsing. Disable for single-region docs. |
| `useChartRecognition` | `False` | Convert charts to tables. |
| `useSealRecognition` | `True` | Extract seal text. |
| `useOcrForImageBlock` | `False` | OCR inside image regions. |
| `mergeTables` | `True` | Merge tables that span pages. |
| `relevelTitles` | `True` | Infer heading hierarchy. |
| `repetitionPenalty` | `1.0` | Raise to suppress repeated output. |
| `temperature` | `0.0` | Lower for stability, higher to reduce omissions. |
| `topP` | `1.0` | Lower for more conservative output. |
| `layoutNms` | `True` | Drop overlapping detection boxes. |
| `markdownIgnoreLabels` | all | Filter headers, footers, page numbers, footnotes, etc. |
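Since the payload is a plain dict, a mistyped value is easy to miss. A sketch of a builder that starts from the defaults in the table and type-checks overrides for the fields it knows about. `build_payload` and `DEFAULT_OPTIONS` are hypothetical names, and fields not listed in the table are passed through unchecked:

```python
from typing import Any, Dict

# Defaults copied from the table above. markdownIgnoreLabels is omitted
# because its default is given only as "all".
DEFAULT_OPTIONS: Dict[str, Any] = {
    "useDocOrientationClassify": False,
    "useDocUnwarping": False,
    "useLayoutDetection": True,
    "useChartRecognition": False,
    "useSealRecognition": True,
    "useOcrForImageBlock": False,
    "mergeTables": True,
    "relevelTitles": True,
    "repetitionPenalty": 1.0,
    "temperature": 0.0,
    "topP": 1.0,
    "layoutNms": True,
}


def build_payload(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides over the documented defaults, checking basic types."""
    payload = dict(DEFAULT_OPTIONS)
    for key, value in overrides.items():
        default = DEFAULT_OPTIONS.get(key)
        if isinstance(default, bool):
            if not isinstance(value, bool):
                raise TypeError(f"{key} expects a bool, got {type(value).__name__}")
        elif isinstance(default, float):
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"{key} expects a number, got {type(value).__name__}")
        payload[key] = value  # unknown keys pass through unchanged
    return payload
```

The result can be passed as `optional_payload=build_payload(useChartRecognition=True, temperature=0.2)`.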

Tracking a Job

async with job:
    print(await job.state)              # State.PENDING / RUNNING / DONE / FAILED
    print(await job.total_pages)        # e.g. 8
    print(await job.extracted_pages)    # e.g. 3
    print(await job.start_time)         # datetime
    print(await job.end_time)           # datetime
    print(await job.error_message)      # str or None

Status queries are cached for status_update_interval seconds (default 2) to avoid hammering the API.
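The caching behavior can be pictured as a small time-based cache in front of the status request. The sketch below is an illustration of that idea, not the SDK's implementation. `CachedStatus` is a hypothetical class and the fetch function is a stand-in for the real API call:

```python
import asyncio
import time
from typing import Awaitable, Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class CachedStatus(Generic[T]):
    """Re-fetch a value only when the cached copy is older than `interval` seconds."""

    def __init__(self, fetch: Callable[[], Awaitable[T]], interval: float = 2.0) -> None:
        self._fetch = fetch
        self._interval = interval
        self._value: Optional[T] = None
        self._fetched_at: Optional[float] = None

    async def get(self) -> T:
        now = time.monotonic()
        if self._fetched_at is None or now - self._fetched_at >= self._interval:
            self._value = await self._fetch()
            self._fetched_at = time.monotonic()
        return self._value  # type: ignore[return-value]


async def _cache_demo() -> int:
    calls = 0

    async def fetch() -> str:
        nonlocal calls
        calls += 1  # count how often the "API" is actually hit
        return "RUNNING"

    cached = CachedStatus(fetch, interval=2.0)
    for _ in range(5):
        await cached.get()  # five reads in quick succession
    return calls


api_calls = asyncio.run(_cache_demo())
```

Here the five reads trigger a single fetch, which is why polling faster than the interval does not produce fresher data.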

Working with Results

result = await job.result          # full Result object
markdown = await job.markdown      # Markdown(text=..., images=...)

# Save Markdown
with open("doc.md", "w", encoding="utf-8") as f:
    f.write(markdown.text)

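# Download embedded images
from pathlib import Path
import httpx
async with httpx.AsyncClient() as http:
    for rel_path, url in markdown.images.items():
        response = await http.get(url)
        response.raise_for_status()  # fail loudly on HTTP errors
        target = Path(rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(response.content)  # save under the relative path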

The Result object also exposes per-page layout details via layout_parsing_results, raw page sizes via data_info, and preprocessed image URLs via preprocessed_images.
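The image keys are relative paths that come from the service response, so it can be worth confining them to an output directory before writing. A minimal sketch, where `image_target` is a hypothetical helper and not part of the SDK:

```python
from pathlib import Path


def image_target(out_dir: Path, rel_path: str) -> Path:
    """Resolve rel_path under out_dir, rejecting paths that escape it."""
    root = out_dir.resolve()
    target = (root / rel_path).resolve()
    # After resolving ".." segments, the target must sit strictly below root.
    if root not in target.parents:
        raise ValueError(f"refusing to write outside {root}: {rel_path}")
    return target
```

Each image would then be written to `image_target(Path("output"), rel_path)` after creating its parent directory.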

Error Handling

All exceptions inherit from PaddleOCRError:

- AistudioClientError: client configuration issues (e.g. missing token).
- JobCreationError: failure when submitting a job.
- JobStatusQueryError: failure when polling status.

Use job.query_status_safe() instead of query_status() to get the cached state on failure rather than raising.
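For transient failures, another option is to retry with backoff and only then give up. The sketch below is generic and does not import the SDK. `retry_async` and `FlakyError` are hypothetical names, with `FlakyError` standing in for an exception such as JobStatusQueryError:

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_async(
    op: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[Exception], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Run op(), retrying on the given exception types with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await op()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


class FlakyError(Exception):
    """Stand-in for a transient failure."""


async def _retry_demo() -> Tuple[str, int]:
    calls = 0

    async def flaky() -> str:
        nonlocal calls
        calls += 1
        if calls < 3:
            raise FlakyError("temporary")
        return "DONE"

    result = await retry_async(flaky, retry_on=(FlakyError,), base_delay=0.01)
    return result, calls


outcome = asyncio.run(_retry_demo())
```

With the SDK, `retry_on` could be `(JobStatusQueryError,)` and `op` a wrapper around the status query, assuming that call is awaitable.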

License

Apache-2.0

This version: 0.0.2, released May 30, 2026