A Python async SDK that wraps the PaddleOCR AI Studio API into a clean, type-safe interface.
paddleocr-api-python
English
A Python async SDK that wraps the PaddleOCR AI Studio API into a clean, type-safe interface. Upload a document, await the result, and get Markdown back — without touching raw HTTP.
Features
- Async-first: built on httpx.AsyncClient and asyncio, with native context manager support.
- Full model coverage: PaddleOCR-VL-1.6 (default), PaddleOCR-VL-1.5, PaddleOCR-VL, PP-OCRv5, and PaddleOCR.
- Flexible input: submit by local file path, raw bytes, or remote URL.
- Rich job control: poll real-time state, extracted page count, start/end times, and error messages.
- Markdown export: get a clean Markdown document plus the URLs of all embedded images.
- Fine-grained options: toggle layout detection, chart/seal/table recognition, cross-page table merging, title leveling, NMS, image orientation correction, and more.
Installation
pip install paddleocr-api-python
Dependencies: aiofiles, httpx, typing-extensions, python-dotenv.
Authentication
Get an access token from https://aistudio.baidu.com/account/accessToken.
Either pass it explicitly:
client = AistudioClient(api_key="your_token_here")
Or set it via environment variable (a .env file is loaded automatically):
AISTUDIO_ACCESS_TOKEN=your_token_here
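The two ways of supplying the token can be pictured with a small sketch. resolve_token is a hypothetical helper written for illustration and is not part of the SDK. It assumes that an explicit api_key takes precedence over the AISTUDIO_ACCESS_TOKEN variable, which the text above does not state outright.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "AISTUDIO_ACCESS_TOKEN"  # variable name taken from the README


def resolve_token(api_key: Optional[str] = None,
                  env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: explicit argument first, then the environment.

    The precedence order is an assumption, not documented SDK behavior.
    """
    token = api_key or env.get(ENV_VAR)
    if not token:
        raise ValueError(f"no token: pass api_key or set {ENV_VAR}")
    return token
```

Failing early with a clear message, as this sketch does, tends to be easier to debug than a rejected request later on.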
Quick Start
import asyncio

from paddleocr_api import AistudioClient, State


async def main():
    async with AistudioClient() as client:
        job = await client.create_job(file_path="paper.pdf")
        async with job:
            # Poll every 5 seconds until the job finishes or fails.
            while True:
                state = await job.state
                if state == State.DONE:
                    break
                if state == State.FAILED:
                    raise RuntimeError(await job.error_message)
                await asyncio.sleep(5)
            markdown = await job.markdown
            with open("output.md", "w", encoding="utf-8") as f:
                f.write(markdown.text)


asyncio.run(main())
Submitting Jobs
create_job accepts three input modes:
# From a local path
await client.create_job(file_path="doc.pdf")
# From bytes already in memory
await client.create_job(file_bytes=pdf_bytes)
# From a public URL
await client.create_job(file_url="https://example.com/doc.pdf")
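When the kind of input is only known at runtime, a small dispatcher can choose the right keyword. The helper below is hypothetical and not part of the SDK. It uses only the three keyword names shown above, and it assumes that any string starting with http:// or https:// should be treated as a URL.

```python
from pathlib import Path
from typing import Dict, Union

Source = Union[str, bytes, bytearray, Path]


def input_kwargs(source: Source) -> Dict[str, object]:
    """Hypothetical helper: pick the create_job keyword for a given source."""
    if isinstance(source, (bytes, bytearray)):
        return {"file_bytes": bytes(source)}
    if isinstance(source, Path):
        return {"file_path": str(source)}
    # Assumption: http(s) strings are remote URLs, anything else is a path.
    if source.startswith(("http://", "https://")):
        return {"file_url": source}
    return {"file_path": source}
```

A call would then look like await client.create_job(**input_kwargs(src)).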
Selecting a Model
from paddleocr_api import Model

await client.create_job(
    file_path="doc.pdf",
    model=Model.PADDLE_OCR_VL_1_6,  # default
)
| Model | Notes |
|---|---|
| PaddleOCR-VL-1.6 | Default. Latest vision-language model. |
| PaddleOCR-VL-1.5 | Scheduled for retirement on 2026-06-17. |
| PaddleOCR-VL | Base VL model. |
| PP-OCRv5 | Classic OCR pipeline. |
| PaddleOCR | Base OCR. |
Optional Payload
Pass an OptionalPayload dict to fine-tune recognition behavior:
from paddleocr_api import LayoutShapeMode, PromptLabel

await client.create_job(
    file_path="doc.pdf",
    optional_payload={
        "useLayoutDetection": True,
        "useChartRecognition": True,
        "useSealRecognition": True,
        "mergeTables": True,
        "relevelTitles": True,
        "layoutShapeMode": LayoutShapeMode.AUTO,
        "repetitionPenalty": 1.0,
        "temperature": 0.0,
        "topP": 1.0,
    },
)
Key options:
| Field | Default | Purpose |
|---|---|---|
| useDocOrientationClassify | False | Auto-correct 0/90/180/270° rotation. |
| useDocUnwarping | False | Flatten warped or wrinkled pages. |
| useLayoutDetection | True | Region-aware parsing. Disable for single-region docs. |
| useChartRecognition | False | Convert charts to tables. |
| useSealRecognition | True | Extract seal text. |
| useOcrForImageBlock | False | OCR inside image regions. |
| mergeTables | True | Merge tables that span pages. |
| relevelTitles | True | Infer heading hierarchy. |
| repetitionPenalty | 1.0 | Raise to suppress repeated output. |
| temperature | 0.0 | Lower for stability, higher to reduce omissions. |
| topP | 1.0 | Lower for more conservative output. |
| layoutNms | True | Drop overlapping detection boxes. |
| markdownIgnoreLabels | all | Filter headers, footers, page numbers, footnotes, etc. |
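Because most options already have the defaults listed above, a payload only needs to name the fields being changed. The sketch below shows one way to build such a dict and to catch misspelled field names early. build_payload is a hypothetical helper. It also assumes that omitted fields fall back to the documented defaults, which has not been verified against the service.

```python
from typing import Any, Dict

# Defaults copied from the options table. markdownIgnoreLabels is left out
# because the README gives its default only as "all", not as a concrete value.
DEFAULTS: Dict[str, Any] = {
    "useDocOrientationClassify": False,
    "useDocUnwarping": False,
    "useLayoutDetection": True,
    "useChartRecognition": False,
    "useSealRecognition": True,
    "useOcrForImageBlock": False,
    "mergeTables": True,
    "relevelTitles": True,
    "repetitionPenalty": 1.0,
    "temperature": 0.0,
    "topP": 1.0,
    "layoutNms": True,
}


def build_payload(**overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: keep only options that differ from the defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {k: v for k, v in overrides.items() if v != DEFAULTS[k]}
```

For example, build_payload(useChartRecognition=True, mergeTables=True) yields a dict containing only useChartRecognition, since mergeTables is already True by default. The result could then be passed as optional_payload.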
Tracking a Job
async with job:
    print(await job.state)            # State.PENDING / RUNNING / DONE / FAILED
    print(await job.total_pages)      # e.g. 8
    print(await job.extracted_pages)  # e.g. 3
    print(await job.start_time)       # datetime
    print(await job.end_time)         # datetime
    print(await job.error_message)    # str or None
Status queries are cached for status_update_interval seconds (default 2) to avoid hammering the API.
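The effect of such a cache can be illustrated with a generic time-based wrapper. This is not the SDK's implementation. CachedQuery, its parameters, and the injectable clock are all hypothetical and only show the general idea of reusing a recent answer instead of issuing a new request.

```python
import asyncio
import time
from typing import Awaitable, Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class CachedQuery(Generic[T]):
    """Illustrative time-based cache around an async fetch function."""

    def __init__(self, fetch: Callable[[], Awaitable[T]], interval: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._interval = interval
        self._clock = clock
        self._value: Optional[T] = None
        self._fetched_at: Optional[float] = None

    async def get(self) -> T:
        now = self._clock()
        # Refetch on first use or once the cached value is older than interval.
        if self._fetched_at is None or now - self._fetched_at >= self._interval:
            self._value = await self._fetch()
            self._fetched_at = now
        return self._value  # type: ignore[return-value]


async def _demo() -> None:
    async def fetch() -> str:
        return "RUNNING"  # placeholder for a real status request

    status = CachedQuery(fetch, interval=2.0)
    first = await status.get()   # performs the fetch
    second = await status.get()  # served from the cache
    print(first, second)


if __name__ == "__main__":
    asyncio.run(_demo())
```

One consequence of this design is that polling faster than the interval gains nothing, so a sleep of 5 seconds as in the Quick Start sits comfortably above the default of 2.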
Working with Results
result = await job.result # full Result object
markdown = await job.markdown # Markdown(text=..., images=...)
# Save Markdown
with open("doc.md", "w", encoding="utf-8") as f:
f.write(markdown.text)
# Download embedded images
import httpx

async with httpx.AsyncClient() as http:
    for rel_path, url in markdown.images.items():
        data = (await http.get(url)).content
        # write `data` to `rel_path`
The Result object also exposes per-page layout details via layout_parsing_results, raw page sizes via data_info, and preprocessed image URLs via preprocessed_images.
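The download loop above leaves the final write step as a comment. A minimal sketch of that step follows, assuming the keys of markdown.images are relative paths (as the loop variable rel_path suggests) and that the image bytes have already been fetched. save_images is a hypothetical helper, and its path check is a precaution added here rather than anything the SDK is known to require.

```python
from pathlib import Path
from typing import Dict, List


def save_images(images: Dict[str, bytes], out_dir: str) -> List[Path]:
    """Write downloaded image bytes under out_dir, keyed by relative path."""
    root = Path(out_dir).resolve()
    written: List[Path] = []
    for rel_path, data in images.items():
        target = (root / rel_path).resolve()
        # Refuse paths that would escape out_dir (e.g. "../x.png").
        if root not in target.parents:
            raise ValueError(f"unsafe path: {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        written.append(target)
    return written
```

Saving the images next to the Markdown file, with the same relative paths, should keep the image links in the exported document working, provided the links are indeed relative.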
Error Handling
All exceptions inherit from PaddleOCRError:
- AistudioClientError: client configuration issues (e.g. missing token).
- JobCreationError: failure when submitting a job.
- JobStatusQueryError: failure when polling status.
Use job.query_status_safe() instead of query_status() to get the cached state on failure rather than raising.
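A shared base class lets callers handle the specific errors first and fall back to PaddleOCRError for anything else. The sketch below defines local stand-ins with the same names so that it runs without the SDK. The next_step policy and its return values are invented for illustration.

```python
class PaddleOCRError(Exception):
    """Stand-in base class, named after the SDK's documented base exception."""


class AistudioClientError(PaddleOCRError):
    """Client configuration issues, e.g. a missing token."""


class JobCreationError(PaddleOCRError):
    """Failure when submitting a job."""


class JobStatusQueryError(PaddleOCRError):
    """Failure when polling status."""


def next_step(exc: BaseException) -> str:
    """Illustrative policy: decide what a caller might do for each error type."""
    # Specific subclasses are checked before the shared base class.
    if isinstance(exc, AistudioClientError):
        return "fix-config"
    if isinstance(exc, JobCreationError):
        return "resubmit"
    if isinstance(exc, JobStatusQueryError):
        return "retry-poll"
    if isinstance(exc, PaddleOCRError):
        return "report"
    raise exc  # not an SDK error: let it propagate
```

In real code the same ordering applies to except clauses: list the subclasses first and PaddleOCRError last.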
License