File translation tool

DocuTranslate

A lightweight local file translation tool based on Large Language Models.

✅ Support Multiple Formats: Translates pdf, docx, xlsx, md, txt, json, epub, srt, ass, and more.
✅ Auto-Generate Glossary: Generates a glossary automatically to keep terminology consistent across the translation.
✅ PDF Table, Formula, Code Recognition: Uses the docling and minerU PDF parsing engines to recognize and translate the tables, formulas, and code that often appear in academic papers.
✅ JSON Translation: Supports specifying values to translate within JSON using paths (jsonpath-ng syntax).
✅ Word/Excel Format Preservation: Translates docx and xlsx files while preserving the original formatting (doc and xls are not currently supported).
✅ Multi-AI Platform Support: Supports most AI platforms, allowing for high-performance concurrent AI translation with custom prompts.
✅ Async Support: Designed for high-performance scenarios, providing full asynchronous support and interfaces for parallel multi-tasking.
✅ LAN & Multi-user Support: Supports simultaneous use by multiple users within a local area network (LAN).
✅ Interactive Web Interface: Provides an out-of-the-box Web UI and RESTful API for easy integration and usage.
✅ Compact, Portable Packages: Windows and Mac portable packages under 40MB (versions that do not use docling for local PDF parsing).

Note: PDFs are converted to Markdown before translation, so the original page layout is lost. Keep this in mind if you have strict layout requirements.

QQ Community Group: 1047781902

[Screenshots: UI Interface, Paper Translation, Novel Translation]

Integration Packages

For users who want to get started quickly, we provide integration packages on GitHub Releases. Simply download, unzip, and enter your AI platform API-Key to start using it.

DocuTranslate: Standard version. Uses minerU (online or locally deployed) for PDF parsing. Supports local minerU API calls. (Recommended)
DocuTranslate_full: Full version. Includes the built-in docling local PDF parsing engine. Choose this version if you need offline PDF parsing without minerU.

Installation

Using pip

# Basic installation
pip install docutranslate

# If you need to use docling for local PDF parsing
# (quotes keep shells such as zsh from expanding the brackets)
pip install "docutranslate[docling]"

Using uv

# Initialize environment
uv init

# Basic installation
uv add docutranslate

# Install docling extension (quoted so the shell does not expand the brackets)
uv add "docutranslate[docling]"

Using git

# Clone the repository, then install dependencies with uv
git clone https://github.com/xunbu/docutranslate.git

cd docutranslate

uv sync

Using docker

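# Run in the background, exposing the service on port 8010
docker run -d -p 8010:8010 xunbu/docutranslate:latest
# Or run interactively in the foreground
# docker run -it -p 8010:8010 xunbu/docutranslate:latest
# Or pin a specific version
# docker run -it -p 8010:8010 xunbu/docutranslate:v1.5.4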

Core Concept: Workflow

DocuTranslate is built around workflows: each workflow is a complete translation pipeline for a specific file type.

Basic flow:

1. Select a workflow based on the file type.
2. Configure the workflow (LLM, parsing engine, output format).
3. Execute the translation.
4. Save the results.

Start Web UI and API Service

For ease of use, DocuTranslate provides a fully functional Web Interface and RESTful API.

Start the Service:

# Start service, defaults to listening on port 8010
docutranslate -i

# Start on a specific port
docutranslate -i -p 8011

# Allow CORS requests
docutranslate -i --cors


# You can also specify the port via environment variable
export DOCUTRANSLATE_PORT=8011
docutranslate -i

Interactive Interface: After starting the service, please visit http://127.0.0.1:8010 (or your specified port) in your browser.
API Documentation: Full API documentation (Swagger UI) is located at http://127.0.0.1:8010/docs.
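
The port options above can be summarised in a few lines of Python. This is only a sketch: `resolve_port` and `service_urls` are hypothetical helpers, not part of DocuTranslate, and the precedence (the `-p` value first, then `DOCUTRANSLATE_PORT`, then 8010) is an assumption rather than documented behavior.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 8010  # default port stated in this README


def resolve_port(cli_port: Optional[int] = None,
                 env: Optional[Mapping[str, str]] = None) -> int:
    """Pick a port: explicit -p value, else DOCUTRANSLATE_PORT, else 8010.

    The precedence order is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    if cli_port is not None:
        return cli_port
    return int(env.get("DOCUTRANSLATE_PORT", DEFAULT_PORT))


def service_urls(port: int, host: str = "127.0.0.1") -> dict:
    """Build the Web UI and Swagger UI addresses for a given port."""
    base = f"http://{host}:{port}"
    return {"web_ui": base, "api_docs": f"{base}/docs"}
```

For example, `service_urls(resolve_port(8011))` gives the two addresses to open after running `docutranslate -i -p 8011`.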

Usage Examples

Using the Simple Client SDK (Recommended)

The easiest way to get started is using the Client class, which provides a simple and intuitive API for translation:

from docutranslate.sdk import Client

# Initialize the client with your AI platform settings
client = Client(
    api_key="YOUR_OPENAI_API_KEY",  # or any other AI platform API key
    base_url="https://api.openai.com/v1/",
    model_id="gpt-4o",
    to_lang="Chinese",
    concurrent=10,  # Number of concurrent requests
)

# Translate a single file (auto-detects file type)
result = client.translate("path/to/your/document.pdf")

# Save with default format (PDF -> html by default)
print(f"Translation complete! Saved to: {result.save()}")

# Or specify output format explicitly
# For PDF/markdown_based:
#   - "markdown": Markdown with embedded base64 images (default)
#   - "markdown_zip": Markdown with separate image files (ZIP archive)
#   - "html": HTML format
# For docx: "docx"
# For xlsx: "xlsx"
result.save(fmt="html")  # Save as HTML
result.save(fmt="markdown")  # Save as Markdown with embedded images
result.save(fmt="markdown_zip")  # Save as ZIP with separate images

# Save to custom location
result.save(output_dir="./my_translations", name="my_document.html")

# Export as base64 encoded string
base64_content = result.export(fmt="html")
print(f"Exported content length: {len(base64_content)}")

# You can also access the underlying workflow for advanced operations
# workflow = result.workflow

Client Features:

Auto-detection: Automatically detects file type and selects the appropriate workflow
Flexible Configuration: Override any default settings per translation call
Multiple Output Options: Save to disk or export as Base64 string
Async Support: Use translate_async() for concurrent translation tasks
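
Since `export()` returns a Base64 string, the caller has to decode it before writing or serving the document. A minimal sketch using only the standard library follows; `decode_exported` and `write_exported` are hypothetical helper names, and `sample` stands in for the value returned by `result.export(fmt="html")` so the snippet runs without an API key.

```python
import base64
from pathlib import Path


def decode_exported(base64_content: str) -> bytes:
    """Turn the Base64 string back into the raw document bytes."""
    return base64.b64decode(base64_content)


def write_exported(base64_content: str, path: str) -> Path:
    """Decode the Base64 string and write the bytes to `path`."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(decode_exported(base64_content))
    return target


# Stand-in for `result.export(fmt="html")`.
sample = base64.b64encode("<p>你好</p>".encode("utf-8")).decode("ascii")
html_text = decode_exported(sample).decode("utf-8")
```

Decoding to bytes first keeps the helper usable for binary outputs such as docx or xlsx, where decoding to text would not make sense.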

Client SDK Parameters

Parameter	Type	Default	Description
api_key	`str`	-	AI platform API key
base_url	`str`	-	AI platform base URL (e.g., `https://api.openai.com/v1/`)
model_id	`str`	-	Model ID to use for translation
to_lang	`str`	-	Target language (e.g., `"Chinese"`, `"English"`, `"Japanese"`)
concurrent	`int`	10	Number of concurrent LLM requests
convert_engine	`str`	`"mineru"`	PDF parsing engine: `"mineru"`, `"docling"`, `"mineru_deploy"`
mineru_deploy_base_url	`str`	-	Local minerU API address (when `convert_engine="mineru_deploy"`)
mineru_token	`str`	-	minerU API token (when using online minerU)
skip_translate	`bool`	`False`	Skip translation, only parse document
output_dir	`str`	`"./output"`	Default output directory for `save()`
chunk_size	`int`	3000	Text chunk size for LLM processing
temperature	`float`	0.3	LLM temperature parameter
timeout	`int`	60	Request timeout in seconds
retry	`int`	3	Number of retry attempts on failure
provider	`str`	`"auto"`	AI provider type (auto, openai, azure, etc.)
force_json	`bool`	`False`	Force JSON output mode
rpm	`int`	-	Requests per minute limit
tpm	`int`	-	Tokens per minute limit
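
To make the `retry` and `timeout` rows more concrete, here is a generic retry loop. It is not DocuTranslate's implementation: `call_with_retry` is a hypothetical helper, and it assumes `retry` means additional attempts after the first failure, which the table does not specify.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(request: Callable[[], T], retry: int = 3, delay: float = 0.0) -> T:
    """Call `request`; on an exception, try again up to `retry` more times."""
    for attempt in range(retry + 1):
        try:
            return request()
        except Exception:
            if attempt == retry:
                raise  # out of attempts: surface the last error
            time.sleep(delay)  # optional pause before the next attempt
    raise RuntimeError("unreachable when retry >= 0")
```

With `retry=3`, a request that keeps failing is attempted four times in total under this reading.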

Result Methods

Method	Parameters	Description
save()	`output_dir`, `name`, `fmt`	Save translation result to disk
export()	`fmt`	Export as Base64 encoded string
supported_formats	-	Get list of supported output formats
workflow	-	Access underlying workflow object

Async example, translating several files concurrently with translate_async():

import asyncio
from docutranslate.sdk import Client

async def translate_multiple():
    client = Client(
        api_key="YOUR_API_KEY",
        base_url="https://api.openai.com/v1/",
        model_id="gpt-4o",
        to_lang="Chinese",
    )

    # Translate multiple files concurrently
    files = ["doc1.pdf", "doc2.docx", "notes.txt"]
    results = await asyncio.gather(
        *[client.translate_async(f) for f in files]
    )

    for r in results:
        print(f"Saved: {r.save()}")

asyncio.run(translate_multiple())
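
Each translation already issues up to `concurrent` LLM requests, so translating many files at once can multiply the load on your AI platform. One way to cap the number of files in flight is an `asyncio.Semaphore`. In this sketch `translate_bounded` is a hypothetical helper and `fake_translate` stands in for `client.translate_async`, so it runs offline.

```python
import asyncio


async def translate_bounded(files, translate_one, max_files=2):
    """Run `translate_one(path)` for every file, at most `max_files` at a time."""
    semaphore = asyncio.Semaphore(max_files)

    async def worker(path):
        async with semaphore:  # wait here while max_files tasks are running
            return await translate_one(path)

    # gather() returns results in the same order as `files`
    return await asyncio.gather(*(worker(f) for f in files))


async def fake_translate(path):
    """Stand-in for `client.translate_async(path)`."""
    await asyncio.sleep(0.01)
    return f"translated:{path}"


results = asyncio.run(
    translate_bounded(["doc1.pdf", "doc2.docx", "notes.txt"], fake_translate)
)
```

In real use you would pass `client.translate_async` as `translate_one` and call `save()` on each returned result.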

Using Workflow API (For Advanced Control)

For more control, use the Workflow API directly. Each workflow follows the same pattern:

# Pattern:
# 1. Create TranslatorConfig (LLM settings)
# 2. Create WorkflowConfig (workflow settings)
# 3. Create Workflow instance
# 4. workflow.read_path(file)
# 5. await workflow.translate_async()
# 6. workflow.save_as_*(name=...) or export_to_*(...)

Available Workflows and Output Methods

Workflow	Inputs	save_as_*	export_to_*	Key Config Options
MarkdownBasedWorkflow	`.pdf`, `.docx`, `.md`, `.png`, `.jpg`	`html`, `markdown`, `markdown_zip`	`html`, `markdown`, `markdown_zip`	`convert_engine`, `translator_config`
TXTWorkflow	`.txt`	`txt`, `html`	`txt`, `html`	`translator_config`
JsonWorkflow	`.json`	`json`, `html`	`json`, `html`	`translator_config`, `json_paths`
DocxWorkflow	`.docx`	`docx`, `html`	`docx`, `html`	`translator_config`, `insert_mode`
XlsxWorkflow	`.xlsx`, `.csv`	`xlsx`, `html`	`xlsx`, `html`	`translator_config`, `insert_mode`
SrtWorkflow	`.srt`	`srt`, `html`	`srt`, `html`	`translator_config`
EpubWorkflow	`.epub`	`epub`, `html`	`epub`, `html`	`translator_config`, `insert_mode`
HtmlWorkflow	`.html`, `.htm`	`html`	`html`	`translator_config`, `insert_mode`
AssWorkflow	`.ass`	`ass`, `html`	`ass`, `html`	`translator_config`
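
The table above can be read as a lookup from file extension to workflow. The sketch below encodes it as a dictionary; `pick_workflow` is a hypothetical helper, and routing `.docx` to DocxWorkflow (it is also accepted by MarkdownBasedWorkflow) is an assumption of this sketch, not a statement about how the Client decides.

```python
from pathlib import Path

# Extension -> workflow name, taken from the table above.
WORKFLOW_BY_EXTENSION = {
    ".pdf": "MarkdownBasedWorkflow",
    ".md": "MarkdownBasedWorkflow",
    ".png": "MarkdownBasedWorkflow",
    ".jpg": "MarkdownBasedWorkflow",
    ".txt": "TXTWorkflow",
    ".json": "JsonWorkflow",
    ".docx": "DocxWorkflow",  # assumption: MarkdownBasedWorkflow also accepts .docx
    ".xlsx": "XlsxWorkflow",
    ".csv": "XlsxWorkflow",
    ".srt": "SrtWorkflow",
    ".epub": "EpubWorkflow",
    ".html": "HtmlWorkflow",
    ".htm": "HtmlWorkflow",
    ".ass": "AssWorkflow",
}


def pick_workflow(path: str) -> str:
    """Return the workflow name listed for the file's extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in WORKFLOW_BY_EXTENSION:
        raise ValueError(f"no workflow listed for {suffix!r}")
    return WORKFLOW_BY_EXTENSION[suffix]
```

Unsupported formats such as `.doc` raise a `ValueError` here, which mirrors the note that doc and xls files are not supported.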

Key Configuration Options

Common TranslatorConfig Options:

Option	Type	Default	Description
`base_url`	`str`	-	AI platform base URL
`api_key`	`str`	-	AI platform API key
`model_id`	`str`	-	Model ID
`to_lang`	`str`	-	Target language
`chunk_size`	`int`	3000	Text chunk size
`concurrent`	`int`	10	Concurrent requests
`temperature`	`float`	0.3	LLM temperature
`timeout`	`int`	60	Request timeout (seconds)
`retry`	`int`	3	Retry attempts
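
To give a feel for what `chunk_size` controls, here is one simple way to split text into pieces no larger than a limit. It is an illustration only: the README does not say how DocuTranslate splits text or whether the size is measured in characters or tokens, so `split_into_chunks` is a hypothetical helper that counts characters and breaks on blank lines.

```python
def split_into_chunks(text: str, chunk_size: int = 3000) -> list:
    """Greedily pack paragraphs into chunks of at most `chunk_size` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    current = ""
    for para in text.split("\n\n"):
        # Hard-split a paragraph that is longer than the limit on its own.
        pieces = [para[i:i + chunk_size] for i in range(0, len(para), chunk_size)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= chunk_size:
                current = candidate  # still fits: keep growing this chunk
            else:
                chunks.append(current)  # chunk is full: start a new one
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks mean more requests but less text per request; larger chunks give the model more context at the cost of longer responses.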

Format-Specific Options:

Option	Applicable Workflows	Description
`insert_mode`	Docx, Xlsx, Html, Epub	`"replace"` (default), `"append"`, `"prepend"`
`json_paths`	Json	JSONPath expressions (e.g., `["$.*", "$.name"]`)
`separator`	Docx, Xlsx, Html, Epub	Text separator for append/prepend modes
`convert_engine`	MarkdownBased	`"mineru"` (default), `"docling"`, `"mineru_deploy"`

Example 1: Translate a PDF File (Using `MarkdownBasedWorkflow`)

This is the most common use case. We will use the minerU engine to convert the PDF to Markdown, and then translate it using an LLM. This example uses asynchronous execution.

import asyncio
from docutranslate.workflow.md_based_workflow import MarkdownBasedWorkflow, MarkdownBasedWorkflowConfig
from docutranslate.converter.x2md.converter_mineru import ConverterMineruConfig
from docutranslate.translator.ai_translator.md_translator import MDTranslatorConfig
from docutranslate.exporter.md.md2html_exporter import MD2HTMLExporterConfig


async def main():
    # 1. Build Translator Configuration
    translator_config = MDTranslatorConfig(
        base_url="https://open.bigmodel.cn/api/paas/v4",  # AI Platform Base URL
        api_key="YOUR_ZHIPU_API_KEY",  # AI Platform API Key
        model_id="glm-4-air",  # Model ID
        to_lang="English",  # Target Language
        chunk_size=3000,  # Text chunk size
        concurrent=10,  # Concurrency level
        # glossary_generate_enable=True, # Enable auto-glossary generation
        # glossary_dict={"Jobs":"Steve Jobs"}, # Pass in a glossary dictionary
        # system_proxy_enable=True, # Enable system proxy
    )

    # 2. Build Converter Configuration (Using minerU)
    converter_config = ConverterMineruConfig(
        mineru_token="YOUR_MINERU_TOKEN",  # Your minerU Token
        formula_ocr=True  # Enable formula recognition
    )

    # 3. Build Main Workflow Configuration
    workflow_config = MarkdownBasedWorkflowConfig(
        convert_engine="mineru",  # Specify parsing engine
        converter_config=converter_config,  # Pass converter config
        translator_config=translator_config,  # Pass translator config
        html_exporter_config=MD2HTMLExporterConfig(cdn=True)  # HTML export config
    )

    # 4. Instantiate Workflow
    workflow = MarkdownBasedWorkflow(config=workflow_config)

    # 5. Read file and execute translation
    print("Starting to read and translate file...")
    workflow.read_path("path/to/your/document.pdf")
    await workflow.translate_async()
    # Or use synchronous method
    # workflow.translate()
    print("Translation complete!")

    # 6. Save results
    workflow.save_as_html(name="translated_document.html")
    workflow.save_as_markdown_zip(name="translated_document.zip")
    workflow.save_as_markdown(name="translated_document.md")  # Markdown with embedded images
    print("Files saved to ./output folder.")

    # Or get content strings directly
    html_content = workflow.export_to_html()
    markdown_content = workflow.export_to_markdown()
    # print(html_content)


if __name__ == "__main__":
    asyncio.run(main())

Other Workflows

All workflows follow the same pattern. Import the corresponding config and workflow, then configure:

# TXT: from docutranslate.workflow.txt_workflow import TXTWorkflow, TXTWorkflowConfig
# JSON: from docutranslate.workflow.json_workflow import JsonWorkflow, JsonWorkflowConfig
# DOCX: from docutranslate.workflow.docx_workflow import DocxWorkflow, DocxWorkflowConfig
# XLSX: from docutranslate.workflow.xlsx_workflow import XlsxWorkflow, XlsxWorkflowConfig
# EPUB: from docutranslate.workflow.epub_workflow import EpubWorkflow, EpubWorkflowConfig
# HTML: from docutranslate.workflow.html_workflow import HtmlWorkflow, HtmlWorkflowConfig
# SRT:  from docutranslate.workflow.srt_workflow import SrtWorkflow, SrtWorkflowConfig
# ASS:  from docutranslate.workflow.ass_workflow import AssWorkflow, AssWorkflowConfig

Key config options:

insert_mode: "replace", "append", or "prepend" (for docx/xlsx/html/epub)
json_paths: JSONPath expressions for JSON translation (e.g., ["$.*", "$.name"])
separator: Text separator for "append" / "prepend" modes
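
To illustrate what the two example paths in `json_paths` select, the following sketch handles only `$.key` and `$.*` on a flat dictionary. It is deliberately not jsonpath-ng (which DocuTranslate uses and which supports far more syntax); `select_values` is a hypothetical helper, and restricting the selection to string values is an assumption.

```python
def select_values(data: dict, json_paths: list) -> dict:
    """Return {key: value} for top-level string values matched by the paths.

    Supports only "$.key" and "$.*"; anything else raises ValueError.
    """
    selected = {}
    for path in json_paths:
        if not path.startswith("$.") or "." in path[2:] or "[" in path:
            raise ValueError(f"unsupported path in this sketch: {path}")
        key = path[2:]
        keys = list(data) if key == "*" else [key]
        for k in keys:
            if isinstance(data.get(k), str):  # only text is worth translating
                selected[k] = data[k]
    return selected
```

With `{"name": "Alice", "age": 30, "bio": "hi"}`, the path `$.name` selects one value and `$.*` selects both string values while skipping the number.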

Prerequisites and Detailed Configuration

1. Get Large Model API Key

Translation functionality relies on Large Language Models. You need to obtain a base_url, api_key, and model_id from the corresponding AI platform.

Recommended Models: Volcengine's doubao-seed-1-6-flash and doubao-seed-1-6 series; Zhipu's glm-4-flash; Alibaba Cloud's qwen-plus and qwen-flash; DeepSeek's deepseek-chat.

302.AI 👈 Register via this link to get $1 free credit.

Platform Name	Get API Key	Base URL
ollama		http://127.0.0.1:11434/v1
lm studio		http://127.0.0.1:1234/v1
302.AI	Click to Get	https://api.302.ai/v1
openrouter	Click to Get	https://openrouter.ai/api/v1
openai	Click to Get	https://api.openai.com/v1/
gemini	Click to Get	https://generativelanguage.googleapis.com/v1beta/openai/
deepseek	Click to Get	https://api.deepseek.com/v1
Zhipu AI	Click to Get	https://open.bigmodel.cn/api/paas/v4
Tencent Hunyuan	Click to Get	https://api.hunyuan.cloud.tencent.com/v1
Alibaba Bailian	Click to Get	https://dashscope.aliyuncs.com/compatible-mode/v1
Volcengine	Click to Get	https://ark.cn-beijing.volces.com/api/v3
SiliconFlow	Click to Get	https://api.siliconflow.cn/v1
DMXAPI	Click to Get	https://www.dmxapi.cn/v1
Juguang AI	Click to Get	https://ai.juguang.chat/v1

2. PDF Parsing Engine (Skip if you don't need to translate PDFs)

2.1. Get minerU Token (Online PDF Parsing, Free, Recommended)

If you choose mineru as the document parsing engine (convert_engine="mineru"), you need to apply for a free Token.

Visit minerU Website to register and apply for the API.
Create a new API Token in the API Token Management Interface.

Note: The minerU Token is valid for 14 days. Please recreate it after expiration.

2.2. docling Engine Configuration (Local PDF Parsing)

If you choose docling as the document parsing engine (convert_engine="docling"), it will download the required models from Hugging Face upon first use.

A better option is to download docling_artifact.zip from GitHub Releases and unzip it into your working directory.

Solutions for docling Model Download Network Issues:

Set Hugging Face Mirror (Recommended):

Method A (Environment Variable): Set the system environment variable HF_ENDPOINT and restart your IDE or terminal.
```
HF_ENDPOINT=https://hf-mirror.com
```
Method B (In Code): Add the following code at the beginning of your Python script.

import os

os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'

Offline Use (Pre-download Model Package):

Download docling_artifact.zip from GitHub Releases.
Unzip it into your project directory.
Specify the model path in the configuration (if the model is not in the same directory as the script):

from docutranslate.converter.x2md.converter_docling import ConverterDoclingConfig

converter_config = ConverterDoclingConfig(
    artifact="./docling_artifact",  # Point to the unzipped folder
    code_ocr=True,
    formula_ocr=True
)

2.3. Locally Deployed minerU Service

For offline/intranet environments, deploy minerU locally with API enabled. Set mineru_deploy_base_url to your minerU API endpoint.

Client SDK:

from docutranslate.sdk import Client

client = Client(
    api_key="YOUR_LLM_API_KEY",
    base_url="http://127.0.0.1:11434/v1",  # Local LLM endpoint (Ollama address from the platform table); adjust as needed
    model_id="llama3",
    to_lang="Chinese",
    convert_engine="mineru_deploy",
    mineru_deploy_base_url="http://127.0.0.1:8000",  # Your minerU API address
)
result = client.translate("document.pdf")
result.save(fmt="markdown")

FAQ

Q: Why is the output still in the original language? A: Check the logs for errors. This is usually caused by exhausted API credits or network issues.

Q: What if port 8010 is already in use? A: Start on another port with docutranslate -i -p 8011, or set DOCUTRANSLATE_PORT=8011.

Q: Are scanned PDFs supported? A: Yes. Use the mineru engine, which has OCR capabilities.

Q: Why is the first PDF translation slow? A: docling downloads its models on first run. Use a Hugging Face mirror or pre-download the artifact package.

Q: Can it be used on an intranet or offline? A: Yes. Use a local LLM (Ollama or LM Studio) together with a locally deployed minerU or docling.

Q: How does the PDF cache work? A: MarkdownBasedWorkflow caches parsing results in memory (the last 10 parses). Configure the number via DOCUTRANSLATE_CACHE_NUM.

Q: How do I enable a proxy? A: Set system_proxy_enable=True in TranslatorConfig.
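
The caching answer above describes a small bounded in-memory cache. A sketch of that idea follows; it is not the library's code. `ParseCache` is a hypothetical class, the least-recently-used eviction policy is an assumption (the README only says the last 10 parses are kept), and how real parse results are keyed is not documented here.

```python
import os
from collections import OrderedDict


class ParseCache:
    """Keep the most recently used parse results, up to `capacity` entries."""

    def __init__(self, capacity=None):
        if capacity is None:
            # Same environment variable as in the FAQ; 10 is the stated default.
            capacity = int(os.environ.get("DOCUTRANSLATE_CACHE_NUM", "10"))
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Store a value, evicting the least recently used entries if full."""
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)
```

Because the cache lives in memory, it is emptied whenever the process restarts, so the first parse after a restart always goes back to the parsing engine.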

Donation Support

Donations to support the author are welcome. Please state the reason for your donation in the comment field.

docutranslate 1.6.0
