Latex PDF Translator

PDF2ZH

PDFMathTranslate

PDF scientific paper translation and bilingual comparison.

Feel free to provide feedback via GitHub Issues or the Telegram Group.

Demo 🌟

You can try our demo on HuggingFace without installation.
Note that the computing resources of the demo are limited, so please avoid abusing them.

Installation and Usage

We provide three methods for using this project: command line, GUI, and Docker.

Method I. Command line

  1. Install Python (3.8 <= version <= 3.12)

  2. Install our package

    pip install pdf2zh
    
  3. Use:

    pdf2zh document.pdf
    

Method II. GUI

  1. Install Python (3.8 <= version <= 3.12)

  2. Install our package

    pip install pdf2zh
    
  3. Start the GUI in your browser:

    pdf2zh -i
    
  4. If your browser does not open automatically, go to

    http://localhost:7860/
    

See documentation for GUI for more details.
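If the page does not load, a quick reachability check can help tell whether the GUI server is running at all. The sketch below is only an illustration using Python's standard library; the helper names `gui_url` and `gui_is_up` are hypothetical and not part of pdf2zh, and the port is the 7860 shown above.

```python
import urllib.error
import urllib.request


def gui_url(host: str = "localhost", port: int = 7860) -> str:
    """Build the address the GUI is expected to listen on."""
    return f"http://{host}:{port}/"


def gui_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


if __name__ == "__main__":
    url = gui_url()
    print(url, "is reachable" if gui_is_up(url) else "is not reachable")
```

A negative result usually means `pdf2zh -i` is not running, or that it is listening on a different host or port.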

Method III. Docker

  1. Pull and run:

    docker pull byaidu/pdf2zh
    docker run -p 7860:7860 byaidu/pdf2zh
    
  2. Open in browser:

    http://localhost:7860/
    

For Docker deployment on a cloud service, see the deploy links for Koyeb and Zeabur.

Advanced Options

Running the translation command generates the translated document example-zh.pdf and the bilingual document example-dual.pdf in the current directory. Google is used as the default translation service.
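For scripts that post-process the results, the two output names can be derived from the input name. This is a minimal sketch that assumes the `-zh` / `-dual` suffix pattern of the example above applies to any input file; `expected_outputs` is a hypothetical helper, not part of pdf2zh.

```python
from pathlib import Path
from typing import Tuple


def expected_outputs(pdf_path: str) -> Tuple[Path, Path]:
    """Return (translated, bilingual) paths in the current directory.

    Assumption: the suffixes follow the example-zh.pdf / example-dual.pdf
    pattern described above.
    """
    stem = Path(pdf_path).stem
    return Path(f"{stem}-zh.pdf"), Path(f"{stem}-dual.pdf")


if __name__ == "__main__":
    translated, bilingual = expected_outputs("papers/example.pdf")
    print(translated)
    print(bilingual)
```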

In the following table, we list all advanced options for reference:

Option    Function                        Example
-i        Enter GUI                       pdf2zh -i
-p        Partial document translation    pdf2zh example.pdf -p 1
-li       Source language                 pdf2zh example.pdf -li en
-lo       Target language                 pdf2zh example.pdf -lo zh
-s        Translation service             pdf2zh example.pdf -s deepl
-t        Multi-threads                   pdf2zh example.pdf -t 1
-f, -c    Exceptions                      pdf2zh example.pdf -f "(MS.*)"
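
When pdf2zh is driven from a script, the options in the table can be assembled into an argument list. The sketch below uses only the flags listed above; `build_command` and its parameter names are hypothetical and chosen for illustration.

```python
from typing import List, Optional


def build_command(
    pdf: str,
    pages: Optional[str] = None,     # -p, e.g. "1-3,5"
    lang_in: Optional[str] = None,   # -li, e.g. "en"
    lang_out: Optional[str] = None,  # -lo, e.g. "zh"
    service: Optional[str] = None,   # -s, e.g. "deepl"
    threads: Optional[int] = None,   # -t, e.g. 1
) -> List[str]:
    """Assemble a pdf2zh argument list from the documented options."""
    cmd = ["pdf2zh", pdf]
    for flag, value in (
        ("-p", pages),
        ("-li", lang_in),
        ("-lo", lang_out),
        ("-s", service),
        ("-t", threads),
    ):
        # Options left as None are omitted, so pdf2zh falls back to its defaults.
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("example.pdf", pages="1", lang_in="en", lang_out="zh")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, assuming the `pdf2zh` executable is on PATH.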

Some services require environment variables to be set. Please refer to ChatGPT for how to set environment variables.

Full / partial document translation

  • Entire document

    pdf2zh example.pdf
    
  • Part of the document

    pdf2zh example.pdf -p 1-3,5
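
To make the page specification concrete: `1-3,5` reads as pages 1 through 3 plus page 5. The sketch below expands such a string in Python. It illustrates the notation only; `expand_pages` is a hypothetical helper, and pdf2zh's own parsing may differ in details such as validation.

```python
from typing import List


def expand_pages(spec: str) -> List[int]:
    """Expand a spec like "1-3,5" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(first, last + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


if __name__ == "__main__":
    print(expand_pages("1-3,5"))  # [1, 2, 3, 5]
```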
    

Specify source and target languages

See Google Language Codes and DeepL Language Codes.

pdf2zh example.pdf -li en -lo ja

Translate with Different Services

  • DeepL

    See DeepL

    Set ENVs to construct an endpoint like: {DEEPL_SERVER_URL}/translate

    • DEEPL_SERVER_URL (Optional), e.g., export DEEPL_SERVER_URL=https://api.deepl.com
    • DEEPL_AUTH_KEY, e.g., export DEEPL_AUTH_KEY=xxx
    pdf2zh example.pdf -s deepl
    
  • DeepLX

    See DeepLX

    Set ENVs to construct an endpoint like: {DEEPLX_SERVER_URL}/translate

    • DEEPLX_SERVER_URL (Optional), e.g., export DEEPLX_SERVER_URL=https://api.deeplx.org
    • DEEPLX_AUTH_KEY, e.g., export DEEPLX_AUTH_KEY=xxx
    pdf2zh example.pdf -s deeplx
    
  • Ollama

    See Ollama

    Set ENVs to construct an endpoint like: {OLLAMA_HOST}/api/chat

    • OLLAMA_HOST (Optional), e.g., export OLLAMA_HOST=http://localhost:11434
    pdf2zh example.pdf -s ollama:gemma2
    
  • LLMs with OpenAI-compatible schemas (OpenAI / SiliconCloud / Zhipu)

    See SiliconCloud, Zhipu

    Set ENVs to construct an endpoint like: {OPENAI_BASE_URL}/chat/completions

    • OPENAI_BASE_URL (Optional), e.g., export OPENAI_BASE_URL=https://api.openai.com/v1
    • OPENAI_API_KEY, e.g., export OPENAI_API_KEY=xxx
    pdf2zh example.pdf -s openai:gpt-4o
    
  • Azure

    See Azure Text Translation

    The following ENVs are required:

    • AZURE_APIKEY, e.g., export AZURE_APIKEY=xxx
    • AZURE_ENDPOINT, e.g., export AZURE_ENDPOINT=https://api.translator.azure.cn/
    • AZURE_REGION, e.g., export AZURE_REGION=chinaeast2
    pdf2zh example.pdf -s azure
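
The endpoint patterns above share one shape: a base URL taken from an environment variable, joined with a fixed path. The sketch below mirrors that in Python and also reports which non-optional variables are still unset. The `ENDPOINTS` and `REQUIRED` tables, the helper names, and the use of example URLs as fallbacks when an optional variable is unset are assumptions made for illustration; pdf2zh's actual defaults are not stated here.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

# service -> (base-URL variable, illustrative fallback, fixed path)
ENDPOINTS: Dict[str, Tuple[str, str, str]] = {
    "deepl": ("DEEPL_SERVER_URL", "https://api.deepl.com", "/translate"),
    "deeplx": ("DEEPLX_SERVER_URL", "https://api.deeplx.org", "/translate"),
    # A local Ollama server listens on plain HTTP by default.
    "ollama": ("OLLAMA_HOST", "http://localhost:11434", "/api/chat"),
    "openai": ("OPENAI_BASE_URL", "https://api.openai.com/v1", "/chat/completions"),
}

# service -> variables listed above without the "(Optional)" marker
REQUIRED: Dict[str, List[str]] = {
    "deepl": ["DEEPL_AUTH_KEY"],
    "deeplx": ["DEEPLX_AUTH_KEY"],
    "ollama": [],
    "openai": ["OPENAI_API_KEY"],
    "azure": ["AZURE_APIKEY", "AZURE_ENDPOINT", "AZURE_REGION"],
}


def endpoint(service: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Join the configured (or fallback) base URL with the service path."""
    env = os.environ if env is None else env
    var, fallback, path = ENDPOINTS[service]
    base = env.get(var) or fallback
    return base.rstrip("/") + path


def missing_vars(service: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED[service] if not env.get(name)]


if __name__ == "__main__":
    for name in ENDPOINTS:
        print(name, endpoint(name), "missing:", missing_vars(name))
```

Checking `missing_vars` before launching a translation gives a clearer error than a failed request halfway through a document.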
    

Translate with exceptions

Use regex to specify formula fonts and characters that need to be preserved.

pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
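
To see what these two patterns select, they can be tried against a few sample inputs with Python's `re` module. This sketch assumes the patterns are applied with `re.match` (anchored at the start of the font name or character), which may not be exactly how pdf2zh applies them; the sample font names and helper names are illustrative.

```python
import re

# The two patterns from the command above, unchanged.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|.*Ital)")
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def is_formula_font(font_name: str) -> bool:
    """True if the font name matches the -f pattern."""
    return FONT_PATTERN.match(font_name) is not None


def is_formula_char(char: str) -> bool:
    """True if the character matches the -c pattern."""
    return CHAR_PATTERN.match(char) is not None


if __name__ == "__main__":
    for font in ["CMMI10", "CMR10", "MSBM10", "Helvetica"]:
        print(font, is_formula_font(font))
    for char in ["(", "=", "7", "α", "a"]:
        print(repr(char), is_formula_char(char))
```

With these samples, `CMMI10` (Computer Modern math italic) and `MSBM10` match, while `CMR10` (Computer Modern roman) is excluded by `[^RT]` and `Helvetica` matches nothing. Among the characters, everything except the plain letter `a` matches.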

Specify threads

Use -t to specify how many threads to use in translation:

pdf2zh example.pdf -t 1
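
As a rough picture of what a thread count controls, the sketch below translates paragraphs concurrently with a bounded pool of worker threads. It is a generic illustration, not pdf2zh's implementation: `fake_translate` is a stand-in for a call to a translation service, and `translate_all` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def fake_translate(text: str) -> str:
    """Stand-in for a network call to a translation service."""
    return f"[translated] {text}"


def translate_all(
    paragraphs: List[str],
    translate: Callable[[str], str] = fake_translate,
    threads: int = 1,
) -> List[str]:
    """Translate paragraphs with at most `threads` calls in flight."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields results in input order, whatever order they finish in.
        return list(pool.map(translate, paragraphs))


if __name__ == "__main__":
    print(translate_all(["First paragraph.", "Second paragraph."], threads=2))
```

More threads mainly help when waiting on the service dominates the run time; a service with strict rate limits may call for a lower value.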

Download files

Source Distribution

pdf2zh-1.7.8.tar.gz (153.4 kB)

Uploaded Source

Built Distribution

pdf2zh-1.7.8-py3-none-any.whl (159.7 kB)

Uploaded Python 3

File details

Details for the file pdf2zh-1.7.8.tar.gz.

File metadata

  • Download URL: pdf2zh-1.7.8.tar.gz
  • Upload date:
  • Size: 153.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for pdf2zh-1.7.8.tar.gz
Algorithm Hash digest
SHA256 40d62c93c0379dfb2b381deecbbde8e9ea73f7012462ccfefe39551c80696fa8
MD5 3ebfc1abe663b6ea7f2df0caf585de48
BLAKE2b-256 bf1e148d2eb134934b96fa839885a70a98fd13c6ed703f51bd40695d9716df33

File details

Details for the file pdf2zh-1.7.8-py3-none-any.whl.

File metadata

  • Download URL: pdf2zh-1.7.8-py3-none-any.whl
  • Upload date:
  • Size: 159.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for pdf2zh-1.7.8-py3-none-any.whl
Algorithm Hash digest
SHA256 7e43d1b8e3af439ef88e1f5e3ac70ae4c53cf3cfdcad53c4f99cda932753e0f2
MD5 88a7168510a0051544c3e2bf4f9429eb
BLAKE2b-256 79b1f4449097383b063581895c35f9dca70cfc4fb72d17cc8b1b27a4c3cda50d
