
Translate math-heavy papers

Project description

MathTranslate


This project translates scientific papers with heavy math notation from any language to any language while keeping the math symbols unchanged. Most translation software does not preserve equations, which makes the output frustrating to read. This project is based on the following two tools:

  1. Mathpix: it provides an interface that converts images containing text and equations into LaTeX code. Unfortunately, it is not entirely free. Pricing is listed at https://mathpix.com/pricing. In future development, we will try our best to reduce the number of requests to save you money. (This project itself is 100% free and open source!)
  2. Google Translate

The main work of this project is translating LaTeX files by running their plain text through Google Translate. Combined with Mathpix, this makes it possible to translate a PDF (or another format) into a translated PDF.
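The idea of translating the plain text while leaving the math untouched can be illustrated with a placeholder approach. The following is a minimal sketch, not the code this project actually uses: the names mask_math, unmask_math, and translate_keeping_math and the XMATH placeholder format are all hypothetical, the pattern only covers $...$, $$...$$, \(...\) and \[...\], and the translate argument stands in for whatever translation engine is called.

```python
import re

# Simplified pattern for display math ($$...$$, \[...\]) and inline math
# ($...$, \(...\)). Assumption: no escaped dollar signs and no math
# environments such as \begin{equation}; real LaTeX needs more care.
MATH_PATTERN = re.compile(
    r"\$\$.+?\$\$|\\\[.+?\\\]|\$.+?\$|\\\(.+?\\\)",
    re.DOTALL,
)


def mask_math(text):
    """Replace each math span with a placeholder; return masked text and the spans."""
    formulas = []

    def stash(match):
        formulas.append(match.group(0))
        # Hypothetical placeholder format, chosen to be unlikely to be altered
        # by a translator.
        return f"XMATH{len(formulas) - 1}X"

    return MATH_PATTERN.sub(stash, text), formulas


def unmask_math(text, formulas):
    """Put the original math spans back in place of their placeholders."""
    return re.sub(r"XMATH(\d+)X", lambda m: formulas[int(m.group(1))], text)


def translate_keeping_math(text, translate):
    """Translate only the non-math text; `translate` is any str -> str callable."""
    masked, formulas = mask_math(text)
    return unmask_math(translate(masked), formulas)
```

Here translate could be a thin wrapper around any translation service. The sketch assumes the service returns the placeholders unchanged, which would need to be checked for a real engine.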

Here is an example of the final output.

Although this is currently a small project, it has received much more attention than we expected. We are planning further development to improve the user experience.

Releases

Mar 22, 2023

Fixed several major bugs.

Mar 21, 2023

Added a Tencent translation option for users with an IP address in mainland China.

Mar 16, 2023

All operating systems are now supported! You can install with pip install --upgrade mathtranslate.

Requirements

  1. A Mathpix account. Unfortunately, it is not entirely free. The current price is free for 100 screenshots (registration requires an educational email address) and $5 per month for 5000 screenshots.
  2. Python 3 and pip.
  3. TeX Live (or any other tool that generates PDF from TeX). For Chinese, you need the CJK package.
  4. (For users with an IP address in mainland China) A Tencent translation API account. After registering, you can get a secret ID and secret key from the Tencent console. To our knowledge, Tencent Translate has the highest free quota of any translation API besides Google Translate: 5 million characters per month. No fee is charged unless you manually add funds, so there is no need to worry about accidental charges.

Installation

pip install --upgrade mathtranslate

Usage

  1. Download Mathpix.
  2. (For Tencent translation API users) Run translate_tex --setkey to store the API ID and key.
  3. Use Mathpix to take a screenshot of what you want to translate, copy the LaTeX output, and save it in a txt file. Mathpix currently recognizes continuous text (one or more paragraphs). You can also take several screenshots of separate pieces of text and put them in the same txt file. Paragraphs that were split by pictures or page breaks are identified and merged automatically in the next step.
  4. Assume the file you saved in the previous step is named main.txt. Run translate_tex main.txt. You will get a translated tex file, main.tex, and, if xelatex is installed on your machine, a corresponding pdf file, main.pdf.
  5. Since this project is small, you may sometimes need to make minor edits to the final tex file before it compiles.
  6. You can change the default translation languages and engine with the command line arguments -engine, -from, and -to. For example: translate_tex -engine tencent main.txt. You can also change the settings permanently with translate_tex --setdefault. For more details, run translate_tex --help.
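Step 3 mentions that paragraphs split by pictures or page breaks are merged automatically. How the project decides this is not described here, so the following is only a sketch of one plausible heuristic: a fragment that does not end with sentence-final punctuation is assumed to continue in the next fragment. The function name merge_fragments and the punctuation list are hypothetical.

```python
# Assumption: a fragment ending with one of these characters closes a paragraph.
SENTENCE_END = (".", "!", "?", ":", "。", "！", "？")


def merge_fragments(chunks):
    """Glue each fragment that stops mid-sentence onto the fragment that follows it."""
    paragraphs = []
    carry = ""
    for chunk in chunks:
        chunk = chunk.strip()
        if not chunk:
            continue  # ignore empty fragments
        carry = f"{carry} {chunk}" if carry else chunk
        if carry.endswith(SENTENCE_END):
            paragraphs.append(carry)
            carry = ""
    if carry:
        # Keep trailing text even if it never reaches sentence-final punctuation.
        paragraphs.append(carry)
    return paragraphs
```

A heuristic like this fails on paragraphs that legitimately end without punctuation (for example, a displayed equation), which is one reason the generated tex file may need manual touch-ups.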

Examples

In the example directory, main.txt is the Mathpix output for part of paper.pdf. Run translate_tex main.txt to get main.tex and main.pdf. translated.png shows what you should expect to see in main.pdf.
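If you want to script this run, for instance to process several txt files, the command can be called from Python. The sketch below only uses the translate_tex command and the -engine, -from, and -to arguments named in the Usage section. The helper names build_command and run_translation are hypothetical, and the accepted values for each argument are an assumption to check against translate_tex --help.

```python
import shutil
import subprocess


def build_command(txt_file, engine=None, source_lang=None, target_lang=None):
    """Assemble the translate_tex command line as a list of arguments."""
    command = ["translate_tex"]
    if engine:
        command += ["-engine", engine]
    if source_lang:
        command += ["-from", source_lang]
    if target_lang:
        command += ["-to", target_lang]
    command.append(txt_file)
    return command


def run_translation(txt_file, **options):
    """Run translate_tex on txt_file; raises CalledProcessError if the command fails."""
    if shutil.which("translate_tex") is None:
        raise RuntimeError(
            "translate_tex not found; install it with: pip install --upgrade mathtranslate"
        )
    return subprocess.run(build_command(txt_file, **options), check=True)
```

Calling run_translation("main.txt", engine="tencent") would run the same command as translate_tex -engine tencent main.txt.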

Further developments

  1. Automatically extract images from a pdf, process the images in a batch, and output a single translated pdf with one click!
  2. Reduce the number of mathpix requests by open-source techniques.
  3. A more user-friendly interface.

If you have any questions or are interested in contributing, please contact me at susyustc@gmail.com or join QQ group 288646946.
