
An Open-Source Python3 tool for recognizing layouts, tables, math formulas, and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations.

Project description


Pix2Text

Update 2024.11.17: V1.1.2 Released

Major changes:

  • A new layout analysis model DocLayout-YOLO has been integrated, improving the accuracy of layout analysis.

Update 2024.06.18: V1.1.1 Released

Major changes:

  • Added support for the new mathematical formula detection (MFD) model breezedeus/pix2text-mfd (Mirror), which significantly improves the accuracy of formula detection.

See details: Pix2Text V1.1.1 Released, Bringing Better Mathematical Formula Detection Models | Breezedeus.com.

Update 2024.04.28: V1.1 Released

Major changes:

Update 2024.02.26: V1.0 Released

Major changes:

See more at: RELEASE.md.


Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it already covers Mathpix's core functionality. P2T can recognize layouts, tables, images, text, and mathematical formulas, and combine the results into a single Markdown document. It can also convert an entire PDF file (whether it contains scanned images or any other content) into Markdown.
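As a rough illustration of the image-or-PDF-to-Markdown workflow described above, the sketch below picks a recognizer based on the file extension. It assumes the `Pix2Text.from_config()`, `recognize_pdf()`, `recognize_page()`, and `to_markdown()` calls as recalled from the V1.1 online documentation, so the signatures should be checked against the installed version. The helper names `pick_mode` and `to_markdown_dir`, the default output directory, and the set of image extensions are hypothetical and exist only for this example.

```python
from pathlib import Path

# Extensions treated as page images; this set is an assumption of the sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def pick_mode(path):
    """Return 'pdf' or 'page' depending on the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_SUFFIXES:
        return "page"
    raise ValueError(f"unsupported file type: {suffix!r}")


def to_markdown_dir(path, out_dir="output-md"):
    """Convert a PDF or a page image to Markdown files under out_dir."""
    mode = pick_mode(path)
    # Imported lazily so that pick_mode stays usable without pix2text installed.
    from pix2text import Pix2Text

    p2t = Pix2Text.from_config()  # default models; may download them on first use
    if mode == "pdf":
        doc = p2t.recognize_pdf(path)   # assumed to return a document object
    else:
        doc = p2t.recognize_page(path)  # assumed to return a page object
    doc.to_markdown(out_dir)
    return out_dir
```

Calling `to_markdown_dir("paper.pdf")` would then write the Markdown output under `output-md`, provided the calls above match the installed release.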

Pix2Text (P2T) integrates the following models:

Several models are contributed by other open-source authors, and their contributions are highly appreciated.

Pix2Text Arch Flow

For detailed explanations, please refer to the Pix2Text Online Documentation/Models.


As a Python3 toolkit, P2T may not be very user-friendly for those who are unfamiliar with Python. We therefore also provide a free-to-use P2T Online Web, where you can upload images directly and get P2T parsing results. The web version uses the latest models, which perform better than the open-source models.

If you're interested, feel free to add the assistant as a friend by scanning the QR code and mentioning p2t. The assistant will regularly invite everyone to join the group where the latest developments related to P2T tools will be announced:

Wechat-QRCode

The author also maintains a Knowledge Planet P2T/CnOCR/CnSTD Private Group, where questions are answered promptly. You're welcome to join. The knowledge planet private group will also gradually release some private materials related to P2T/CnOCR/CnSTD, including some unreleased models, discounts on purchasing premium models, code snippets for different application scenarios, and answers to difficult problems encountered during use. The planet will also publish the latest research materials related to P2T/OCR/STD.

For more contact methods, please refer to Contact.

List of Supported Languages

The text recognition engine of Pix2Text supports 80+ languages, including English, Simplified Chinese, Traditional Chinese, Vietnamese, etc. Among these, English and Simplified Chinese recognition utilize the open-source OCR tool CnOCR, while recognition for other languages employs the open-source OCR tool EasyOCR. Special thanks to the respective authors.

The supported languages and their language codes are listed below:

Language              Code Name
Abaza                 abq
Adyghe                ady
Afrikaans             af
Angika                ang
Arabic                ar
Assamese              as
Avar                  ava
Azerbaijani           az
Belarusian            be
Bulgarian             bg
Bihari                bh
Bhojpuri              bho
Bengali               bn
Bosnian               bs
Simplified Chinese    ch_sim
Traditional Chinese   ch_tra
Chechen               che
Czech                 cs
Welsh                 cy
Danish                da
Dargwa                dar
German                de
English               en
Spanish               es
Estonian              et
Persian (Farsi)       fa
French                fr
Irish                 ga
Goan Konkani          gom
Hindi                 hi
Croatian              hr
Hungarian             hu
Indonesian            id
Ingush                inh
Icelandic             is
Italian               it
Japanese              ja
Kabardian             kbd
Kannada               kn
Korean                ko
Kurdish               ku
Latin                 la
Lak                   lbe
Lezghian              lez
Lithuanian            lt
Latvian               lv
Magahi                mah
Maithili              mai
Maori                 mi
Mongolian             mn
Marathi               mr
Malay                 ms
Maltese               mt
Nepali                ne
Newari                new
Dutch                 nl
Norwegian             no
Occitan               oc
Pali                  pi
Polish                pl
Portuguese            pt
Romanian              ro
Russian               ru
Serbian (Cyrillic)    rs_cyrillic
Serbian (Latin)       rs_latin
Nagpuri               sck
Slovak                sk
Slovenian             sl
Albanian              sq
Swedish               sv
Swahili               sw
Tamil                 ta
Tabassaran            tab
Telugu                te
Thai                  th
Tajik                 tjk
Tagalog               tl
Turkish               tr
Uyghur                ug
Ukrainian             uk
Urdu                  ur
Uzbek                 uz
Vietnamese            vi

Ref: Supported Languages.
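The split between the two OCR engines described in this section can be sketched as a small routing rule. The snippet below is a minimal illustration, assuming that a request uses CnOCR only when every requested language is English or Simplified Chinese and falls back to EasyOCR otherwise. The helper name `pick_ocr_engine` and the returned strings are hypothetical; this mirrors the rule stated here, not necessarily Pix2Text's internal implementation.

```python
# Language codes served by CnOCR according to the text above; everything else
# is assumed to go to EasyOCR.
CNOCR_LANGUAGES = frozenset({"en", "ch_sim"})


def pick_ocr_engine(languages):
    """Return the OCR backend name for a collection of language codes.

    Hypothetical helper for illustration only.
    """
    codes = set(languages)
    if not codes:
        raise ValueError("at least one language code is required")
    # Use CnOCR only if every requested language is one it handles.
    return "cnocr" if codes <= CNOCR_LANGUAGES else "easyocr"
```

Under this assumption, a request for `("en", "ch_sim")` stays on CnOCR, while adding `"vi"` to the list switches the whole request to EasyOCR, which is why the `multilingual` extra is needed for other languages.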

Online Service

Everyone can use the P2T Online Service for free, with a daily limit of 10,000 characters per account, which should be sufficient for normal use. Please refrain from bulk API calls, as machine resources are limited, and this could prevent others from accessing the service.

Due to hardware constraints, the Online Service currently supports only Simplified Chinese and English. To try the models in other languages, please use the Online Demo below.

Online Demo 🤗

You can also try the Online Demo to see how P2T performs in various languages. However, the online demo runs on lower-spec hardware and may be slower. For Simplified Chinese or English images, the P2T Online Service is recommended.

Examples

See: Pix2Text Online Documentation/Examples.

Usage

See: Pix2Text Online Documentation/Usage.

Models

See: Pix2Text Online Documentation/Models.

Install

If all goes well, a single command is enough:

pip install pix2text

If you need to recognize languages other than English and Simplified Chinese, please use the following command to install additional packages:

pip install pix2text[multilingual]

If the installation is slow, you can specify an installation source, such as using the Aliyun source:

pip install pix2text -i https://mirrors.aliyun.com/pypi/simple

For more information, please refer to: Pix2Text Online Documentation/Install.
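After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version through the standard library. The snippet below is a minimal sketch; the helper name `installed_version` is hypothetical, and it reads only package metadata, so it does not load any models.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="pix2text"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"pix2text {found}" if found else "pix2text is not installed")
```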

Command Line Tool

See: Pix2Text Online Documentation/Command Tool.

HTTP Service

See: Pix2Text Online Documentation/Command Tool/Start Service.

macOS Desktop Application

Please refer to Pix2Text-Mac for instructions on installing the Pix2Text desktop app for macOS.

Pix2Text Mac App

A cup of coffee for the author

It is not easy to maintain and evolve the project, so if it is helpful to you, please consider offering the author a cup of coffee 🥤.


Official code base: https://github.com/breezedeus/pix2text. Please cite it properly.

For more information on Pix2Text (P2T), visit: https://www.breezedeus.com/article/pix2text.


Download files


Source Distribution

pix2text-1.1.2.tar.gz (189.8 kB)

Uploaded Source

Built Distribution

pix2text-1.1.2-py3-none-any.whl (215.7 kB)

Uploaded Python 3

File details

Details for the file pix2text-1.1.2.tar.gz.

File metadata

  • Download URL: pix2text-1.1.2.tar.gz
  • Upload date:
  • Size: 189.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.10

File hashes

Hashes for pix2text-1.1.2.tar.gz
Algorithm Hash digest
SHA256 09709d1ce2ebc313e3131c26921370290d687d960d424f9c3118c11546694e43
MD5 5d12e8c4a16c3abd76e938ea12b57258
BLAKE2b-256 c8a927571cdf2c8f67962b835de358371d6378dee8e9ed77fe378b11a889d1c1


File details

Details for the file pix2text-1.1.2-py3-none-any.whl.

File metadata

  • Download URL: pix2text-1.1.2-py3-none-any.whl
  • Upload date:
  • Size: 215.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.10

File hashes

Hashes for pix2text-1.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 c3bbf4c0ed9148bbb7f713ef093119f7f6d02c1ddf4f895c79b3be8a44d7a3b1
MD5 f2cebf38cfa844b7b771e9f2407996bd
BLAKE2b-256 663a67326f513fc292d17a6ae32210bd6a96ae98ab4c2aa6f12eef23d74622d3
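A downloaded file can be checked against the SHA256 digests listed above with the standard library. The sketch below copies the two digests from those listings; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the file is assumed to have been downloaded to a local path beforehand.

```python
import hashlib

# SHA256 digests copied from the file listings above.
EXPECTED_SHA256 = {
    "pix2text-1.1.2.tar.gz":
        "09709d1ce2ebc313e3131c26921370290d687d960d424f9c3118c11546694e43",
    "pix2text-1.1.2-py3-none-any.whl":
        "c3bbf4c0ed9148bbb7f713ef093119f7f6d02c1ddf4f895c79b3be8a44d7a3b1",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, filename):
    """Return True if the file at path has the digest listed for filename."""
    return sha256_of(path) == EXPECTED_SHA256[filename]
```

For example, `matches_published_hash("downloads/pix2text-1.1.2.tar.gz", "pix2text-1.1.2.tar.gz")` returns `True` only when the local file is byte-for-byte identical to the published source distribution.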

