An OCR library implemented in MLX

MLX-OCR

⚡️ Fast and Efficient OCR Library using Apple MLX Framework 🍎

🚀 Quick Start

Install the library using pip:

pip install mlx-ocr

Then, badabim badabum, you can use the library like this:

from mlx_ocr import MLXOCR

ocr = MLXOCR(det_lang="eng", rec_lang="eng")
img = "path/to/image.jpg"
result = ocr(img)
print(result)

Check out the examples directory for more usage examples!
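
For batch use, a minimal sketch is given below. It relies only on the call shape shown in the Quick Start (`ocr(img)` with a path string). The helper name `ocr_folder`, the extension list, and the returned dictionary layout are assumptions for illustration and are not part of mlx-ocr; each result is passed through unchanged, whatever structure the library returns.

```python
from pathlib import Path
from typing import Any, Callable, Dict

# Hypothetical extension list; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def ocr_folder(ocr: Callable[[str], Any], folder: str) -> Dict[str, Any]:
    """Call `ocr` on each image file in `folder`, keyed by file name."""
    results: Dict[str, Any] = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            # Pass a string path, matching the Quick Start call `ocr(img)`.
            results[path.name] = ocr(str(path))
    return results
```

With the library installed, this could be called as `ocr_folder(MLXOCR(det_lang="eng", rec_lang="eng"), "scans/")`, where `scans/` is a placeholder directory.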

💡 Current Models & Future Models

Current Models (PP-OCRv3):

Currently, mlx-ocr implements the detection and recognition models from the student version of PP-OCRv3. These models are known for their efficiency and good performance.

Text Detection: Implemented and ready to use.
Text Recognition: Implemented and ready to use.

As the PP-OCRv3 language models share the same architecture and training configuration, weights should be broadly compatible across different languages. Refer to the PaddleOCR model list for detailed model information.

Future Model Implementations:

The focus will be on expanding model coverage and algorithmic improvements. Planned model additions include:

Angle Classification Model (PP-OCRv3): To enhance accuracy by correcting text orientation.
PP-OCRv4 Models: Exploring implementation of the next generation PP-OCRv4 models for potential performance gains.
Experimentation with other architectures: Investigating and potentially implementing other state-of-the-art OCR models within the MLX framework.

✨ Upcoming Features

[Planned] Beginner-Friendly Fine-tuning Tools: Creating user-friendly tools and guides for fine-tuning models and training new models, making the library more accessible to users with varying levels of expertise.
[Planned] Enhanced Documentation and Examples: Expanding documentation with more detailed explanations, tutorials, and diverse usage examples to improve user onboarding and understanding.

🥲 Disclaimer

Many features are missing, and some parts of the code are not optimized. The models on their own perform very well, but preprocessing and postprocessing need more work.

People who are used to training models and preparing datasets can easily use the models in this repository. I plan to make the library more beginner-friendly in the future, particularly for fine-tuning and training new models.

🙌 Contributing

Contributions are welcome and encouraged! We strive to keep dependencies minimal and appreciate contributions in various forms:

Model Implementations: Help implement new OCR models within the MLX framework.
Algorithmic Improvements: Contribute to pre-processing, post-processing, or other algorithmic enhancements.
Bug Reports: Report any issues or unexpected behavior you encounter.
Feature Requests: Suggest new features or improvements you'd like to see.
Documentation: Help improve documentation, tutorials, and examples.
Code Review: Review and provide feedback on code changes.

The mlx_ocr/models directory is intentionally kept lightweight, depending only on mlx. We use pypdfium2 for PDF operations due to its permissive licensing (Apache 2.0 and BSD 3-Clause).

🙏 Acknowledgements

This project is inspired by and builds upon the excellent work of the PaddleOCR library. Code is adapted and translated (sometimes directly) from PaddleOCR. We are committed to proper citation and will continue to refine attributions. If you notice any missing citations, please let us know.

We also acknowledge inspiration from the mlx-vlm repository for code style and project structure.

Release history

0.2.1: Feb 17, 2025
0.2.0 (this version): Feb 17, 2025

Download files

Source Distribution

mlx_ocr-0.2.0.tar.gz (312.6 kB)

Uploaded Feb 17, 2025 Source

Built Distribution

mlx_ocr-0.2.0-py3-none-any.whl (319.5 kB)

Uploaded Feb 17, 2025 Python 3

File details

Details for the file mlx_ocr-0.2.0.tar.gz.

File metadata

Download URL: mlx_ocr-0.2.0.tar.gz
Upload date: Feb 17, 2025
Size: 312.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.7

File hashes

Hashes for mlx_ocr-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`96014453c403cd5441a9998ebae748f5cd115f0e589ad76335fbf53dd7ef06d1`
MD5	`18cf434f7794731b3dbff89b9dc1db57`
BLAKE2b-256	`713d8eb157d2ccc18def4871c7b19a423bbb97cb54b6022fdcae5a3a744d5a8e`
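
To check a downloaded archive against the digests listed above, here is a minimal sketch that uses only Python's standard `hashlib` module. The function names `sha256_of` and `matches_expected` are assumptions for illustration; the expected value is the SHA256 digest from the table for mlx_ocr-0.2.0.tar.gz.

```python
import hashlib

# SHA256 digest published above for mlx_ocr-0.2.0.tar.gz.
EXPECTED_SHA256 = "96014453c403cd5441a9998ebae748f5cd115f0e589ad76335fbf53dd7ef06d1"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected value, ignoring case."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_expected("mlx_ocr-0.2.0.tar.gz")` on a local copy of the source distribution should return `True` only if the file is byte-for-byte identical to the published one.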

File details

Details for the file mlx_ocr-0.2.0-py3-none-any.whl.

File metadata

Download URL: mlx_ocr-0.2.0-py3-none-any.whl
Upload date: Feb 17, 2025
Size: 319.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.7

File hashes

Hashes for mlx_ocr-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1832728e729c833781f6ecf3116d7943d13107e5d982e8cd670594e8f6ae528f`
MD5	`15921401a9b7317f9c8b853d72523cde`
BLAKE2b-256	`ebefe16fc42bcac7701a62b9f5052d41f2e1fefdb99437af5afa308b2e38cdc6`
