UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
This is the official repository for UniMERNet, a mathematical expression recognition (MER) model that converts images of formulas to LaTeX across a wide range of scenarios.
Project page: https://gitlab.pjlab.org.cn/fdc/mllm/unimernet
Installation
For Mac
brew install freetype imagemagick
export MAGICK_HOME=/opt/homebrew/opt/imagemagick
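Before moving on, it can help to confirm that `MAGICK_HOME` is actually visible to Python. The sketch below is not part of UniMERNet; `check_magick_home` is a hypothetical helper that only checks that the variable is set and points to an existing directory. Note that `/opt/homebrew` is the Homebrew prefix on Apple Silicon; on Intel Macs Homebrew usually installs under `/usr/local`, so the path may need adjusting.

```python
import os
from pathlib import Path


def check_magick_home(environ=os.environ):
    """Return (ok, message) describing whether MAGICK_HOME looks usable.

    Hypothetical helper for illustration; it does not load ImageMagick itself.
    """
    home = environ.get("MAGICK_HOME")
    if not home:
        return False, "MAGICK_HOME is not set"
    if not Path(home).is_dir():
        return False, f"MAGICK_HOME points to a missing directory: {home}"
    return True, f"MAGICK_HOME = {home}"


if __name__ == "__main__":
    ok, message = check_magick_home()
    print(message)
```

Run it in the same shell session where the `export` was issued, since the variable does not carry over to new terminals unless it is added to the shell profile.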
Quickstart
Try the Streamlit Demo
Write MER code in fewer than 10 lines
Training
To train or fine-tune the UniMERNet model, run
torchrun --nproc-per-node 4 --master_port 29500 train.py --cfg-path configs/unimernet_train.yaml
or
bash scripts/train.sh
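To launch the same training command with a different GPU count or port, the arguments can be assembled programmatically. This is a minimal sketch that reuses only the flags shown above; `build_train_command` is a hypothetical helper and not part of the repository, and it assumes the command is run from the repository root.

```python
import shlex


def build_train_command(num_gpus=4, master_port=29500,
                        cfg_path="configs/unimernet_train.yaml"):
    """Assemble the torchrun invocation from the README as an argument list."""
    return [
        "torchrun",
        "--nproc-per-node", str(num_gpus),    # one process per GPU
        "--master_port", str(master_port),    # rendezvous port for the workers
        "train.py",
        "--cfg-path", cfg_path,
    ]


if __name__ == "__main__":
    # Print the command for a 2-GPU machine; pass the list to
    # subprocess.run(...) instead to actually start training.
    print(shlex.join(build_train_command(num_gpus=2)))
```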
Evaluation
To evaluate the model, run
python test.py --cfg configs/unimernet_eval.yaml
Demo
Image | Recognition Result |
---|---|
$A _ { 4 } = \frac { \mathrm { i } } { 2 } \frac { \nabla \rho } { \rho } \cdot \vec { \tau }$ | |
$\begin{array} { r l } { \left[ \begin{array} { l } { \dot { \theta } _ { i } } \\ { \dot { \omega } _ { i } } \end{array} \right] = } & { \left[ \begin{array} { l l } { 0 } & { 1 } \\ { 0 } & { 0 } \end{array} \right] \left[ \begin{array} { l } { \theta _ { i } } \\ { \omega _ { i } } \end{array} \right] , } \\ { \left[ \begin{array} { l } { \dot { p } _ { i } } \\ { \dot { v } _ { i } } \end{array} \right] = } & { \left[ \begin{array} { l l } { 0 _ { 2 \times 2 } } & { I _ { 2 } } \\ { 0 _ { 2 \times 2 } } & { \omega _ { i } a } \end{array} \right] \left[ \begin{array} { l } { p _ { i } } \\ { v _ { i } } \end{array} \right] , } \end{array}$ | |
$\begin{array} { r l } { \mathrm { M i n i m i s e ~ } } & { J ( u . ; s , y ) = \mathbb { E } \left[ \int _ { s } ^ { T } \left( u _ { t } ^ { 2 } + 1 \right) d t - \ln \left( \cosh { ( X _ { T } ) } \right) \right] } \\ { \mathrm { s u b j e c t ~ t o ~ } } & { \left\{ \begin{array} { l l } { d X _ { t } = 2 u _ { t } d t + \sqrt { 2 } d W _ { t } , t \in [ s , T ] } \\ { X _ { s } = y } \\ { u _ { t } \in [ - 1 , 1 ] , \quad t \in [ s , T ] } \end{array} \right. } \end{array}$ | |
$\begin{array} { r l } { \left( \widetilde { C } _ { f a c e } \right) _ { C D S } } & { = \left\{ \begin{array} { l l } { m i n \left( \frac { \widetilde { C } _ { D } } { C o u } , 1 \right) , } & { \mathrm { ~ 0 ~ \leq ~ \widetilde { C } _ D ~ \leq ~ 1 ~ ; ~ 0 ~ < ~ C o u ~ \leq ~ 1 / 3 ~ } } \\ { m i n \left( 3 \widetilde { C } _ { D } , 1 \right) , } & { \mathrm { ~ 0 ~ \leq ~ \widetilde { C } _ D ~ \leq ~ 1 ~ ; ~ 1 / 3 ~ < ~ C o u ~ \leq ~ 1 ~ } } \\ { \widetilde { C } _ { D } , } & { \mathrm { ~ \widetilde { C } _ D < 0 ~ ; ~ \widetilde { C } _ D > 1 ~ } } \end{array} \right. } \\ { \left( \widetilde { C } _ { f a c e } \right) _ { H R } } & { = \left\{ \begin{array} { l l } { 3 \widetilde { C } _ { D } , } & { \mathrm { ~ 0 ~ \leq ~ \widetilde { C } _ D ~ < ~ 1 / 5 ~ } } \\ { 0 . 5 + 0 . 5 \widetilde { C } _ { D } , } & { \mathrm { ~ 1 / 5 ~ \leq ~ C _ D ~ < ~ 1 / 2 ~ } } \\ { 3 / 8 + 3 / 4 \widetilde { C } _ { D } , } & { \mathrm { ~ 1 / 2 ~ \leq ~ \widetilde { C } _ D ~ < ~ 5 / 6 ~ } } \\ { 1 , } & { \mathrm { ~ 5 / 6 ~ \leq ~ \widetilde { C } _ D ~ \leq ~ 1 ~ } } \\ { \widetilde { C } _ { D } , } & { \mathrm { ~ \widetilde { C } _ D < 0 ~ ; ~ \widetilde { C } _ D > 1 ~ } } \end{array} \right. } \\ { \gamma _ { f a c e } } & { = m i n \left[ \left( c o s \theta \right) ^ { 4 } , 1 \right] } \end{array}$ | |
$\begin{array} { r } { | f ( x , y , z ) - f ( x , y , \bar { z } ) | \leq \sigma ( x , V ( z - \bar { z } ) ) } \end{array}$ | |
$\widehat { \mathbf { K } } ^ { m a t } \! = \! \frac { A E ^ { \sigma T } } { \ell } \! \left[ \! \begin{array} { c c c c } { + 1 } & { 0 } & { - 1 } & { 0 } \\ { 0 } & { 0 } & { 0 } & { 0 } \\ { - 1 } & { 0 } & { + 1 } & { 0 } \\ { 0 } & { 0 } & { 0 } & { 0 } \end{array} \! \right] $ | |
$( x + 2 ) ( x - 2 ) + 2 ( x + 1 ) ( x + 2 ) = - 8 ( x + 1 )$ | |
$0 + ( - 2 + 2 ) ^ { 2 } + ( - 7 - 3 ) ^ { 2 } = 1 6$ | |
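The recognition results in the table are space-separated LaTeX tokens. If a more compact string is preferred, for example before pasting into a document, the spaces can be collapsed. The sketch below is an illustrative post-processing step, not part of UniMERNet; `compact_latex` is a hypothetical helper. It keeps a space only where a control word such as `\quad` is directly followed by a letter, because removing that space would change the command name.

```python
import re

# Matches a token that ends in a control word such as \alpha or \quad.
_ENDS_WITH_CONTROL_WORD = re.compile(r"\\[A-Za-z]+$")


def compact_latex(tokenized: str) -> str:
    """Join space-separated LaTeX tokens, keeping only the required spaces."""
    pieces = []
    previous = None
    for token in tokenized.split():
        # "\quad t" must not become "\quadt"; every other space is optional
        # in math mode and can be dropped.
        if (previous is not None
                and _ENDS_WITH_CONTROL_WORD.search(previous)
                and token[0].isalpha()):
            pieces.append(" ")
        pieces.append(token)
        previous = token
    return "".join(pieces)


if __name__ == "__main__":
    print(compact_latex(r"\quad t \in [ s , T ]"))  # prints "\quad t\in[s,T]"
```

This assumes the string is pure math-mode content; spaces inside text-mode commands such as `\text{...}` would also be removed, so those need separate handling.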
Citation
If you find our models / code / papers useful in your research, please consider giving ⭐ and citations 📝, thx :)
@article{wang2023vigc,
title={VIGC: Visual Instruction Generation and Correction},
author={Wang, Bin and Wu, Fan and Han, Xiao and Peng, Jiahui and Zhong, Huaping and Zhang, Pan and Dong, Xiaoyi and Li, Weijia and Li, Wei and Wang, Jiaqi and He, Conghui},
journal={arXiv preprint arXiv:2308.12714},
year={2023}
}
Acknowledgement
- VIGC. This repository is built upon VIGC!
- Texify.
- Latex-OCR. The original open source Latex OCR project.
- Donut.
- Nougat.
Contact us
If you have any questions, comments or suggestions, please do not hesitate to contact us at wangbin@pjlab.org.cn.
License