StructEqTable-Deploy: A High-efficiency Open-source Toolkit for Table-to-Latex Transformation

[ Paper ] [ Website ] [ Dataset🤗 ] [ Models🤗 ] [ Demo💬 ]

Welcome to the official repository of StructEqTable-Deploy, a solution that converts table images into LaTeX, HTML, or Markdown, powered by scalable data from the DocGenome benchmark.

Overview

Tables are an effective way to represent structured data in scientific publications, financial statements, invoices, web pages, and many other scenarios. Extracting tabular data from a table image and performing downstream reasoning tasks on the extracted data is challenging, mainly because tables often have complicated column and row headers with spanning cells. To address these challenges, we present TableX, a large-scale multi-modal table benchmark extracted from the DocGenome benchmark for table pre-training, comprising more than 2 million high-quality Image-LaTeX pairs covering 156 disciplinary classes. Building on this large-scale data, we train an end-to-end model, StructEqTable, which can precisely obtain the corresponding LaTeX description from a table image and perform multiple table-related reasoning tasks, including structural extraction and question answering, broadening its application scope and potential.

Changelog

  • [2024/10/19] 🔥 We have released our latest model StructTable-InternVL2-1B!

    Thanks to InternVL2's powerful foundational capabilities and fine-tuning on synthetic tabular data and the DocGenome dataset, StructTable can convert table images into various common table formats, including LaTeX, HTML, and Markdown. Moreover, inference speed has been significantly improved compared to the v0.2 version.

  • [2024/8/22] We have released StructTable-base-v0.2, fine-tuned on the DocGenome dataset. This version features improved inference speed and robustness, achieved through data augmentation and a reduced number of image tokens.

  • [2024/8/08] We have released the TensorRT accelerated version, which takes only about 1 second for most images on an A100 GPU. Please follow the tutorial to install the environment and compile the model weights.

  • [2024/7/30] We have released the first version of StructEqTable.

TODO

  • Release inference code and checkpoints of StructEqTable.
  • Support Chinese version of StructEqTable.
  • Accelerated version of StructEqTable using TensorRT-LLM.
  • Expand to more table image domains to improve the model's general capabilities.
  • Efficient inference of StructTable-InternVL2-1B with the LMDeploy Toolkit.
  • Release our table pre-training and fine-tuning code.

Installation

# Quote the version spec so the shell does not treat ">" as a redirect
conda create -n structeqtable "python>=3.10"
conda activate structeqtable

# Install from Source code  (Suggested)
git clone https://github.com/UniModal4Reasoning/StructEqTable-Deploy.git
cd StructEqTable-Deploy
pip install -r requirements.txt
python setup.py develop

# or Install from GitHub repo
pip install "git+https://github.com/UniModal4Reasoning/StructEqTable-Deploy.git"

# or Install from PyPI
pip install struct-eqtable --upgrade
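
After installing, it can help to confirm that the environment matches the requirements above. The following is a minimal sketch using only the Python standard library; it assumes the distribution is registered under the PyPI name `struct-eqtable` (as in the `pip install` line) and that Python 3.10 is the minimum version (as in the `conda create` line). The helper names `python_ok` and `installed_version` are illustrative and not part of the toolkit.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)          # taken from the conda create line above
DIST_NAME = "struct-eqtable"  # PyPI distribution name from the pip install line


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def installed_version(dist_name=DIST_NAME):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("struct-eqtable version:", installed_version() or "not installed")
```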

Model Zoo

Base Model | Model Size | Training Data | Data Augmentation | LMDeploy | TensorRT | HuggingFace
InternVL2-1B | ~1B | DocGenome and Synthetic Data | | | | StructTable v0.3
Pix2Struct-base | ~300M | DocGenome | | | | StructTable v0.2
Pix2Struct-base | ~300M | DocGenome | | | | StructTable v0.1

Quick Demo

  • Run tools/demo/demo.py
cd tools/demo

python demo.py \
  --image_path ./demo.png \
  --ckpt_path U4R/StructTable-InternVL2-1B \
  --output_format latex
  • HTML or Markdown format output (Only Supported by StructTable-InternVL2-1B)
python demo.py \
  --image_path ./demo.png \
  --ckpt_path U4R/StructTable-InternVL2-1B \
  --output_format html markdown
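
When the demo needs to be run over many images, the commands above can be driven from Python. The sketch below only assembles and launches the same command line shown in this README (`--image_path`, `--ckpt_path`, `--output_format`, and the optional `--lmdeploy` flag from the Efficient Inference section); it does not call any StructEqTable Python API. The helper names `build_demo_command` and `run_demo`, and the format whitelist, are assumptions made for illustration.

```python
import subprocess
import sys

# Output formats mentioned in this README; treated here as the allowed set (assumption).
SUPPORTED_FORMATS = ("latex", "html", "markdown")


def build_demo_command(image_path,
                       ckpt_path="U4R/StructTable-InternVL2-1B",
                       output_formats=("latex",),
                       use_lmdeploy=False):
    """Build the argument list for demo.py without running it."""
    if not output_formats:
        raise ValueError("at least one output format is required")
    unknown = [f for f in output_formats if f not in SUPPORTED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported output format(s): {unknown}")
    cmd = [
        sys.executable, "demo.py",
        "--image_path", str(image_path),
        "--ckpt_path", ckpt_path,
        "--output_format", *output_formats,
    ]
    if use_lmdeploy:
        cmd.append("--lmdeploy")
    return cmd


def run_demo(demo_dir="tools/demo", **kwargs):
    """Run demo.py from the demo directory; raises if the process fails."""
    return subprocess.run(build_demo_command(**kwargs), cwd=demo_dir, check=True)
```

For example, `run_demo(image_path="./demo.png", output_formats=("html", "markdown"))` would launch the second command shown above from the `tools/demo` directory.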

Efficient Inference

  • Install LMDeploy Toolkit
pip install lmdeploy
  • Run tools/demo/demo.py
cd tools/demo

python demo.py \
  --image_path ./demo.png \
  --ckpt_path U4R/StructTable-InternVL2-1B \
  --output_format latex \
  --lmdeploy
  • Visualization Result

    You can copy the output LaTeX code into demo.tex, then use Overleaf for table visualization.
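
The copy step can also be scripted. The sketch below wraps a LaTeX table string in a minimal standalone document and writes it to demo.tex, ready to upload to Overleaf or compile locally. The package list (`booktabs`, `multirow`, `amsmath`) is an assumption about what generated tables commonly need, not something the toolkit specifies; adjust it if compilation reports missing commands. The function names are illustrative.

```python
from pathlib import Path

# Packages often needed by table markup; this list is an assumption, not a toolkit requirement.
PREAMBLE_PACKAGES = ("booktabs", "multirow", "amsmath")


def wrap_table(table_latex, packages=PREAMBLE_PACKAGES):
    """Wrap a LaTeX table snippet in a minimal compilable document."""
    lines = [r"\documentclass{article}"]
    lines += [rf"\usepackage{{{name}}}" for name in packages]
    lines += [r"\begin{document}", table_latex.strip(), r"\end{document}", ""]
    return "\n".join(lines)


def write_demo_tex(table_latex, path="demo.tex"):
    """Write the wrapped document to disk and return its path."""
    out = Path(path)
    out.write_text(wrap_table(table_latex), encoding="utf-8")
    return out
```

Calling `write_demo_tex(model_output)` with the LaTeX string printed by the demo produces a demo.tex file in the current directory.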

Acknowledgements

  • DocGenome. An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models.
  • ChartVLM. A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning.
  • Pix2Struct. Screenshot Parsing as Pretraining for Visual Language Understanding.
  • InternVL Family. A Series of Powerful Foundational Vision-Language Models.
  • LMDeploy. A toolkit for compressing, deploying, and serving LLM and MLLM.
  • UniMERNet. A Universal Network for Real-World Mathematical Expression Recognition.
  • Donut. UniMERNet's Transformer encoder-decoder is referenced from Donut.
  • Nougat. Data Augmentation follows Nougat.
  • TensorRT-LLM. Model inference acceleration uses TensorRT-LLM.

License

StructEqTable is released under the Apache License 2.0.

Citation

If you find our models / code / papers useful in your research, please consider giving ⭐ and citations 📝, thx :)

@article{xia2024docgenome,
  title={DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models},
  author={Xia, Renqiu and Mao, Song and Yan, Xiangchao and Zhou, Hongbin and Zhang, Bo and Peng, Haoyang and Pi, Jiahao and Fu, Daocheng and Wu, Wenjie and Ye, Hancheng and others},
  journal={arXiv preprint arXiv:2406.11633},
  year={2024}
}

Contact Us

If you encounter any issues or have questions, please feel free to contact us via zhouhongbin@pjlab.org.cn.
