Layout Parser is a deep learning assisted tool for Document Image Layout Analysis.

A unified toolkit for Deep Learning Based Document Image Analysis

Supported Python versions: 3.6, 3.7, 3.8

Installation

You can find detailed installation instructions in installation.md. In most cases, installation only requires a few pip commands:

pip install -U layoutparser

# Install Detectron2 to use the deep learning layout detection models.
# Make sure the installed PyTorch version is compatible with
# this Detectron2 version.
pip install 'git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2'

# Install the OCR components when necessary.
# The quotes keep shells such as zsh from expanding the square brackets.
pip install "layoutparser[ocr]"

For Windows Users: Please read installation.md for details about installing Detectron2.
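
After running the commands above, it can be useful to confirm which optional components are actually importable. The following is a minimal sketch using only the Python standard library. The helper name check_optional_modules is hypothetical, and treating pytesseract as the module provided by the OCR extra is an assumption, so adjust the names to match your environment.

```python
import importlib.util

# Import names of the components mentioned in the installation steps.
# "pytesseract" is an ASSUMPTION about what the [ocr] extra provides.
OPTIONAL_MODULES = {
    "layoutparser": "core library",
    "detectron2": "deep learning layout detection models",
    "pytesseract": "OCR components (assumed)",
}


def check_optional_modules(modules=OPTIONAL_MODULES):
    """Return a dict mapping each top-level module name to True if importable."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in modules
    }


if __name__ == "__main__":
    for name, available in check_optional_modules().items():
        status = "installed" if available else "missing"
        print(f"{name:<14} {status:<10} ({OPTIONAL_MODULES[name]})")
```

The check only reports whether a module can be found; it does not verify that the PyTorch and Detectron2 versions are compatible.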

Quick Start

We provide a series of examples to help you start using the layoutparser library:

  1. Table OCR and Results Parsing: layoutparser can be used to conveniently OCR documents and convert the output into structured data.

  2. Deep Layout Parsing Example: With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.
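
To give a rough idea of what "converting OCR output into structured data" can mean, here is a small self-contained sketch that groups word boxes into table rows by their vertical position. It does not use layoutparser and is not how the library implements this; the Word type, the words_to_rows helper, and the row_tolerance parameter are all hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """A single OCR result: the recognized text and its position in pixels."""
    text: str
    x: float  # left edge of the word box
    y: float  # vertical center of the word box


def words_to_rows(words: List[Word], row_tolerance: float = 10.0) -> List[List[str]]:
    """Group words into rows (top to bottom), each sorted left to right.

    A word joins the current row when its vertical center is within
    row_tolerance pixels of the previous word in that row.
    """
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w.y):
        if rows and abs(word.y - rows[-1][-1].y) <= row_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [[w.text for w in sorted(row, key=lambda w: w.x)] for row in rows]


if __name__ == "__main__":
    words = [
        Word("Name", 10, 12), Word("Age", 120, 11),
        Word("Alice", 10, 40), Word("30", 120, 42),
    ]
    print(words_to_rows(words))  # [['Name', 'Age'], ['Alice', '30']]
```

Real tables usually need more care (merged cells, skewed scans, multi-line cells), which is where the linked examples come in.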

DL Assisted Layout Prediction Example

[Figure: Example Usage]

The images shown in the figure above are: a screenshot of this paper, an image from the PRIMA Layout Analysis Dataset, a screenshot of the WSJ website, and an image from the HJDataset.

With only four lines of code in layoutparser, you can extract information from complex documents that existing tools could not provide. You can either choose a deep learning model from the ModelZoo or load a model that you trained yourself. Then use the following code to predict the layout and visualize it:

>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image)  # Load the image beforehand, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout)  # Accepts extra configuration arguments
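
The snippet above leaves out image loading. A slightly fuller sketch might look like the following, assuming OpenCV (cv2) is installed for reading the image and that detected blocks expose a score attribute. The helper names keep_blocks and detect_and_draw and the 0.8 threshold are hypothetical choices for illustration, not part of the library.

```python
def keep_blocks(layout, min_score=0.0, wanted_types=None):
    """Keep blocks with .score >= min_score and, optionally, .type in wanted_types.

    Works on any iterable of objects with `type` and `score` attributes;
    a missing or None score is treated as 0.0.
    """
    kept = []
    for block in layout:
        score = getattr(block, "score", None) or 0.0
        if score < min_score:
            continue
        if wanted_types is not None and block.type not in wanted_types:
            continue
        kept.append(block)
    return kept


def detect_and_draw(image_path, min_score=0.8):
    """Detect the layout of one image and draw the confident blocks on it."""
    # Imported lazily so keep_blocks can be used without these packages.
    import cv2
    import layoutparser as lp

    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(image_path)
    image = image[..., ::-1]  # OpenCV loads BGR; convert to RGB

    model = lp.Detectron2LayoutModel("lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config")
    layout = model.detect(image)

    confident = lp.Layout(keep_blocks(layout, min_score=min_score))
    return lp.draw_box(image, confident)
```

Filtering by score before drawing keeps low-confidence detections out of the visualization; the right threshold depends on the model and the documents.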

Contributing

We encourage you to contribute to Layout Parser! Please check out the Contributing guidelines for details on how to proceed. Join us!

Citing layoutparser

If you find layoutparser helpful to your work, please consider citing our tool and paper using the following BibTeX entry.

@article{shen2021layoutparser,
  title={LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis},
  author={Shen, Zejiang and Zhang, Ruochen and Dell, Melissa and Lee, Benjamin Charles Germain and Carlson, Jacob and Li, Weining},
  journal={arXiv preprint arXiv:2103.15348},
  year={2021}
}
