
Layout Parser is a deep learning assisted tool for Document Image Layout Analysis.



Layout Parser is a deep-learning-based tool for document image layout analysis tasks.

Installation

Use pip or conda to install the library:

pip install layoutparser

# Install Detectron2 to use the deep learning layout detection models.
# Please make sure the PyTorch version is compatible with
# the installed Detectron2 version.
pip install 'git+https://github.com/facebookresearch/detectron2.git#egg=detectron2'

# Install the OCR components when necessary
# (quoted so that shells such as zsh do not expand the brackets)
pip install "layoutparser[ocr]"

By default, this installs the CPU version of Detectron2, which should run on most computers. If you have a GPU, consider installing the GPU version of Detectron2 by following the official instructions.
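After running the commands above, a quick check like the following can confirm which components are visible to the current Python environment. This is a minimal sketch: the helper name `installed` is hypothetical, and the import names `torch` and `detectron2` are assumed to be the ones the corresponding packages expose. It only looks the packages up and does not import them, so it also runs when some of them are missing.

```python
from importlib.util import find_spec


def installed(packages=("layoutparser", "torch", "detectron2")):
    """Map each top-level package name to whether Python can locate it."""
    # find_spec returns None when a top-level package is not installed.
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, found in installed().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If `detectron2` is reported as missing, the deep learning layout models will not be usable, although the rest of the library may still be.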

Quick Start

We provide a series of examples to help you start using the layoutparser library:

  1. Table OCR and Results Parsing: layoutparser can be used to conveniently OCR documents and convert the output into structured data.

  2. Deep Layout Parsing Example: With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.
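To give a feel for what "converting OCR output into structured data" in the first example involves, here is a minimal sketch in plain Python. It does not use the layoutparser API: the `Word` class, the `words_to_rows` helper, and the `row_tolerance` parameter are all hypothetical stand-ins for OCR results, and the coordinates are toy values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for one OCR result: text plus a position."""
    text: str
    x: float  # left edge of the word
    y: float  # vertical center of the word


def words_to_rows(words, row_tolerance=5.0):
    """Group words into table rows by vertical position, then order each row left to right."""
    rows = []
    for word in sorted(words, key=lambda w: w.y):
        # Join the current row when the word is vertically close to the
        # previous word; otherwise start a new row.
        if rows and abs(word.y - rows[-1][-1].y) <= row_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [[w.text for w in sorted(row, key=lambda w: w.x)] for row in rows]


words = [
    Word("Name", 10, 20), Word("Year", 120, 21),
    Word("Shen", 10, 50), Word("2020", 120, 49),
]
table = words_to_rows(words)
# table == [["Name", "Year"], ["Shen", "2020"]]
```

Real OCR output is noisier than this, so the linked example is the better reference for an actual workflow.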

DL Assisted Layout Prediction Example

[Figure: Example Usage]

The images shown in the figure above are: a screenshot of this paper, an image from the PRIMA Layout Analysis Dataset, a screenshot of the WSJ website, and an image from the HJDataset.

With only four lines of code in layoutparser, you can extract information from complex documents that existing tools cannot provide. You can either choose a deep learning model from the ModelZoo or load a model that you trained yourself, then use the following code to predict the layout and visualize it:

>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image)  # Load the image beforehand, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout)  # Extra configuration arguments can be passed here
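The four lines above assume that `image` already exists. A slightly fuller sketch, including image loading, might look like the following. The function name `detect_and_draw` is hypothetical, the layoutparser calls are only the ones shown above, and OpenCV (`cv2`) is assumed to be installed. The channel reversal is there because OpenCV loads images in BGR order; whether your model expects RGB input is an assumption worth checking.

```python
from pathlib import Path

PRIMA_CONFIG = "lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config"


def detect_and_draw(image_path, config_path=PRIMA_CONFIG):
    """Load an image, predict its layout, and return (layout, visualization)."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No image found at {path}")

    # Imported lazily so the path check above works without the heavy dependencies.
    import cv2
    import layoutparser as lp

    image = cv2.imread(str(path))
    if image is None:  # cv2.imread returns None when it cannot decode the file
        raise ValueError(f"OpenCV could not read {path}")
    image = image[..., ::-1]  # reverse channel order: BGR -> RGB

    model = lp.Detectron2LayoutModel(config_path)
    layout = model.detect(image)
    return layout, lp.draw_box(image, layout)
```

Calling `detect_and_draw("page.png")`, with `page.png` replaced by your own file, downloads the model on first use, so the first call can take a while.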

Citing layoutparser

If you find layoutparser helpful to your work, please consider citing our tool using the following BibTeX entry.

@misc{shen2020layoutparser,
  author = {Zejiang Shen and Ruochen Zhang and Melissa Dell},
  title = {LayoutParser},
  howpublished = {\url{https://github.com/Layout-Parser/layout-parser}},
  year = {2020}
}


