Table image parsing with cell detection models

cells2table

Parsing tables in document images with cell detection models

Implemented pipelines

PaddlePaddle

  • Classification model (wired / wireless)
  • Cell detection model with different weights for each class

Uses ONNX weights downloaded automatically from Hugging Face on first use.
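The two-stage flow above (classify the table, then run the cell detector whose weights match that class) can be sketched as follows. This is a minimal illustration, not the cells2table API: `TwoStagePipeline`, `fake_classifier`, and the lambda detectors are hypothetical stand-ins that replace the real ONNX models so the example runs without any weights.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


@dataclass
class TwoStagePipeline:
    """Hypothetical sketch: classify the table, then run the matching detector."""

    classify: Callable[[object], str]  # returns "wired" or "wireless"
    detectors: Dict[str, Callable[[object], List[Box]]]  # one detector per class

    def __call__(self, table_image: object) -> Tuple[str, List[Box]]:
        table_class = self.classify(table_image)
        if table_class not in self.detectors:
            raise ValueError(f"no detector registered for class {table_class!r}")
        # Dispatch to the detector (i.e. the set of weights) for this class.
        return table_class, self.detectors[table_class](table_image)


# Stand-in classifier: a real one would run a model on the image.
def fake_classifier(image: object) -> str:
    return "wired" if image == "bordered-table" else "wireless"


pipeline = TwoStagePipeline(
    classify=fake_classifier,
    detectors={
        "wired": lambda image: [(0.0, 0.0, 50.0, 20.0)],
        "wireless": lambda image: [(0.0, 0.0, 48.0, 18.0), (48.0, 0.0, 96.0, 18.0)],
    },
)

table_class, cells = pipeline("bordered-table")
```

Keeping one detector per class means each set of weights only has to handle one kind of table: ruled (wired) or borderless (wireless).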

Installation

With uv, add it to your project:

uv add cells2table

ONNX models need an ONNX Runtime installed to run. You can install one yourself or use one of the preconfigured optional dependencies (extras).

Optional        Description
docling         For docling usage
huggingface     For downloading models
onnx_cuda       For NVIDIA GPUs
onnx_openvino   For Intel GPUs and CPUs
onnx_cpu        Default CPU runtime
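Which runtime is installed determines which ONNX Runtime execution providers are available. The sketch below shows one way to pick a provider in a fixed preference order. The provider names are the real ONNX Runtime identifiers, but `pick_providers` is a hypothetical helper, and the assumption that each `onnx_*` optional maps to the corresponding provider is not confirmed by this project. It is not how cells2table selects a runtime internally.

```python
from typing import List, Sequence

# ONNX Runtime execution provider names, most preferred first.
PREFERRED_PROVIDERS = (
    "CUDAExecutionProvider",      # assumed to come with the onnx_cuda optional
    "OpenVINOExecutionProvider",  # assumed to come with the onnx_openvino optional
    "CPUExecutionProvider",       # assumed to come with the onnx_cpu optional
)


def pick_providers(available: Sequence[str]) -> List[str]:
    """Return the available providers in preference order."""
    chosen = [name for name in PREFERRED_PROVIDERS if name in available]
    if not chosen:
        raise RuntimeError("no supported ONNX Runtime execution provider found")
    return chosen


# With a runtime installed, the list of available providers comes from
# onnxruntime itself:
#
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)

providers = pick_providers(["CPUExecutionProvider", "CUDAExecutionProvider"])
```

ONNX Runtime tries the providers in the order given, so listing the CPU provider last keeps it as the fallback.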

Usage

cells2table extracts only structural information from tables. Another library is needed to extract the content of the cells.
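To illustrate how structure and content fit together, here is a minimal sketch that assigns words from an OCR or PDF text layer to detected cells by checking which cell contains each word's center. The data layout (`cells` as bounding boxes, `words` as text/box pairs) and the `fill_cells` helper are assumptions made for the example, not cells2table output types.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return (x0 + x1) / 2, (y0 + y1) / 2


def fill_cells(cells: List[Box], words: List[Tuple[str, Box]]) -> Dict[int, str]:
    """Map each cell index to the text of the words whose center lies inside it."""
    texts: Dict[int, List[str]] = {i: [] for i in range(len(cells))}
    for text, word_box in words:
        cx, cy = center(word_box)
        for i, (x0, y0, x1, y1) in enumerate(cells):
            if x0 <= cx < x1 and y0 <= cy < y1:
                texts[i].append(text)
                break  # a word belongs to at most one cell
    # Words are joined in input order, assumed to be reading order.
    return {i: " ".join(parts) for i, parts in texts.items()}


# Two side-by-side cells and three words from a hypothetical text layer.
cells = [(0, 0, 100, 20), (100, 0, 200, 20)]
words = [
    ("Total", (5, 4, 40, 16)),
    ("price", (45, 4, 80, 16)),
    ("42", (120, 4, 140, 16)),
]
filled = fill_cells(cells, words)
```

Matching on the word center rather than requiring full containment tolerates words that slightly overlap a cell border.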

Docling

A docling plugin is provided to integrate cells2table into a complete pipeline.

Usage example:

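from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

# Let docling load external plugins and use cells2table for table structure.
pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,
    table_structure_options=CustomDoclingTableStructureOptions(),
)

# Apply the same pipeline options to PDF and image inputs.
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: PdfFormatOption(pipeline_options=pipeline_options),
    }
)

# Convert a document and print it as Markdown, tables included.
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())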

Source Distribution

cells2table-0.2.3.tar.gz (10.9 kB)

Built Distribution

cells2table-0.2.3-py3-none-any.whl (19.2 kB)
