Table image parsing with cell detection models

cells2table

Parsing tables in document images with cell detection models

Implemented pipelines

PaddlePaddle models

  • Classification model (wired / wireless)
  • Cell detection model with different weights for each class

The models run from ONNX weights, which are downloaded automatically on first use with huggingface_hub.
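The two-stage flow described above (classify the table, then run the cell detector whose weights match that class) can be sketched as follows. This is only an illustration of the control flow: `detect_cells`, `fake_classifier` and `fake_detectors` are hypothetical names invented for this sketch and are not part of the cells2table API, and the stand-in models return fixed values instead of running ONNX inference.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def detect_cells(
    image: object,
    classify: Callable[[object], str],
    detectors: Dict[str, Callable[[object], List[Box]]],
) -> Tuple[str, List[Box]]:
    """Classify the table, then run the detector registered for that class."""
    table_class = classify(image)
    if table_class not in detectors:
        raise ValueError(f"no detector registered for class {table_class!r}")
    return table_class, detectors[table_class](image)


# Hypothetical stand-ins for the real models, used only to show the dispatch.
def fake_classifier(image: object) -> str:
    return "wired"


fake_detectors: Dict[str, Callable[[object], List[Box]]] = {
    "wired": lambda image: [(0.0, 0.0, 50.0, 20.0), (50.0, 0.0, 100.0, 20.0)],
    "wireless": lambda image: [(0.0, 0.0, 100.0, 20.0)],
}

table_class, cells = detect_cells(None, fake_classifier, fake_detectors)
print(table_class, len(cells))  # wired 2
```

Keeping one detector per class behind a dictionary means the classifier output alone decides which weights are used.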

Installation

With uv, add to your project with:

uv add cells2table

ONNX models need an ONNX Runtime installed to run. You can install one yourself or use one of the optional dependency groups that are already configured.

Optional   Description
cuda       For NVIDIA GPUs
openvino   For Intel GPUs and CPUs
cpu        Default CPU runtime
docling    For docling usage
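To check which runtime ended up installed, ONNX Runtime can report its available execution providers. The sketch below assumes the `cuda`, `openvino` and `cpu` optionals correspond to the standard ONNX Runtime provider names shown. The preference order and the helper names `pick_providers` and `available_providers` are illustrative assumptions, not cells2table behaviour.

```python
from typing import List, Sequence

# Standard ONNX Runtime provider names. The preference order is an
# assumption made for this sketch.
PREFERRED = (
    "CUDAExecutionProvider",
    "OpenVINOExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available: Sequence[str]) -> List[str]:
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in PREFERRED if name in available]
    if not chosen:
        raise RuntimeError("no supported ONNX Runtime execution provider found")
    return chosen


def available_providers() -> List[str]:
    """Ask the installed ONNX Runtime which providers it offers."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []  # no ONNX Runtime installed at all
    return list(ort.get_available_providers())


print(available_providers())
print(pick_providers(["CPUExecutionProvider", "CUDAExecutionProvider"]))
```

An empty list from `available_providers()` means no ONNX Runtime is installed yet, which is the case the optionals are meant to cover.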

Usage

cells2table only extracts structural information from tables. Another library is needed to extract the content of the cells.
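One common way to combine the two is to assign each recognised word to the cell that contains the word's centre point. The sketch below assumes cells and OCR words both arrive as pixel bounding boxes. The `Cell` class, `fill_cells` helper and sample data are hypothetical and do not reflect the data structures cells2table actually returns.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


@dataclass
class Cell:
    """Hypothetical cell record: grid position plus bounding box."""
    row: int
    col: int
    box: Box
    words: List[str] = field(default_factory=list)


def contains(box: Box, x: float, y: float) -> bool:
    """True if the point (x, y) lies inside the box."""
    x0, y0, x1, y1 = box
    return x0 <= x < x1 and y0 <= y < y1


def fill_cells(cells: List[Cell], words: List[Tuple[str, Box]]) -> List[Cell]:
    """Attach each OCR word to the first cell containing the word's centre."""
    for text, (x0, y0, x1, y1) in words:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        for cell in cells:
            if contains(cell.box, cx, cy):
                cell.words.append(text)
                break  # a word belongs to at most one cell
    return cells


# Made-up sample data standing in for detector and OCR output.
grid = [Cell(0, 0, (0, 0, 50, 20)), Cell(0, 1, (50, 0, 100, 20))]
ocr_words = [("Name", (5, 4, 35, 16)), ("Age", (60, 4, 80, 16))]

filled = fill_cells(grid, ocr_words)
print([" ".join(cell.words) for cell in filled])  # ['Name', 'Age']
```

Words whose centre falls outside every cell are silently dropped in this sketch. A real pipeline might fall back to the nearest cell instead.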

Docling

A docling plugin is provided to allow integrating cells2table in a complete pipeline.

Usage example:

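# docling imports needed by the example
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

# External plugins must be allowed so docling can load the cells2table model
pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,
    table_structure_options=CustomDoclingTableStructureOptions(),
)

# Use the same pipeline options for PDF and image inputs
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: PdfFormatOption(pipeline_options=pipeline_options),
    }
)

# Convert a document and print it as Markdown, tables included
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())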

Download files

Source Distribution

cells2table-0.2.1.tar.gz (9.4 kB)

Uploaded Source

Built Distribution

cells2table-0.2.1-py3-none-any.whl (16.4 kB)

Uploaded Python 3

File details

Details for the file cells2table-0.2.1.tar.gz.

File metadata

  • Download URL: cells2table-0.2.1.tar.gz
  • Upload date:
  • Size: 9.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.21

File hashes

Hashes for cells2table-0.2.1.tar.gz
Algorithm Hash digest
SHA256 af8eb4c0058c9b75c7cf13256ef7349343cc85331e13fe2b59d6845a11bac3ce
MD5 f68ccf03623f70f6362e988b081863bd
BLAKE2b-256 53323538a72e0806e4a30b3d208c140f3b27d711baf7c58aca9165b94528f160

File details

Details for the file cells2table-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: cells2table-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 16.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.21

File hashes

Hashes for cells2table-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5ba398c73356d6293a460a2d352fd138c7a373fccbd2204d67cfcfc28883105f
MD5 beae0056b075ab39d21d9a41dc220430
BLAKE2b-256 67f1d04938555fab1ba1514e74bc22fc6d8b22993f0e0b288cf482568d4d061f
