Table image parsing with cell detection models

cells2table

Parsing tables in document images with cell detection models

Implemented pipelines

PaddlePaddle

  • Classification model (wired / wireless)
  • Cell detection model with different weights for each class

Uses ONNX weights downloaded automatically from Hugging Face on first use.
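
The two stages listed above can be pictured as a classifier that routes each table image to a detector chosen by class. The following is only a sketch of that idea, not the cells2table implementation: `TwoStagePipeline`, `fake_classifier`, and the lambda detectors are hypothetical stand-ins for the real ONNX models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


@dataclass
class TwoStagePipeline:
    """Hypothetical sketch: classify the table, then pick a detector by class."""

    classify: Callable[[object], str]  # returns "wired" or "wireless"
    detectors: Dict[str, Callable[[object], List[Box]]]  # one detector per class

    def __call__(self, image: object) -> Tuple[str, List[Box]]:
        table_class = self.classify(image)
        if table_class not in self.detectors:
            raise ValueError(f"no detector registered for class {table_class!r}")
        # The same detection step runs, but with class-specific weights.
        return table_class, self.detectors[table_class](image)


# Stand-in for the classification model, only here to make the sketch runnable.
def fake_classifier(image: object) -> str:
    return "wired" if image == "table_with_borders" else "wireless"


pipeline = TwoStagePipeline(
    classify=fake_classifier,
    detectors={
        # Stand-ins for the cell detection model loaded with each set of weights.
        "wired": lambda image: [(0.0, 0.0, 50.0, 20.0)],
        "wireless": lambda image: [(0.0, 0.0, 40.0, 20.0), (40.0, 0.0, 80.0, 20.0)],
    },
)

print(pipeline("table_with_borders"))
```

Keeping the detectors in a dictionary keyed by class means a new table class only needs a new entry, without touching the routing logic.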

Installation

With uv, add cells2table to your project:

uv add cells2table

ONNX models need an ONNX Runtime installed to run. You can install one yourself or use one of the preconfigured optionals.

Optional       Description
docling        For docling usage
huggingface    For downloading models
onnx_cuda      For NVIDIA GPUs
onnx_openvino  For Intel GPUs and CPUs
onnx_cpu       Default CPU runtime
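
After installing a runtime, it can help to confirm which execution providers ONNX Runtime actually exposes. The sketch below assumes the runtime is importable as `onnxruntime`, which is the case for the standard CPU, CUDA, and OpenVINO builds; the helper name `available_onnx_providers` is made up for this example and is not part of cells2table.

```python
import importlib.util


def available_onnx_providers() -> list[str]:
    """Return ONNX Runtime execution providers, or [] if no runtime is installed."""
    if importlib.util.find_spec("onnxruntime") is None:
        return []
    import onnxruntime as ort

    return list(ort.get_available_providers())


providers = available_onnx_providers()
if providers:
    print("ONNX Runtime providers:", providers)
else:
    print("No ONNX Runtime found; install one of the optionals above.")
```

A GPU build typically lists a provider such as `CUDAExecutionProvider` or `OpenVINOExecutionProvider` alongside `CPUExecutionProvider`.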

Usage

cells2table extracts only structural information from tables. Another library is needed to extract the content of the cells.
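
To illustrate how structure and content can be combined, here is a minimal sketch that assigns OCR words to detected cells by checking which cell contains each word's center. The data layout (cells keyed by `(row, column)`, words as `(text, box)` pairs) and the function `fill_cells` are assumptions made for this example, not the cells2table output format.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)
CellKey = Tuple[int, int]  # (row, column)


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x0, y0, x1, y1 = box
    return (x0 + x1) / 2, (y0 + y1) / 2


def fill_cells(cells: Dict[CellKey, Box], words: List[Tuple[str, Box]]) -> Dict[CellKey, str]:
    """Assign each OCR word to the first cell that contains the word's center."""
    texts: Dict[CellKey, List[str]] = {key: [] for key in cells}
    for text, word_box in words:
        cx, cy = center(word_box)
        for key, (x0, y0, x1, y1) in cells.items():
            if x0 <= cx < x1 and y0 <= cy < y1:
                texts[key].append(text)
                break  # a word belongs to at most one cell
    return {key: " ".join(parts) for key, parts in texts.items()}


# Hypothetical structure output (cells) and OCR output (words).
cells = {(0, 0): (0, 0, 50, 20), (0, 1): (50, 0, 100, 20)}
words = [
    ("Name", (5, 5, 30, 15)),
    ("Age", (55, 5, 75, 15)),
    ("(years)", (76, 5, 98, 15)),
]
print(fill_cells(cells, words))
```

Words whose center falls outside every cell are dropped in this sketch; a real pipeline would need to decide how to handle them.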

Docling

A docling plugin is provided so that cells2table can be integrated into a complete document-conversion pipeline.

Usage example:

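from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

# Let docling load external plugins and use cells2table for table structure.
pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,
    table_structure_options=CustomDoclingTableStructureOptions(),
)

# Apply the same pipeline options to PDF and image inputs.
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: PdfFormatOption(pipeline_options=pipeline_options),
    }
)

# Convert a document and print it as Markdown, tables included.
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())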
