Table image parsing with cell detection models

cells2table

Parsing tables in document images with cell detection models

Implemented pipelines

PaddlePaddle

  • Classification model (wired / wireless)
  • Cell detection model with different weights for each class

Uses ONNX weights downloaded automatically from Hugging Face on first use.
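The two-stage flow described above (classify the table first, then run the cell detector whose weights match that class) can be sketched as follows. This is only an illustration of the dispatch idea: `TwoStagePipeline`, `classify`, and `detectors` are hypothetical names, not the cells2table API, and the models are replaced by stub functions.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A cell bounding box as (x0, y0, x1, y1). Hypothetical type for illustration.
Box = tuple[float, float, float, float]


@dataclass
class TwoStagePipeline:
    """Hypothetical sketch: route an image to a class-specific detector."""

    # Returns the table class, assumed here to be "wired" or "wireless".
    classify: Callable[[object], str]
    # One detector per table class, standing in for per-class weights.
    detectors: dict[str, Callable[[object], Sequence[Box]]]

    def __call__(self, image: object) -> tuple[str, list[Box]]:
        table_type = self.classify(image)
        cells = list(self.detectors[table_type](image))
        return table_type, cells


# Stub models so the sketch runs without any weights.
pipeline = TwoStagePipeline(
    classify=lambda image: "wired",
    detectors={
        "wired": lambda image: [(0, 0, 10, 10)],
        "wireless": lambda image: [],
    },
)
table_type, cells = pipeline("fake-image")
print(table_type, cells)
```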

Installation

With uv, add the package to your project:

uv add cells2table

ONNX models need an ONNX Runtime installed to run. You can install one on your own or use one of the optionals already configured.

Optional       Description
docling        For docling usage
huggingface    For downloading models
onnx_cuda      For NVIDIA GPUs
onnx_openvino  For Intel GPUs and CPUs
onnx_cpu       Default CPU runtime
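To check which runtime ended up installed, a small script like the one below may help. It is a sketch that relies only on the public `onnxruntime.get_available_providers()` call. The helper name `available_onnx_providers` is made up for this example, and which provider each optional enables (for instance `CUDAExecutionProvider` for onnx_cuda) is an assumption rather than something this README states.

```python
import importlib.util


def available_onnx_providers() -> list[str]:
    """Return the installed ONNX Runtime's execution providers, or [] if none is installed."""
    if importlib.util.find_spec("onnxruntime") is None:
        return []
    import onnxruntime as ort

    return list(ort.get_available_providers())


providers = available_onnx_providers()
if not providers:
    print("No ONNX Runtime found; install one yourself or pick one of the optionals.")
else:
    print("Available execution providers:", providers)
```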

Usage

cells2table only extracts structural information from tables. Another library is needed to extract the content of the cells.
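As a rough illustration of how structure and content can be combined, the sketch below assigns words from some OCR step to detected cells by checking which cell contains each word's center. Everything here is hypothetical: the `Word` type, the `fill_cells` helper, and the box format `(x0, y0, x1, y1)` are assumptions for the example, not the output format of cells2table or of any particular OCR library.

```python
from dataclasses import dataclass

# Bounding box as (x0, y0, x1, y1). Assumed format for illustration only.
Box = tuple[float, float, float, float]


@dataclass
class Word:
    """A word produced by some external OCR step (hypothetical type)."""

    text: str
    box: Box


def center(box: Box) -> tuple[float, float]:
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def fill_cells(cells: list[Box], words: list[Word]) -> list[str]:
    """Join, per cell, the words whose center falls inside that cell."""
    contents: list[list[str]] = [[] for _ in cells]
    for word in words:
        cx, cy = center(word.box)
        for i, (x0, y0, x1, y1) in enumerate(cells):
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                contents[i].append(word.text)
                break  # each word goes to the first matching cell only
    return [" ".join(parts) for parts in contents]


cells = [(0, 0, 50, 20), (50, 0, 100, 20)]
words = [Word("Name", (5, 5, 30, 15)), Word("Age", (60, 5, 80, 15))]
print(fill_cells(cells, words))  # prints ['Name', 'Age']
```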

Docling

A docling plugin is provided so that cells2table can be integrated into a complete document conversion pipeline.

Usage example:

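from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

# External plugins must be allowed so docling can load the cells2table plugin.
pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,
    table_structure_options=CustomDoclingTableStructureOptions(),
)

# Use the same pipeline options for PDF and image inputs.
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: PdfFormatOption(pipeline_options=pipeline_options),
    }
)

# Convert a document and print it as Markdown, tables included.
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())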
