Table image parsing with cell detection models

cells2table

Parsing tables in document images with cell detection models

Implemented pipelines

PaddlePaddle

  • Classification model (wired / wireless)
  • Cell detection model with different weights for each class

Uses ONNX weights downloaded automatically from Hugging Face on first use.
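The two-stage flow listed above (classify the table, then run the cell detector whose weights match that class) can be sketched as follows. This is only an illustration of the routing idea, not the library's code: `detect_cells`, `fake_classifier`, and the lambda detectors are hypothetical stand-ins, and the image is a plain dict so the sketch runs without any model.

```python
from typing import Callable, Dict, List, Tuple

# (x0, y0, x1, y1) in pixels; this box layout is an assumption for the sketch.
Box = Tuple[float, float, float, float]


def detect_cells(
    image: object,
    classify: Callable[[object], str],
    detectors: Dict[str, Callable[[object], List[Box]]],
) -> Tuple[str, List[Box]]:
    """Classify the table image, then run the detector registered for that class."""
    label = classify(image)
    if label not in detectors:
        raise ValueError(f"no cell detector registered for class {label!r}")
    return label, detectors[label](image)


# Hypothetical stand-ins for the real ONNX models.
def fake_classifier(image: dict) -> str:
    return "wired" if image.get("has_ruling_lines") else "wireless"


detectors = {
    "wired": lambda image: [(0, 0, 10, 10)],
    "wireless": lambda image: [(0, 0, 5, 5), (5, 0, 10, 5)],
}

label, cells = detect_cells({"has_ruling_lines": True}, fake_classifier, detectors)
print(label, cells)
```

Keeping one detector per class in a dictionary means a new table class only needs a new entry, and the routing function stays unchanged.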

Installation

With uv, add it to your project:

uv add cells2table

ONNX models need an ONNX Runtime installed in order to run. You can install one yourself or use one of the preconfigured optional dependencies.

Optional        Description
docling         For docling usage
huggingface     For downloading models
onnx_cuda       For NVIDIA GPUs
onnx_openvino   For Intel GPUs and CPUs
onnx_cpu        Default CPU runtime
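To check which runtime an environment actually ended up with, a small sketch like the one below may help. It relies on `onnxruntime.get_available_providers()`, which is part of ONNX Runtime; the preference order and the helper names `pick_providers` and `installed_providers` are assumptions made for illustration and say nothing about how cells2table chooses a provider internally.

```python
from typing import Iterable, List

# Assumed preference order: GPU first, then OpenVINO, then plain CPU.
PREFERRED = [
    "CUDAExecutionProvider",
    "OpenVINOExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available: Iterable[str]) -> List[str]:
    """Return the preferred providers that are present, in preference order."""
    present = set(available)
    return [name for name in PREFERRED if name in present]


def installed_providers() -> List[str]:
    """List the providers of the installed ONNX Runtime, or [] if none is installed."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return list(ort.get_available_providers())


print(pick_providers(installed_providers()))
```

An empty list means no ONNX Runtime is importable, in which case one of the optional dependencies above still has to be installed.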

Usage

cells2table only extracts structural information from tables. Another library is needed to extract the content of the cells.
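One way the structure and the content can be combined is to assign each recognized word to the cell whose box contains the word's center. The sketch below assumes cells and words both arrive as `(x0, y0, x1, y1)` pixel boxes; the `Cell` class and `fill_cells` are hypothetical names and not part of cells2table, and the word list stands in for the output of whatever OCR or text-extraction library is used.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


@dataclass
class Cell:
    row: int
    col: int
    box: Box
    words: List[str] = field(default_factory=list)


def center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def contains(box: Box, point: Tuple[float, float]) -> bool:
    x0, y0, x1, y1 = box
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1


def fill_cells(cells: List[Cell], words: List[Tuple[str, Box]]) -> List[Cell]:
    """Append each word to the first cell containing the word's center point."""
    for text, word_box in words:
        point = center(word_box)
        for cell in cells:
            if contains(cell.box, point):
                cell.words.append(text)
                break  # a word belongs to at most one cell
    return cells


cells = [Cell(0, 0, (0, 0, 50, 20)), Cell(0, 1, (50, 0, 100, 20))]
words = [("Name", (5, 5, 30, 15)), ("Qty", (60, 5, 80, 15))]
filled = fill_cells(cells, words)
print([(c.row, c.col, " ".join(c.words)) for c in filled])
# prints [(0, 0, 'Name'), (0, 1, 'Qty')]
```

Matching on the center point rather than on full containment tolerates words whose boxes slightly overlap a cell border.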

Docling

A docling plugin is provided to allow integrating cells2table in a complete pipeline.

Usage example:

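# docling building blocks used by the pipeline
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

# Enable external plugins so docling can load the cells2table table-structure model
pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,
    table_structure_options=CustomDoclingTableStructureOptions(),
)

# Apply the same pipeline options to PDF and image inputs
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: PdfFormatOption(pipeline_options=pipeline_options),
    }
)

# Convert a document and print it as Markdown, tables included
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())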
