Table image parsing with cell detection models
cells2table
Parsing tables in document images with cell detection models
Implemented pipelines
PaddlePaddle
- Classification model (wired / wireless)
- Cell detection model with different weights for each class
Uses ONNX weights downloaded automatically from Hugging Face on first use.
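The two-stage flow listed above (classify the table, then run the detector whose weights match that class) can be sketched as follows. This is only an illustration under stated assumptions: `parse_table`, `classify`, and `detectors` are hypothetical names, and the stand-in callables replace the real ONNX models; none of this is cells2table's actual API.

```python
from typing import Callable, Dict, List, Tuple

# (x0, y0, x1, y1) bounding box; a hypothetical representation for this sketch.
Box = Tuple[float, float, float, float]


def parse_table(
    image: object,
    classify: Callable[[object], str],
    detectors: Dict[str, Callable[[object], List[Box]]],
) -> Tuple[str, List[Box]]:
    """Classify the table image, then run the detector registered for that class."""
    table_type = classify(image)      # expected to return "wired" or "wireless"
    detect = detectors[table_type]    # one set of weights per class
    return table_type, detect(image)


# Stand-in models so the sketch runs without any weights (assumptions, not real models).
def fake_classifier(image: object) -> str:
    return "wired"


fake_detectors = {
    "wired": lambda image: [(0.0, 0.0, 50.0, 20.0), (50.0, 0.0, 100.0, 20.0)],
    "wireless": lambda image: [(0.0, 0.0, 100.0, 20.0)],
}

table_type, cells = parse_table(None, fake_classifier, fake_detectors)
print(table_type, len(cells))
```

Keeping the detectors in a dictionary keyed by class name means the classifier's output alone decides which weights are used.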
Installation
With uv, add to your project with:
uv add cells2table
ONNX models need an ONNX Runtime build installed to run. You can install one yourself or use one of the preconfigured optional dependencies below.
| Optional | Description |
|---|---|
| docling | For docling usage |
| huggingface | For downloading models |
| onnx_cuda | For NVIDIA GPUs |
| onnx_openvino | For Intel GPUs and CPUs |
| onnx_cpu | Default CPU runtime |
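To check which ONNX Runtime build ended up in your environment, a small script like the one below may help. It relies on `onnxruntime.get_available_providers()`, which is part of ONNX Runtime itself. The preference order and the `pick_provider` helper are assumptions made for this sketch and do not describe how cells2table selects a provider internally.

```python
import importlib.util

# Preference order is an assumption for illustration, not cells2table's own logic.
PREFERRED_PROVIDERS = [
    "CUDAExecutionProvider",      # NVIDIA GPUs
    "OpenVINOExecutionProvider",  # Intel GPUs and CPUs
    "CPUExecutionProvider",       # default CPU runtime
]


def pick_provider(available, preferred=PREFERRED_PROVIDERS):
    """Return the first preferred provider that is available, or None."""
    for name in preferred:
        if name in available:
            return name
    return None


def available_providers():
    """List ONNX Runtime execution providers, or [] if onnxruntime is not installed."""
    if importlib.util.find_spec("onnxruntime") is None:
        return []
    import onnxruntime as ort

    return ort.get_available_providers()


if __name__ == "__main__":
    providers = available_providers()
    print("available:", providers)
    print("selected:", pick_provider(providers))
```

An empty list means no ONNX Runtime package was found, in which case one of the `onnx_*` optionals above still needs to be installed.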
Usage
cells2table only extracts structural information from tables. Another library is needed to extract the content of the cells.
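As a rough picture of what "another library" has to contribute, the sketch below merges cell geometry with OCR words by assigning each word to the cell that contains its centre point. The `Cell` dataclass, the `(text, box)` word tuples, and both helper functions are hypothetical and are not cells2table's output types; a real pipeline would use whatever structures the structure and OCR libraries return.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Cell:
    """Hypothetical structural cell: grid position plus bounding box."""
    row: int
    col: int
    box: Box
    words: List[str] = field(default_factory=list)


def fill_cells(cells: List[Cell], ocr_words: List[Tuple[str, Box]]) -> List[Cell]:
    """Attach each OCR word to the first cell containing the word's centre."""
    for text, (x0, y0, x1, y1) in ocr_words:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        for cell in cells:
            cx0, cy0, cx1, cy1 = cell.box
            if cx0 <= cx <= cx1 and cy0 <= cy <= cy1:
                cell.words.append(text)
                break
    return cells


def to_grid(cells: List[Cell]) -> List[List[str]]:
    """Lay the filled cells out as a row-major grid of strings."""
    n_rows = max(c.row for c in cells) + 1
    n_cols = max(c.col for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c.row][c.col] = " ".join(c.words)
    return grid


cells = [Cell(0, 0, (0, 0, 50, 20)), Cell(0, 1, (50, 0, 100, 20))]
ocr_words = [("Name", (5, 5, 30, 15)), ("Age", (60, 5, 80, 15))]
grid = to_grid(fill_cells(cells, ocr_words))
print(grid)
```

Centre-point matching is a deliberately simple choice; words that straddle a cell border may need an overlap-based rule instead.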
Docling
A docling plugin is provided so that cells2table can be integrated into a complete document-conversion pipeline.
Usage example:
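from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,  # lets docling load third-party plugins such as cells2table
    table_structure_options=CustomDoclingTableStructureOptions(),
)
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: PdfFormatOption(pipeline_options=pipeline_options),
    }
)
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())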