Segmentation with D-FINE for kraken

D-FINE for document layout analysis

This is an adaptation/refactor of ArgoHA's implementation of the D-FINE object detector, intended to bring it in line with the standards of the (historical) document layout analysis community. It is a foundation for eventual integration into the kraken ATR engine.

Installation

Kraken is currently being rewritten to allow new methods, such as the one in this repository, to be integrated through plug-ins. This repository uses the architecture introduced by this rework, which will eventually become kraken 7.0. Models produced by this repository will not be compatible with earlier kraken versions.

For the latest stable release run:

$ pip install dfine_kraken

To install from source instead, clone the repository and run:

$ pip install .

This will install a dfine command that can be used to train models:

$ dfine ... train ...
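
As a quick sanity check after installation, the following minimal Python sketch reports whether the distribution and the command line tool are visible in the current environment. It relies only on the standard library; the function name `installation_status` is hypothetical, and the distribution name `dfine_kraken` and command name `dfine` are taken from the commands above.

```python
import shutil
from importlib import metadata


def installation_status(distribution="dfine_kraken", command="dfine"):
    """Report the installed version and the location of the CLI, if any."""
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        version = None
    # shutil.which returns None when the command is not on PATH.
    return {"version": version, "command_path": shutil.which(command)}


print(installation_status())
```

If either value is `None`, the package was most likely installed into a different environment than the one currently active.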

Training

The basic syntax is very similar to kraken segmentation training using PAGE or ALTO XML files. During the rework, many of the segmentation dataset filtering and transformation options have disappeared and have been replaced by dictionaries mapping class labels to indices. Because defining mappings on the command line is cumbersome, any mapping that does not assign exactly one index to each class in the source data needs to be defined in a YAML experiment configuration file.
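
To make the idea of a class mapping concrete, the Python sketch below contrasts a one-index-per-class mapping with a custom mapping that merges two source classes into one index. It is an illustration only: the label names and the helpers `build_default_mapping` and `invert_mapping` are hypothetical and do not reflect this package's configuration schema. Consult the sample configuration file for the actual YAML keys.

```python
from collections import defaultdict


def build_default_mapping(labels):
    """Assign one index to each distinct class label, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def invert_mapping(mapping):
    """Group source labels by the index they are mapped to."""
    merged = defaultdict(list)
    for label, idx in mapping.items():
        merged[idx].append(label)
    return {idx: sorted(names) for idx, names in merged.items()}


# Hypothetical labels as they might occur in PAGE/ALTO region annotations.
source_labels = ["MainZone", "MarginTextZone", "TitlePageZone", "MainZone"]

# One index per class: the simple case.
default_mapping = build_default_mapping(source_labels)

# A custom mapping that merges two source classes into a single index.
# A mapping like this is the kind that would go in a configuration file.
merged_mapping = {"MainZone": 0, "TitlePageZone": 0, "MarginTextZone": 1}

print(default_mapping)
print(invert_mapping(merged_mapping))
```

Inverting the mapping is a convenient way to check which source classes end up sharing an index before starting a training run.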

To train a basic model for 50 epochs from scratch:

$ dfine -d cuda:0 train *.xml

The default configuration trains lines and regions jointly. If this is not what you want, take a look at the sample configuration file to see how to disable text line detection in bbox format.

Inference

Inference is integrated in kraken. You need to convert the checkpoint into weights first:

$ ketos convert -o dfine.safetensors checkpoint.ckpt

and then run kraken ... segment ... as usual:

$ kraken -i input.jpg out.xml -a segment -i dfine.safetensors
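
When several pages need to be segmented, the two steps above can be scripted. The following Python sketch builds the same `ketos convert` and `kraken ... segment` invocations shown above and, by default, only prints them. The helper names, the image file names, and the choice to name each output after its input image are assumptions made for illustration; the flags simply mirror the commands above.

```python
import subprocess
from pathlib import Path


def convert_command(checkpoint, weights):
    """Argument list for converting a training checkpoint into weights."""
    return ["ketos", "convert", "-o", str(weights), str(checkpoint)]


def segment_command(image, weights):
    """Argument list for segmenting one image into an XML file beside it."""
    image = Path(image)
    output = image.with_suffix(".xml")
    return ["kraken", "-i", str(image), str(output), "-a",
            "segment", "-i", str(weights)]


def run_all(checkpoint, weights, images, dry_run=True):
    """Build all commands; execute them only when dry_run is False."""
    commands = [convert_command(checkpoint, weights)]
    commands += [segment_command(img, weights) for img in images]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if a command fails.
            subprocess.run(cmd, check=True)
    return commands


commands = run_all("checkpoint.ckpt", "dfine.safetensors",
                   ["page_001.jpg", "page_002.jpg"])
```

Passing `dry_run=False` executes the commands in order, so the conversion runs once before any page is segmented.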

Pretrained Models

Pretrained region segmentation models trained on the LADaS dataset using the SegmOnto taxonomy (37 region types) are available for all model variants. They can be downloaded with kraken get:

Variant      Size     mAP@50  Download command
nano         ~15 MB   0.2960  kraken get 10.5281/zenodo.18715384
small        ~42 MB   0.3864  kraken get 10.5281/zenodo.18715381
medium       ~79 MB   0.3915  kraken get 10.5281/zenodo.18715373
large        ~126 MB  0.4308  kraken get 10.5281/zenodo.18715364
extra_large  ~252 MB  0.4164  kraken get 10.5281/zenodo.18715367

The large variant achieves the highest mAP@50 (0.4308). The extra_large variant regresses slightly (0.4164), likely due to overfitting, so the large model is recommended for best accuracy.

For example, to download the medium variant and use it for segmentation:

$ kraken get 10.5281/zenodo.18715373
Processing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 79.1/79.1 MB 0:00:00 0:00:02
Model dir: /home/mittagessen/.local/share/htrmopo/d58d541d-82cf-5ab2-b27b-a20fe2548229 (model files: ladas_m.safetensors)
$ kraken -i input.jpg out.xml -a segment -i ladas_m.safetensors
...
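
The recommendation for the large variant can be checked directly against the table of pretrained models. The short Python sketch below copies the table values, picks the variant with the highest mAP@50, and prints the matching download command. The dictionary layout and the helper `best_variant` are illustrative only.

```python
# Values copied from the table of pretrained models above.
variants = {
    "nano": {"size_mb": 15, "map50": 0.2960, "doi": "10.5281/zenodo.18715384"},
    "small": {"size_mb": 42, "map50": 0.3864, "doi": "10.5281/zenodo.18715381"},
    "medium": {"size_mb": 79, "map50": 0.3915, "doi": "10.5281/zenodo.18715373"},
    "large": {"size_mb": 126, "map50": 0.4308, "doi": "10.5281/zenodo.18715364"},
    "extra_large": {"size_mb": 252, "map50": 0.4164, "doi": "10.5281/zenodo.18715367"},
}


def best_variant(table):
    """Return the name of the variant with the highest mAP@50."""
    return max(table, key=lambda name: table[name]["map50"])


best = best_variant(variants)
# How far extra_large falls behind the best variant, in mAP@50 points.
gap = round(variants[best]["map50"] - variants["extra_large"]["map50"], 4)
download = f"kraken get {variants[best]['doi']}"

print(best, gap)
print(download)
```

The same table can be filtered by `size_mb` first when disk space or memory is the limiting factor rather than accuracy.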
