Segmentation with D-FINE for kraken

D-FINE for document layout analysis

This is an adaptation/refactor of ArgoHA's implementation of the D-FINE object detector, intended to bring it in line with the standards of the (historical) document layout analysis community. It is a foundation for eventual integration into the kraken ATR engine.

Installation

Kraken is currently being rewritten to allow integration of new methods, such as the one in this repository, with plug-ins. This repository uses the architecture introduced by this rework which will eventually become kraken 7.0. The models produced by this repository are not going to be compatible with earlier kraken versions.

For the latest stable release run:

$ pip install dfine_kraken

To install from source instead, clone the repository and run:

$ pip install .

This will install a dfine command that can be used to train models:

$ dfine ... train ...

Training

The basic syntax is very similar to kraken segmentation training using PAGE or ALTO XML files. During the rework, many of the segmentation dataset filtering and transformation options were removed and replaced by dictionaries mapping class labels to indices. As it is cumbersome to define mappings on the command line, any mapping that does not assign one index to each class in the source data needs to be defined in a YAML experiment configuration file.
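As a rough illustration of what such a class mapping expresses, the sketch below uses a plain Python dictionary. The class labels and the helpers `is_one_to_one` and `classes_per_index` are hypothetical and not part of dfine_kraken; the actual YAML schema is the one shown in the sample configuration file.

```python
# Hypothetical class labels for illustration; real labels come from the
# regions and lines found in your PAGE or ALTO XML files.
class_mapping = {
    "MainZone": 0,
    "MarginTextZone": 1,
    "NumberingZone": 1,  # shares index 1, i.e. merged with MarginTextZone
    "GraphicZone": 2,
}


def is_one_to_one(mapping):
    """Return True if every class label has its own distinct index."""
    return len(set(mapping.values())) == len(mapping)


def classes_per_index(mapping):
    """Invert the mapping: index -> sorted list of source class labels."""
    inverted = {}
    for label, index in mapping.items():
        inverted.setdefault(index, []).append(label)
    return {index: sorted(labels) for index, labels in inverted.items()}


print(is_one_to_one(class_mapping))
print(classes_per_index(class_mapping))
```

Because two labels share index 1 here, this mapping is not one index per class, which is the case that has to go into a YAML experiment configuration file rather than onto the command line.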

To train a basic model for 50 epochs from scratch:

$ dfine -d cuda:0 train *.xml

The default configuration trains lines and regions jointly. If this is not what you want, take a look at the sample configuration file to see how to disable text line detection in bbox format.

Inference

Inference is integrated in kraken. You need to convert the checkpoint into weights first:

$ ketos convert -o dfine.safetensors checkpoint.ckpt

and then run kraken ... segment ... as usual:

$ kraken -i input.jpg out.xml -a segment -i dfine.safetensors
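To segment many pages, the same call can be generated once per image. The following is a minimal sketch that only assembles the argument list in the shape of the command above; the helper name `build_segment_command` and the `pages` and `out` directory names are assumptions made for illustration.

```python
from pathlib import Path


def build_segment_command(image, model, out_dir):
    """Build the argument list for one segmentation call.

    Mirrors the documented invocation:
    kraken -i input.jpg out.xml -a segment -i dfine.safetensors
    """
    image = Path(image)
    # Write the output next to the other results, named after the image.
    output = Path(out_dir) / (image.stem + ".xml")
    return [
        "kraken", "-i", str(image), str(output),
        "-a", "segment", "-i", str(model),
    ]


# Hypothetical input path; nothing is executed here, the command is only printed.
cmd = build_segment_command("pages/page_001.jpg", "dfine.safetensors", "out")
print(" ".join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would run the calls one after another and stop at the first failure.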

Pretrained Models

Pretrained region segmentation models trained on the LADaS dataset using the SegmOnto taxonomy (37 region types) are available for all model variants. They can be downloaded with kraken get:

Variant      Size     mAP@50  Download command
nano         ~15 MB   0.2960  kraken get 10.5281/zenodo.18715384
small        ~42 MB   0.3864  kraken get 10.5281/zenodo.18715381
medium       ~79 MB   0.3915  kraken get 10.5281/zenodo.18715373
large        ~126 MB  0.4308  kraken get 10.5281/zenodo.18715364
extra_large  ~252 MB  0.4164  kraken get 10.5281/zenodo.18715367

The large variant achieves the best mAP@50 (0.4308). The extra_large variant regresses slightly (0.4164), likely due to overfitting, so the large model is recommended when accuracy matters most.

$ kraken get 10.5281/zenodo.18715373
Processing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 79.1/79.1 MB 0:00:00 0:00:02
Model dir: /home/mittagessen/.local/share/htrmopo/d58d541d-82cf-5ab2-b27b-a20fe2548229 (model files: ladas_m.safetensors)
$ kraken -i input.jpg out.xml -a segment -i ladas_m.safetensors
...
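When model size is a constraint, the trade-off in the table above can be made explicit. The sketch below copies the sizes and mAP@50 values from that table and picks the most accurate variant under an optional size budget; the helper `best_variant` is hypothetical and not part of kraken or dfine_kraken.

```python
# (variant, approximate size in MB, mAP@50), copied from the table above.
VARIANTS = [
    ("nano", 15, 0.2960),
    ("small", 42, 0.3864),
    ("medium", 79, 0.3915),
    ("large", 126, 0.4308),
    ("extra_large", 252, 0.4164),
]


def best_variant(max_size_mb=None):
    """Return the variant with the highest mAP@50 within the size budget.

    With no budget, all variants are considered. Returns None if no
    variant fits.
    """
    candidates = [
        v for v in VARIANTS
        if max_size_mb is None or v[1] <= max_size_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v[2])[0]


print(best_variant())     # no size limit
print(best_variant(100))  # at most ~100 MB
```

These numbers are specific to the LADaS evaluation reported above, so the ranking may differ on other material.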
