Segmentation with D-FINE for kraken

D-FINE for document layout analysis

This is an adaptation/refactor of ArgoHA's implementation of the D-FINE object detector, intended to bring it in line with the standards of the (historical) document layout analysis community. It is a foundation for eventual integration into the kraken ATR engine.

Installation

Kraken is currently being rewritten to allow new methods, such as the one in this repository, to be integrated through plug-ins. This repository uses the architecture introduced by this rework, which will eventually become kraken 7.0. Models produced by this repository will not be compatible with earlier kraken versions.

For the latest stable release run:

$ pip install dfine_kraken

To install from source instead, clone the repository and run:

$ pip install .

This will install a dfine command that can be used to train models:

$ dfine ... train ...

Training

The basic syntax is very similar to kraken segmentation training using PAGE or ALTO XML files. During the rework, many of the segmentation dataset filtering and transformation options were removed and replaced by dictionaries mapping class labels to indices. Because such mappings are awkward to define on the command line, any mapping that does not assign exactly one index to each class in the source data needs to be defined in a YAML experiment configuration file.
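
To make the idea of a many-to-one class mapping concrete, here is a minimal Python sketch. It only illustrates the concept: the label names, the `class_mapping` dictionary, and the `remap_labels` and `is_one_to_one` helpers are all hypothetical and do not reflect the actual YAML schema, for which the sample configuration file in the repository remains the reference. Dropping unmapped labels is likewise an assumption of this sketch, not documented behaviour.

```python
# Hypothetical mapping: several source labels collapse onto one class index.
# The label names are illustrative and not taken from any particular dataset.
class_mapping = {
    "MainZone": 0,
    "MarginTextZone": 1,
    "NumberingZone": 1,  # merged into the same class as MarginTextZone
    "GraphicZone": 2,
}


def remap_labels(labels, mapping):
    """Map source labels to class indices, dropping labels absent from the mapping."""
    return [mapping[label] for label in labels if label in mapping]


def is_one_to_one(mapping):
    """Return True if every class label has its own distinct index."""
    return len(set(mapping.values())) == len(mapping)


page_labels = ["MainZone", "NumberingZone", "StampZone", "GraphicZone"]
print(remap_labels(page_labels, class_mapping))  # [0, 1, 2]
print(is_one_to_one(class_mapping))              # False
```

A mapping for which `is_one_to_one` returns False, like the one above, is the kind that would have to live in a configuration file rather than on the command line.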

To train a basic model for 50 epochs from scratch:

$ dfine -d cuda:0 train *.xml

The default configuration trains lines and regions jointly. If this is not what you want, see the sample configuration file for how to disable text line detection in bbox format.

Inference

Inference is integrated in kraken. You need to convert the checkpoint into weights first:

$ ketos convert -o dfine.safetensors checkpoint.ckpt

and then run kraken ... segment ... as usual:

$ kraken -i input.jpg out.xml -a segment -i dfine.safetensors
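
When more than one page needs to be segmented, the invocation above can be scripted. The following Python sketch wraps exactly that command line and nothing more. The helper names `build_segment_command` and `segment_directory`, the `*.jpg` glob, and the convention of writing `page.xml` next to `page.jpg` are assumptions made for illustration, not part of kraken or this package.

```python
import subprocess
from pathlib import Path


def build_segment_command(image, model="dfine.safetensors"):
    """Build the kraken invocation shown above for a single image."""
    image = Path(image)
    output = image.with_suffix(".xml")  # page.jpg -> page.xml (assumed naming)
    return ["kraken", "-i", str(image), str(output), "-a", "segment", "-i", model]


def segment_directory(directory, model="dfine.safetensors", dry_run=True):
    """Segment every .jpg file in a directory, one kraken call per image."""
    images = sorted(Path(directory).glob("*.jpg"))
    commands = [build_segment_command(path, model) for path in images]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))  # only show what would be executed
        else:
            subprocess.run(cmd, check=True)  # raises if kraken exits non-zero
    return commands


print(build_segment_command("input.jpg"))
```

With `dry_run=True` (the default here) nothing is executed, which makes it easy to check the generated commands before starting a long run.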

Pretrained Models

Pretrained region segmentation models trained on the LADaS dataset using the SegmOnto taxonomy (37 region types) are available for all model variants. They can be downloaded with kraken get:

Variant      Size      mAP@50  Download command
nano         ~15 MB    0.2960  kraken get 10.5281/zenodo.18715384
small        ~42 MB    0.3864  kraken get 10.5281/zenodo.18715381
medium       ~79 MB    0.3915  kraken get 10.5281/zenodo.18715373
large        ~126 MB   0.4308  kraken get 10.5281/zenodo.18715364
extra_large  ~252 MB   0.4164  kraken get 10.5281/zenodo.18715367

The large variant achieves the highest mAP@50 (0.4308). The extra_large variant regresses slightly (0.4164), likely due to overfitting, so the large model is recommended when accuracy matters most.
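
If model size is a constraint, the trade-off in the table above can be worked through mechanically. The sketch below copies the approximate sizes, mAP@50 values, and DOIs from the table and picks the most accurate variant under a size budget. The `VARIANTS` list and the `best_variant` helper are hypothetical conveniences, not part of this package.

```python
# (variant, approximate size in MB, mAP@50, Zenodo DOI), copied from the table above
VARIANTS = [
    ("nano", 15, 0.2960, "10.5281/zenodo.18715384"),
    ("small", 42, 0.3864, "10.5281/zenodo.18715381"),
    ("medium", 79, 0.3915, "10.5281/zenodo.18715373"),
    ("large", 126, 0.4308, "10.5281/zenodo.18715364"),
    ("extra_large", 252, 0.4164, "10.5281/zenodo.18715367"),
]


def best_variant(max_size_mb):
    """Return the highest-mAP@50 variant whose size fits the budget, or None."""
    candidates = [v for v in VARIANTS if v[1] <= max_size_mb]
    return max(candidates, key=lambda v: v[2], default=None)


for budget in (50, 100, 300):
    name, size, score, doi = best_variant(budget)
    print(f"<= {budget} MB: {name} (mAP@50 {score:.4f}), kraken get {doi}")
```

Even with no effective size limit the selection lands on large rather than extra_large, which matches the recommendation above.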

For example, to download the medium variant and segment a page with it:

$ kraken get 10.5281/zenodo.18715373
Processing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 79.1/79.1 MB 0:00:00 0:00:02
Model dir: /home/mittagessen/.local/share/htrmopo/d58d541d-82cf-5ab2-b27b-a20fe2548229 (model files: ladas_m.safetensors)
$ kraken -i input.jpg out.xml -a segment -i ladas_m.safetensors
...
