Segmentation with D-FINE for kraken

D-FINE for document layout analysis

This is an adaptation/refactor of ArgoHA's implementation of the D-FINE object detector, intended to bring it in line with the standards of the (historical) document layout analysis community. It is a foundation for eventual integration into the kraken ATR engine.

Installation

Kraken is currently being rewritten to allow new methods, such as the one in this repository, to be integrated through plug-ins. This repository uses the architecture introduced by this rework, which will eventually become kraken 7.0. Models produced by this repository will not be compatible with earlier kraken versions.

Clone the repository and run:

$ pip install .

This will install a dfine command that can be used to train models:

$ dfine ... train ...

Training

The basic syntax is very similar to kraken segmentation training using PAGE or ALTO XML files. During the rework many of the segmentation dataset filtering and transformation options have disappeared and have been replaced by dictionaries mapping class labels to indices. As it is annoying to define mappings on the command line, any mapping that does not assign exactly one index to each class in the source data needs to be defined in a YAML experiment configuration file.
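To illustrate what such a mapping expresses, the following Python sketch merges two source labels into one index, which is the case that has to go into a configuration file. The label names, the dictionary layout, and the helper functions are illustrative assumptions and not the configuration schema used by dfine; the sample configuration file remains the reference for the actual keys.

```python
# Hypothetical class mapping: label names and layout are assumptions for
# illustration only, not dfine's actual configuration schema.
region_class_mapping = {
    "MainZone": 0,
    "MarginTextZone": 1,
    "NumberingZone": 1,  # merged into the same index as MarginTextZone
    "GraphicZone": 2,
}


def is_one_to_one(mapping):
    """Return True if every class label is assigned its own index."""
    return len(set(mapping.values())) == len(mapping)


def merged_groups(mapping):
    """Group labels by index and keep only indices shared by several labels."""
    groups = {}
    for label, index in mapping.items():
        groups.setdefault(index, []).append(label)
    return {index: labels for index, labels in groups.items() if len(labels) > 1}


print(is_one_to_one(region_class_mapping))
print(merged_groups(region_class_mapping))
```

A mapping for which is_one_to_one returns False is the kind that, according to the paragraph above, belongs in a YAML experiment configuration file rather than on the command line.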

To train a basic model for 50 epochs from scratch:

$ dfine -d cuda:0 train *.xml

The default configuration trains lines and regions jointly. If this is not what you want, take a look at the sample configuration file to see how to disable text line detection in bbox format.

Inference

Inference is integrated in kraken. You need to convert the checkpoint into weights first:

$ ketos convert -o dfine.safetensors checkpoint.ckpt

and then run kraken ... segment ... as usual:

$ kraken -i input.jpg out.xml -a segment -i dfine.safetensors
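When the two steps need to be scripted, for example to process several checkpoints, the command lines can be assembled in Python. This is a minimal sketch that only reuses the arguments shown above; the function name inference_commands is hypothetical, and the commands are printed rather than executed.

```python
import shlex


def inference_commands(checkpoint, weights, image, output):
    """Build the argument lists for the conversion and segmentation steps.

    Hypothetical helper: it only rearranges the arguments from the two
    commands shown above and does not call kraken itself.
    """
    convert = ["ketos", "convert", "-o", weights, checkpoint]
    segment = ["kraken", "-i", image, output, "-a", "segment", "-i", weights]
    return convert, segment


convert, segment = inference_commands(
    "checkpoint.ckpt", "dfine.safetensors", "input.jpg", "out.xml"
)
for command in (convert, segment):
    print(shlex.join(command))
```

To actually run the steps, each list could be passed to subprocess.run(command, check=True), assuming ketos and kraken are on the PATH.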

Pretrained Models

Pretrained region segmentation models trained on the LADaS dataset using the SegmOnto taxonomy (37 region types) are available for all model variants. They can be downloaded with kraken get:

Variant      Size     mAP@50  Download command
nano         ~15 MB   0.2960  kraken get 10.5281/zenodo.18715384
small        ~42 MB   0.3864  kraken get 10.5281/zenodo.18715381
medium       ~79 MB   0.3915  kraken get 10.5281/zenodo.18715373
large        ~126 MB  0.4308  kraken get 10.5281/zenodo.18715364
extra_large  ~252 MB  0.4164  kraken get 10.5281/zenodo.18715367

The large variant achieves the best mAP@50 (0.4308). The extra_large variant regresses slightly (0.4164), likely due to overfitting, so the large model is recommended for best accuracy.

$ kraken get 10.5281/zenodo.18715373
Processing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 79.1/79.1 MB 0:00:00 0:00:02
Model dir: /home/mittagessen/.local/share/htrmopo/d58d541d-82cf-5ab2-b27b-a20fe2548229 (model files: ladas_m.safetensors)
$ kraken -i input.jpg out.xml -a segment -i ladas_m.safetensors
...
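The comparison of variants in the table above can be reproduced with a few lines of Python. The sketch below copies the approximate sizes and mAP@50 values from the table, picks the variant with the highest score, and prints the change in score and size between consecutive variants; the variable names are illustrative.

```python
# Approximate size in MB and mAP@50 per variant, copied from the table above.
variants = {
    "nano": (15, 0.2960),
    "small": (42, 0.3864),
    "medium": (79, 0.3915),
    "large": (126, 0.4308),
    "extra_large": (252, 0.4164),
}

# Variant with the highest mAP@50.
best = max(variants, key=lambda name: variants[name][1])
print(best)  # large

# Change in mAP@50 and size when moving to the next larger variant.
names = list(variants)
for smaller, larger in zip(names, names[1:]):
    size_delta = variants[larger][0] - variants[smaller][0]
    map_delta = variants[larger][1] - variants[smaller][1]
    print(f"{smaller} -> {larger}: {map_delta:+.4f} mAP@50 for {size_delta:+d} MB")
```

The last step, from large to extra_large, is the only one with a negative change in mAP@50, which matches the recommendation to prefer the large model.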
