Segmentation with D-FINE for kraken
D-FINE for document layout analysis
This is an adaptation/refactor of ArgoHA's implementation of the D-FINE object detector, intended to bring it in line with the standards of the (historical) document layout analysis community. It is a foundation for eventual integration into the kraken ATR engine.
Installation
Kraken is currently being rewritten to allow new methods, such as the one in this repository, to be integrated as plug-ins. This repository uses the architecture introduced by this rework, which will eventually become kraken 7.0. Models produced by this repository will not be compatible with earlier kraken versions.
For the latest stable release run:
$ pip install dfine_kraken
To install from source instead, clone the repository and run:
$ pip install .
This will install a dfine command that can be used to train models:
$ dfine ... train ...
Training
The basic syntax is very similar to kraken segmentation training using PAGE or ALTO XML files. During the rework, many of the segmentation dataset filtering and transformation options have disappeared and been replaced by dictionaries mapping class labels to indices. As it is cumbersome to define mappings on the command line, any mapping that does not assign exactly one index to each class in the source data needs to be defined in a YAML experiment configuration file.
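To make the idea of a label-to-index mapping concrete, here is a minimal Python sketch. The label names are only illustrative, the starting index of 0 is an assumption, and the helper `is_one_to_one` is hypothetical; it does not reproduce the actual YAML schema of the experiment configuration files.

```python
# Illustrative region labels; real labels come from the PAGE/ALTO XML files.
source_labels = ["MainZone", "MarginTextZone", "GraphicZone", "DropCapitalZone"]

# Default-style mapping: every class in the source data gets its own index.
# (Starting at 0 is an assumption made for this sketch.)
one_to_one = {label: idx for idx, label in enumerate(source_labels)}

# Custom mapping that merges several source classes into one index.
# A mapping like this is the kind that has to live in a configuration file.
merged = {
    "MainZone": 0,
    "MarginTextZone": 0,
    "GraphicZone": 1,
    "DropCapitalZone": 1,
}


def is_one_to_one(mapping):
    """Return True if no two labels share an index (hypothetical helper)."""
    return len(set(mapping.values())) == len(mapping)


print(is_one_to_one(one_to_one))
print(is_one_to_one(merged))
```

The first mapping is the simple case of one index per class; the second merges classes and therefore falls under the case that needs a configuration file.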
To train a basic model for 50 epochs from scratch:
$ dfine -d cuda:0 train *.xml
The default configuration trains lines and regions jointly. If this is not what you want, take a look at the sample configuration file to see how to disable text line detection in bbox format.
Inference
Inference is integrated in kraken. You need to convert the checkpoint into weights first:
$ ketos convert -o dfine.safetensors checkpoint.ckpt
and then run kraken ... segment ... as usual:
$ kraken -i input.jpg out.xml -a segment -i dfine.safetensors
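When several pages need to be segmented, the same invocation can be assembled programmatically. The following Python sketch only builds the argument list and reuses the flags exactly as they appear in the command above; the helper name `build_segment_command` and the page file names are hypothetical, and it assumes that kraken accepts repeated `-i input output` pairs as in current releases.

```python
import shlex
from pathlib import Path


def build_segment_command(images, model="dfine.safetensors"):
    """Build a kraken argument list that segments each image into an XML file."""
    argv = ["kraken"]
    for image in images:
        image = Path(image)
        # Each -i pair names an input image and the output file next to it.
        argv += ["-i", str(image), str(image.with_suffix(".xml"))]
    # Flags copied from the single-page command: -a, then segment with the weights.
    argv += ["-a", "segment", "-i", model]
    return argv


cmd = build_segment_command(["page_01.jpg", "page_02.jpg"])
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once kraken and the converted weights are available.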
Pretrained Models
Pretrained region segmentation models trained on the LADaS dataset using the SegmOnto taxonomy (37 region types) are available for all model variants. They can be downloaded with kraken get:
| Variant | Size | mAP@50 | Download command |
|---|---|---|---|
| nano | ~15 MB | 0.2960 | kraken get 10.5281/zenodo.18715384 |
| small | ~42 MB | 0.3864 | kraken get 10.5281/zenodo.18715381 |
| medium | ~79 MB | 0.3915 | kraken get 10.5281/zenodo.18715373 |
| large | ~126 MB | 0.4308 | kraken get 10.5281/zenodo.18715364 |
| extra_large | ~252 MB | 0.4164 | kraken get 10.5281/zenodo.18715367 |
The large variant achieves the highest mAP@50 (0.4308). The extra_large variant shows a slight regression (0.4164), likely due to overfitting, so the large model is recommended for best accuracy. For example, to download and run the medium variant:
$ kraken get 10.5281/zenodo.18715373
Processing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 79.1/79.1 MB 0:00:00 0:00:02
Model dir: /home/mittagessen/.local/share/htrmopo/d58d541d-82cf-5ab2-b27b-a20fe2548229 (model files: ladas_m.safetensors)
$ kraken -i input.jpg out.xml -a segment -i ladas_m.safetensors
...
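For readers unfamiliar with the mAP@50 column in the table above: assuming the usual convention, a predicted region counts as correct when its intersection over union (IoU) with a ground-truth region of the same class is at least 0.5. The sketch below computes IoU for two axis-aligned boxes; the box coordinates are made up, and whether this repository evaluates on boxes or polygons is not stated here, so treat it as an illustration of the threshold only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


# Made-up prediction and ground truth, shifted by 10 pixels horizontally.
pred = (10, 10, 110, 60)
truth = (20, 10, 120, 60)

score = iou(pred, truth)
is_match = score >= 0.5  # the "@50" threshold
print(round(score, 3), is_match)
```

Average precision is then computed per class from the matched and unmatched predictions, and mAP is the mean over classes.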