
htrflow is developed at Riksarkivet's AI-lab as an open-source package to simplify HTR

Reason this release was yanked:

This version of HTRflow may not include all needed dependencies.

Project description

htrflow


HTRFlow is an open source tool for handwritten text recognition. It is developed by the AI lab at the Swedish National Archives (Riksarkivet).

Docs

⚠️! Docs still under development


Installation

Package


Install with either uv or pip:

uv:

uv pip install htrflow[all]

pip:

pip install htrflow[all] 

This includes support for ultralytics and transformers models. If you also want to use openmmlab models, install the openmmlab extra as well:

pip install htrflow"[all,openmmlab]" 

Note that this forces torch to version 2.0.0, since openmmlab currently depends on it.
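To confirm which torch version ended up in your environment after installing the openmmlab extra, a small check like the following may help. It is only a sketch using the standard library; the helper name installed_version is made up for this example and is not part of HTRFlow.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


torch_version = installed_version("torch")
if torch_version is None:
    print("torch is not installed in this environment")
elif torch_version.startswith("2.0.0"):
    # Local builds may carry a suffix such as "+cu117", hence startswith().
    print(f"torch {torch_version} matches the version required by the openmmlab extra")
else:
    print(f"torch {torch_version} is installed; the openmmlab extra expects 2.0.0")
```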

From source

Requirements:

  • uv or pip
  • Python 3.10

Clone this repository and run

uv pip install -e .[all]

This installs the HTRFlow CLI and enables huggingface and ultralytics models in a virtual environment. To use openmmlab models such as RTMDet and Satrn as well, also run:

uv pip install -e ."[all,openmmlab]"

Now activate the virtual environment with

source .venv/bin/activate

Alternatively, if you are using uv, you can skip activation and prefix each command with uv run:

uv run <cmd>

The HTRFlow CLI is now available in the environment where you installed the package (the .venv created by uv, or your own environment if you used pip). Try it by running:

htrflow pipeline examples/pipelines/demo.yaml examples/images/pages

This command runs HTRFlow on the three example pages in examples/images/pages and writes the output as Page XML and Alto XML.

Usage

Once HTRFlow is installed, run it with:

htrflow pipeline <pipeline file> <input image(s)>
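If you want to start HTRFlow from a Python script rather than a shell, one option is to call the CLI through subprocess. The sketch below only assembles and runs the command shown above; the function names are hypothetical, and passing several inputs as separate arguments is an assumption based on the `<input image(s)>` placeholder.

```python
import subprocess
from typing import List


def build_htrflow_command(pipeline_file: str, inputs: List[str]) -> List[str]:
    """Build the argument list for `htrflow pipeline <pipeline file> <input image(s)>`."""
    if not inputs:
        raise ValueError("at least one input image or directory is required")
    return ["htrflow", "pipeline", pipeline_file, *inputs]


def run_htrflow(pipeline_file: str, inputs: List[str]) -> None:
    """Run the CLI; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_htrflow_command(pipeline_file, inputs), check=True)


command = build_htrflow_command(
    "examples/pipelines/demo.yaml", ["examples/images/pages"]
)
print(" ".join(command))
```

Calling run_htrflow() with the same arguments requires that htrflow is installed and on the PATH of the current environment.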

Pipelines

HTRFlow is configured with a pipeline file which describes what steps it should perform and which models it should use. Here is an example of a simple pipeline:

steps:
  - step: Segmentation
    settings:
      model: RTMDet
      model_settings:
        model: Riksarkivet/rtmdet_lines
  - step: TextRecognition
    settings:
      model: TrOCR
      model_settings:
        model: Riksarkivet/trocr-base-handwritten-swe
      generation_settings:
        num_beams: 1
  - step: RemoveLowTextConfidenceLines
    settings:
      threshold: 0.9
  - step: Export
    settings:
      dest: outputs/alto
      format: alto

This pipeline uses Riksarkivet/rtmdet_lines to detect the pages' text lines, then runs Riksarkivet/trocr-base-handwritten-swe to transcribe them, filters the text lines on their confidence score, and then exports the result to Alto XML.
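To make the structure of a pipeline file concrete, the sketch below writes the example pipeline as the nested Python dict a YAML parser would produce and prints one line per step. It does not use HTRFlow itself; the function summarize is a hypothetical helper for this illustration only.

```python
from typing import List

# The example pipeline, as the nested dict a YAML parser would produce.
pipeline = {
    "steps": [
        {
            "step": "Segmentation",
            "settings": {
                "model": "RTMDet",
                "model_settings": {"model": "Riksarkivet/rtmdet_lines"},
            },
        },
        {
            "step": "TextRecognition",
            "settings": {
                "model": "TrOCR",
                "model_settings": {"model": "Riksarkivet/trocr-base-handwritten-swe"},
                "generation_settings": {"num_beams": 1},
            },
        },
        {"step": "RemoveLowTextConfidenceLines", "settings": {"threshold": 0.9}},
        {"step": "Export", "settings": {"dest": "outputs/alto", "format": "alto"}},
    ]
}


def summarize(config: dict) -> List[str]:
    """Return one line per step: its position, name and the names of its settings."""
    summary = []
    for index, entry in enumerate(config["steps"], start=1):
        settings = entry.get("settings", {})  # a step may omit settings
        keys = ", ".join(settings) or "no settings"
        summary.append(f"{index}. {entry['step']} ({keys})")
    return summary


for line in summarize(pipeline):
    print(line)
```

The steps run in the order they are listed, so each entry in the summary corresponds to one stage of the description above.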

See the demo pipeline examples/pipelines/demo.yaml for a more complex pipeline.

Built-in pipeline steps

HTRFlow comes with several predefined pipeline steps out of the box. These include:

  • Inference, including text recognition and segmentation
  • Image preprocessing
  • Reading order detection
  • Filtering
  • Export
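As an illustration of what a filtering step does, here is a minimal sketch of confidence-based filtering in the spirit of RemoveLowTextConfidenceLines. It is not HTRFlow's implementation: the Line class is a hypothetical stand-in, the sample values are invented, and keeping lines whose score equals the threshold is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    """Hypothetical stand-in for a transcribed text line."""
    text: str
    confidence: float


def remove_low_confidence(lines: List[Line], threshold: float) -> List[Line]:
    """Keep lines whose confidence is at least `threshold` (boundary handling assumed)."""
    return [line for line in lines if line.confidence >= threshold]


# Invented sample data, for illustration only.
lines = [
    Line("first line", 0.97),
    Line("smudged line", 0.41),
    Line("third line", 0.90),
]
kept = remove_low_confidence(lines, threshold=0.9)
print([line.text for line in kept])
```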

Custom pipeline steps

You can define your own custom pipeline step by subclassing PipelineStep and defining the run() method. It takes a Collection and returns a Collection:

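# Import paths are assumed from the repository layout; adjust to your version.
from htrflow.pipeline.steps import PipelineStep
from htrflow.volume.volume import Collection

class MyPipelineStep(PipelineStep):
    """A custom pipeline step"""
    def run(self, collection: Collection) -> Collection:
        for page in collection:
            ...  # Do something with each page
        return collection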

You can add parameters to your pipeline step by also defining the __init__() method. It can take any number of arguments. Here, we add one argument, which can be accessed when the step is run:

class MyPipelineStep(PipelineStep):
    """A custom pipeline step"""
    def __init__(self, arg):
        self.arg = arg

    def run(self, collection: Collection) -> Collection:
        for page in collection:
            # Do something
            if self.arg:
                ...
        return collection

To use the pipeline step in a pipeline, add the following to your pipeline file:

steps:
  - step: MyPipelineStep
    settings: 
      arg: value

All key-value pairs listed under settings will be passed to the step's __init__() method. If the pipeline step doesn't need any arguments, you can omit settings.
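The mapping from settings to __init__() arguments can be pictured as plain keyword-argument unpacking. The sketch below uses stand-in classes so that it runs without HTRFlow installed; STEP_CLASSES and build_step are hypothetical names, and HTRFlow's actual step lookup may work differently.

```python
class PipelineStep:
    """Minimal stand-in for HTRFlow's PipelineStep base class."""

    def run(self, collection):
        raise NotImplementedError


class MyPipelineStep(PipelineStep):
    """A custom pipeline step with one parameter."""

    def __init__(self, arg):
        self.arg = arg

    def run(self, collection):
        return collection


# Hypothetical registry mapping step names in the pipeline file to classes.
STEP_CLASSES = {"MyPipelineStep": MyPipelineStep}


def build_step(entry: dict) -> PipelineStep:
    """Instantiate the step named in `entry`, passing its settings as keyword arguments."""
    step_class = STEP_CLASSES[entry["step"]]
    # A missing "settings" key means the step is created without arguments.
    return step_class(**entry.get("settings", {}))


step = build_step({"step": "MyPipelineStep", "settings": {"arg": "value"}})
print(type(step).__name__, step.arg)
```

Because the settings are unpacked by name, a key that does not match a parameter of __init__() raises a TypeError in this sketch.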

For filtering and image processing operations, you can base your custom step on the base classes Prune and ProcessImages. Examples of this, and other pipeline steps, can be found in htrflow/pipeline/steps.py.

Models

The following model architectures are currently supported by HTRFlow:

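| Model  | Type                 | Fine-tuned by the AI lab                              |
|--------|----------------------|-------------------------------------------------------|
| TrOCR  | Text recognition     | Riksarkivet/trocr-base-handwritten-swe                |
| Satrn  | Text recognition     | Riksarkivet/satrn_htr                                 |
| RTMDet | Segmentation         | Riksarkivet/rtmdet_lines, Riksarkivet/rtmdet_regions  |
| Yolo   | Segmentation         |                                                       |
| DiT    | Image classification |                                                       |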
