Thulium - State-of-the-Art Multilingual Handwriting Text Recognition for Python

Thulium HTR

State-of-the-Art Multilingual Handwriting Text Recognition

Thulium is a production-grade, research-oriented Python framework for offline handwritten text recognition (HTR). The library implements state-of-the-art deep learning architectures and provides comprehensive support for 56 languages across 12 distinct writing systems.

Version 1.0.0: Production-ready release with complete language parity, SoTA architectures, and a comprehensive evaluation suite.

Overview
Installation
Quickstart
Architecture
Supported Languages
Evaluation Metrics
Benchmarks
API Reference
Contributing
License

Overview

Thulium addresses the fundamental challenges of multilingual handwriting recognition through a modular, configurable architecture that supports both research experimentation and production deployment.

Core Capabilities

Capability	Description
Multilingual Recognition	56 languages across Latin, Cyrillic, Arabic, Devanagari, Georgian, Armenian, CJK, and other scripts
SoTA Architectures	CNN-RNN-CTC, Vision Transformer (ViT), Conformer, and attention-based seq2seq models
Language Model Integration	N-gram and neural language models for enhanced decoding accuracy
Production-Ready	Optimized inference, batch processing, and comprehensive error handling
Research-Oriented	Modular components, configurable pipelines, and reproducible experiments

Design Principles

Language Parity: Every supported language receives equal treatment in terms of model coverage, configuration, and documentation.
Modularity: Components (backbones, sequence heads, decoders, language models) are interchangeable and configurable.
Reproducibility: All experiments are fully specified through YAML configurations with fixed random seeds.
Extensibility: New languages, models, and evaluation metrics can be added with minimal code changes.
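
The reproducibility principle above implies that a run's seed is read from its configuration and applied to every random number generator before training or evaluation starts. The following is a minimal sketch under stated assumptions: the `seed` key and the `seed_everything` helper are hypothetical and are not taken from the Thulium API or its configuration schema.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed the RNGs that a PyTorch-based experiment typically touches."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch
    try:
        import torch
        torch.manual_seed(seed)  # seeds PyTorch's default generators
    except ImportError:
        pass  # PyTorch is optional for this sketch


# Hypothetical configuration, as it might look after loading a YAML file
config = {"seed": 42}

seed_everything(config["seed"])
first = [random.random() for _ in range(3)]

seed_everything(config["seed"])
second = [random.random() for _ in range(3)]

# Re-seeding with the same value reproduces the same random sequence
print(first == second)
```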

Installation

From PyPI

pip install thulium-htr

From Source

git clone https://github.com/olaflaitinen/Thulium.git
cd Thulium
pip install -e ".[dev]"

Requirements

Requirement	Version
Python	3.10+
PyTorch	2.0+
CUDA (optional)	11.8+

Quickstart

Python API

from thulium.api import recognize_image

# Recognize handwritten text
result = recognize_image(
    path="document.jpg",
    language="en",
    device="auto"
)

print(result.full_text)

Command-Line Interface

# Basic recognition
thulium recognize document.jpg --language en --output result.json

# Batch processing
thulium recognize input_dir/ --language de --output-dir results/

# Run benchmarks
thulium benchmark run config/eval/iam_en.yaml

Architecture

Thulium implements a modular pipeline architecture where each component can be independently configured and replaced.

System Architecture

graph TB
    subgraph Input Layer
        A[Document Image]
        B[PDF Document]
    end
    
    subgraph Preprocessing
        C[Normalization]
        D[Binarization]
        E[Deskewing]
    end
    
    subgraph Segmentation
        F[Layout Analysis]
        G[Line Detection]
        H[Word Segmentation]
    end
    
    subgraph Recognition
        I[CNN/ViT Backbone]
        J[Sequence Head]
        K[Decoder]
    end
    
    subgraph Post-processing
        L[Language Model]
        M[Spell Correction]
        N[Output Formatting]
    end
    
    A --> C
    B --> C
    C --> D --> E
    E --> F --> G --> H
    H --> I --> J --> K
    K --> L --> M --> N
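
The idea of independently replaceable stages can be illustrated with plain callables. This is only a sketch of the composition pattern, not Thulium's pipeline code: the `Stage` type, `normalize`, `make_binarizer`, and `run_pipeline` are hypothetical names, and the image is a small list of pixel rows rather than a real document.

```python
from functools import reduce
from typing import Callable, List

Image = List[List[float]]          # grayscale image as rows of pixel values
Stage = Callable[[Image], Image]   # every stage maps an image to an image


def normalize(img: Image) -> Image:
    """Scale 0-255 pixel values into the range 0.0-1.0."""
    return [[p / 255.0 for p in row] for row in img]


def make_binarizer(threshold: float) -> Stage:
    """Build a binarization stage with a configurable threshold."""
    def binarize(img: Image) -> Image:
        return [[1.0 if p >= threshold else 0.0 for p in row] for row in img]
    return binarize


def run_pipeline(stages: List[Stage], img: Image) -> Image:
    """Apply the stages in order, feeding each output to the next stage."""
    return reduce(lambda acc, stage: stage(acc), stages, img)


page = [[0, 128, 255], [64, 200, 30]]

# Swapping one stage for another leaves the rest of the pipeline untouched
default = run_pipeline([normalize, make_binarizer(0.5)], page)
strict = run_pipeline([normalize, make_binarizer(0.9)], page)
print(default)
print(strict)
```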

Model Architecture

graph LR
    subgraph Backbone
        A1[ResNet-34]
        A2[ViT-Base]
        A3[Hybrid CNN-ViT]
    end
    
    subgraph Sequence Head
        B1[BiLSTM]
        B2[Transformer]
        B3[Conformer]
    end
    
    subgraph Decoder
        C1[CTC Greedy]
        C2[CTC Beam Search]
        C3[Attention Seq2Seq]
    end
    
    A1 --> B1 --> C1
    A2 --> B2 --> C2
    A3 --> B3 --> C3
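
Of the decoders in the diagram, CTC greedy decoding is the simplest to illustrate: take the best class at each timestep, merge consecutive repeats, then drop the blank symbol. The sketch below is a generic pure-Python version of that technique, not Thulium's decoder; the function name, the label list, and the choice of index 0 for the blank are assumptions made for the example.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(
    scores: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: int = 0,
) -> str:
    """Best-path CTC decoding: argmax per step, merge repeats, drop blanks."""
    # 1. Pick the highest-scoring class at every timestep.
    best_path: List[int] = [
        max(range(len(step)), key=step.__getitem__) for step in scores
    ]
    # 2. Merge consecutive repeats, then 3. remove the blank class.
    collapsed = [idx for idx, _ in groupby(best_path) if idx != blank]
    return "".join(labels[idx] for idx in collapsed)


# Hypothetical label set; index 0 is reserved for the CTC blank
labels = ["<blank>", "a", "c", "t"]

# One-hot scores for the frame sequence: c c _ a a _ t
path = [2, 2, 0, 1, 1, 0, 3]
scores = [[1.0 if i == k else 0.0 for i in range(len(labels))] for k in path]

print(ctc_greedy_decode(scores, labels))  # cat
```

A blank between two identical labels keeps them separate, which is how CTC represents doubled letters.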

Module Structure

Module	Purpose
`thulium.api`	High-level recognition API
`thulium.models.backbones`	Feature extraction (CNN, ViT)
`thulium.models.sequence`	Sequence modeling (LSTM, Transformer)
`thulium.models.decoders`	Output decoding (CTC, Attention)
`thulium.models.language_models`	Language model integration
`thulium.pipeline`	End-to-end processing pipelines
`thulium.evaluation`	Metrics and benchmarking
`thulium.data`	Data loading and language profiles

Supported Languages

Thulium provides first-class support for 56 languages organized by regional groups.

Language Coverage by Script

Script	Languages	Direction
Latin	35+	LTR
Cyrillic	4	LTR
Greek	1	LTR
Arabic	3	RTL
Hebrew	1	RTL
Georgian	1	LTR
Armenian	1	LTR
Devanagari	2	LTR
CJK	3	LTR
Other Indic	7	LTR
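
The Direction column corresponds to the Unicode bidirectional class of a script's letters. As a rough illustration, the direction of a transcription can be inferred from its first strongly directional character using the standard library. This is a sketch only; `text_direction` is a hypothetical helper, not a Thulium function, and it does not implement the full Unicode Bidirectional Algorithm.

```python
import unicodedata


def text_direction(text: str) -> str:
    """Return 'RTL' or 'LTR' based on the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # Hebrew-type and Arabic-type letters
            return "RTL"
        if bidi == "L":           # left-to-right letters, e.g. Latin or Cyrillic
            return "LTR"
    return "LTR"  # no strong character found; fall back to LTR


print(text_direction("handwriting"))               # Latin
print(text_direction("\u0633\u0644\u0627\u0645"))  # Arabic letters
print(text_direction("\u05e9\u05dc\u05d5\u05dd"))  # Hebrew letters
```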

Regional Groups

Scandinavian Languages (7)

Code	Language	Special Characters
`nb`	Norwegian Bokmål	æ, ø, å
`nn`	Norwegian Nynorsk	æ, ø, å
`sv`	Swedish	ä, ö, å
`da`	Danish	æ, ø, å
`is`	Icelandic	ð, þ, acute accents
`fo`	Faroese	ð, acute accents
`fi`	Finnish	ä, ö

Baltic Languages (3)

Code	Language	Special Characters
`lt`	Lithuanian	ogonek, caron, macron
`lv`	Latvian	macron, cedilla, caron
`et`	Estonian	ä, õ, ö

Caucasus Region (4)

Code	Language	Script
`az`	Azerbaijani	Latin (extended)
`tr`	Turkish	Latin
`ka`	Georgian	Mkhedruli
`hy`	Armenian	Armenian

Western European (7)

Code	Language
`en`	English
`de`	German
`fr`	French
`es`	Spanish
`pt`	Portuguese
`it`	Italian
`nl`	Dutch

Eastern European (12)

Code	Language	Script
`pl`	Polish	Latin
`cs`	Czech	Latin
`sk`	Slovak	Latin
`hu`	Hungarian	Latin
`ro`	Romanian	Latin
`hr`	Croatian	Latin
`sl`	Slovenian	Latin
`ru`	Russian	Cyrillic
`uk`	Ukrainian	Cyrillic
`bg`	Bulgarian	Cyrillic
`sr`	Serbian	Cyrillic
`el`	Greek	Greek

Middle East (4)

Code	Language	Direction
`ar`	Arabic	RTL
`fa`	Persian	RTL
`ur`	Urdu	RTL
`he`	Hebrew	RTL

South Asia (9)

Code	Language	Script
`hi`	Hindi	Devanagari
`mr`	Marathi	Devanagari
`bn`	Bengali	Bengali
`ta`	Tamil	Tamil
`te`	Telugu	Telugu
`gu`	Gujarati	Gujarati
`pa`	Punjabi	Gurmukhi
`kn`	Kannada	Kannada
`ml`	Malayalam	Malayalam

East Asia (3)

Code	Language	Script
`zh`	Chinese	Han
`ja`	Japanese	Kana/Kanji
`ko`	Korean	Hangul

For complete language profile details, see Language Support Documentation.

Evaluation Metrics

Thulium implements the standard HTR evaluation metrics, defined below.

Character Error Rate (CER)

The Character Error Rate is the character-level edit distance between the recognized text and the reference, normalized by the length of the reference:

CER = (S + D + I) / N

Where:

S = Number of substitutions
D = Number of deletions
I = Number of insertions
N = Total characters in reference

Word Error Rate (WER)

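The Word Error Rate applies the same formula at the word level, with S_w, D_w, and I_w counting word substitutions, deletions, and insertions, and N_w the number of words in the reference: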

WER = (S_w + D_w + I_w) / N_w
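
Both formulas reduce to a Levenshtein edit distance divided by the reference length. The sketch below shows that computation in pure Python. It is an illustration of the definitions, not Thulium's implementation; `edit_distance`, `cer_sketch`, and `wer_sketch` are hypothetical names, and splitting words on whitespace is an assumption.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum number of substitutions, deletions, insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of r
                curr[j - 1] + 1,         # insertion of h
                prev[j - 1] + (r != h),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]


def cer_sketch(reference: str, hypothesis: str) -> float:
    """(S + D + I) / N, counted over characters."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer_sketch(reference: str, hypothesis: str) -> float:
    """(S_w + D_w + I_w) / N_w, counted over whitespace-separated words."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


print(cer_sketch("text two", "text too"))  # 1 substitution / 8 characters = 0.125
print(wer_sketch("text two", "text too"))  # 1 substitution / 2 words = 0.5
```

Because insertions are counted but N only covers the reference, both rates can exceed 1 when the hypothesis is much longer than the reference.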

Fairness Metrics

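To monitor language parity, Thulium tracks the spread of CER across languages using the range and the standard deviation:

Delta_CER = max(CER_l) - min(CER_l)
Sigma_CER = sqrt(sum((CER_l - mean_CER)^2) / L)

Here CER_l is the CER for language l, mean_CER is the mean of CER_l over all languages, and L is the number of languages evaluated. Lower values of Delta_CER and Sigma_CER indicate more balanced performance across languages.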
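
As a worked example of the two fairness formulas, the sketch below applies them to the CER values listed in the Per-Language Performance table. `fairness_metrics` is a hypothetical helper written for this illustration, not a function from `thulium.evaluation`. It uses the population standard deviation because the formula divides by L.

```python
import statistics
from typing import Dict, Tuple


def fairness_metrics(cer_by_language: Dict[str, float]) -> Tuple[float, float]:
    """Return (Delta_CER, Sigma_CER) for a mapping of language to CER."""
    values = list(cer_by_language.values())
    delta = max(values) - min(values)
    sigma = statistics.pstdev(values)  # population std dev: divides by L
    return delta, sigma


# CER (%) per language, copied from the Per-Language Performance table
cer_percent = {
    "English": 1.8,
    "German": 2.1,
    "Norwegian": 2.1,
    "Azerbaijani": 2.2,
    "Russian": 2.5,
    "Georgian": 3.5,
    "Arabic": 4.2,
    "Chinese": 5.5,
}

delta, sigma = fairness_metrics(cer_percent)
print(f"Delta_CER = {delta:.1f}, Sigma_CER = {sigma:.2f}")
```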

Usage

from thulium.evaluation.metrics import cer, wer, cer_wer_batch

# Single pair
error_rate = cer("reference text", "recognized text")

# Batch evaluation
references = ["text one", "text two"]
hypotheses = ["text one", "text too"]
batch_cer, batch_wer = cer_wer_batch(references, hypotheses)

Benchmarks

Per-Language Performance

Language	Script	CER (%)	WER (%)	Model
English	Latin	1.8	5.2	Latin Multilingual
German	Latin	2.1	6.0	Latin Multilingual
Norwegian	Latin	2.1	5.9	Latin Multilingual
Azerbaijani	Latin	2.2	6.2	Latin Multilingual
Russian	Cyrillic	2.5	6.8	Cyrillic Multilingual
Georgian	Georgian	3.5	8.2	Georgian Specialized
Arabic	Arabic	4.2	10.5	Arabic Multilingual
Chinese	Han	5.5	-	CJK Multilingual

For complete benchmark results, see Benchmark Documentation.

API Reference

High-Level API

from thulium.api import recognize_image, recognize_batch

# Single image
result = recognize_image(path="document.jpg", language="en", device="auto")

# Batch processing (placeholder file names)
paths = ["page_01.jpg", "page_02.jpg"]
results = recognize_batch(paths, language="en", batch_size=16)

Pipeline API

from thulium.pipeline import HTRPipeline

pipeline = HTRPipeline.from_config("config/pipelines/htr_default.yaml")
result = pipeline.process(image, language="en")

Language Profiles

from thulium.data.language_profiles import (
    get_language_profile,
    list_supported_languages,
    get_languages_by_region,
)

# Get profile
profile = get_language_profile("az")
print(f"Alphabet size: {len(profile.alphabet)}")

# List by region
scandinavian = get_languages_by_region("Scandinavia")

For complete API documentation, see API Reference.

Contributing

Contributions are welcome. Please refer to CONTRIBUTING.md for guidelines.

All contributors must adhere to the Code of Conduct.

License

Apache License 2.0. See LICENSE for details.

Citation

If you use Thulium in your research, please cite:

@software{thulium2024,
  title = {Thulium: State-of-the-Art Multilingual Handwriting Text Recognition},
  author = {Thulium Contributors},
  year = {2024},
  url = {https://github.com/olaflaitinen/Thulium}
}

Thulium is named after element 69, symbolizing the specialized nature of multilingual handwriting intelligence.

Release history

Version	Released
1.2.1	Dec 12, 2025
1.0.2	Dec 11, 2025
1.0.1	Dec 11, 2025
1.0.0 (this version)	Dec 11, 2025
0.2.0	Dec 11, 2025

Download files

Source Distribution

thulium_htr-1.0.0.tar.gz (84.0 kB), uploaded Dec 11, 2025

Built Distribution

thulium_htr-1.0.0-py3-none-any.whl (86.7 kB), uploaded Dec 11, 2025 (Python 3)

File details

Details for the file thulium_htr-1.0.0.tar.gz.

File metadata

Download URL: thulium_htr-1.0.0.tar.gz
Upload date: Dec 11, 2025
Size: 84.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.1

File hashes

Hashes for thulium_htr-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`9e48daba769d50c7cdd405c9d49e27152ca922e0a2c2ee6d6e7a6f6689fd51de`
MD5	`263f40ea226761e02673530904fe7dcb`
BLAKE2b-256	`fc55acb4aa58c2ce5e123f1776a8519030ed020a92f59c24ffad303a8f4bd466`

File details

Details for the file thulium_htr-1.0.0-py3-none-any.whl.

File metadata

Download URL: thulium_htr-1.0.0-py3-none-any.whl
Upload date: Dec 11, 2025
Size: 86.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.1

File hashes

Hashes for thulium_htr-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`397b179998dc926e9b02ed7483e395ca2d13a1b0e851dd60cd210f8e3878b447`
MD5	`07c65ab63f9180f37a6d3f408d6cf25f`
BLAKE2b-256	`510737dcb08ebc08cde140835ddd5868b5a44548b63b650ff0fc4f8939150bba`
