Model-agnostic Python package for evaluating HTR and OCR output.
Cthulhu
Cthulhu is a lightweight Python library for evaluating the quality of HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition) transcriptions.
It compares a ground truth against a prediction and returns a set of standard metrics (WER, CER, Levenshtein distance, etc.), with optional text normalisation transforms applied before scoring.
Cthulhu was originally derived from kami-lib and redesigned as a standalone evaluation toolkit with no deep-learning dependencies.
Features
- Flexible input: accepts plain strings, .txt files, ALTO XML, and PAGE-XML
- Rich metrics: WER, CER, WACC, MER, CIP, CIL, ROUGE-1/2/L, Levenshtein, Hamming, weighted variants
- Text transforms: normalise both sequences before scoring (remove digits, punctuation, diacritics, change case)
- No heavy dependencies: no Kraken, no PyTorch; pure Python + three lightweight packages
Installation
pip install cthulhu-eval
Dependencies:
| Package | Role |
|---|---|
| python-Levenshtein >= 0.21 | Fast C-extension for edit distance |
| Unidecode >= 1.3 | Diacritics removal |
| termcolor >= 1.1 | Coloured log output |
Python: 3.9 or later
Quick start
from cthulhu.Cthulhu import Cthulhu
# Two plain strings
k = Cthulhu(["ground truth text", "predicted text"])
print(k.scores.board) # full metrics dict
print(k.scores.wer) # word error rate
print(k.scores.cer) # character error rate
Input formats
data must be a list of exactly two elements: the ground truth first, then the prediction. Each element can be:
| Type | Example |
|---|---|
| Plain string | "mon texte de référence" |
| Text file | "./gt.txt" |
| ALTO XML (v2–v4) | "./document_alto.xml" |
| PAGE-XML (PcGts) | "./document_page.xml" |
Mix and match freely:
# XML ground truth vs plain-text prediction
k = Cthulhu(["./gt_alto.xml", "./prediction.txt"])
# Two XML files
k = Cthulhu(["./gt_page.xml", "./pred_page.xml"])
# String vs text file
k = Cthulhu(["reference string", "./pred.txt"])
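To give an idea of what reading an XML input involves, here is a minimal sketch that pulls the transcription out of an ALTO or PAGE-XML document with the standard library only. It is not the parser shipped in cthulhu/parser/parser_xml.py: the helper names local_name and extract_text, and both sample documents, are hypothetical. It relies only on the usual layout of the two formats (ALTO stores words in the CONTENT attribute of String elements, PAGE-XML stores line text in TextEquiv/Unicode).

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_text(xml_source: str) -> str:
    """Return the transcription of an ALTO or PAGE-XML string, one text line per line."""
    root = ET.fromstring(xml_source)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # ALTO: words are <String CONTENT="..."/> elements inside <TextLine>.
        words = [child.get("CONTENT", "") for child in elem.iter()
                 if local_name(child.tag) == "String"]
        if words:
            lines.append(" ".join(words))
            continue
        # PAGE-XML: the line text sits in <TextEquiv><Unicode>...</Unicode></TextEquiv>.
        for child in elem:
            if local_name(child.tag) == "TextEquiv":
                unicode_elem = next(
                    (g for g in child if local_name(g.tag) == "Unicode"), None
                )
                if unicode_elem is not None and unicode_elem.text:
                    lines.append(unicode_elem.text)
                break
    return "\n".join(lines)


# Hypothetical sample documents, reduced to the elements the sketch looks at.
ALTO_SAMPLE = (
    '<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#"><Layout><Page>'
    '<PrintSpace><TextBlock>'
    '<TextLine><String CONTENT="Six"/><SP/><String CONTENT="semaines"/></TextLine>'
    '<TextLine><String CONTENT="plus"/><SP/><String CONTENT="tard"/></TextLine>'
    '</TextBlock></PrintSpace></Page></Layout></alto>'
)
PAGE_SAMPLE = (
    '<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">'
    '<Page><TextRegion><TextLine>'
    '<TextEquiv><Unicode>Six semaines</Unicode></TextEquiv>'
    '</TextLine></TextRegion></Page></PcGts>'
)

print(extract_text(ALTO_SAMPLE))
print(extract_text(PAGE_SAMPLE))
```

Matching on the local tag name rather than a fixed namespace is one simple way to cope with several ALTO versions in the same function.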
Metrics
All metrics are available via k.scores.board (dict) or as individual attributes on k.scores.
| Metric | Attribute | Description |
|---|---|---|
| Levenshtein distance (char) | lev_distance_char | Edit distance at character level |
| Levenshtein distance (word) | lev_distance_words | Edit distance at word level |
| Hamming distance | hamming | Char-level; "Ø" if lengths differ |
| WER | wer | Word Error Rate |
| CER | cer | Character Error Rate |
| WACC | wacc | Word Accuracy (1 − WER) |
| WER Hunt | wer_hunt | WER with halved insertion/deletion costs |
| MER | mer | Match Error Rate |
| CIP | cip | Character Information Preserved |
| CIL | cil | Character Information Lost |
| ROUGE-1 | rouge_1 | Unigram overlap (precision, recall, F1) |
| ROUGE-2 | rouge_2 | Bigram overlap (precision, recall, F1) |
| ROUGE-L | rouge_l | Longest Common Subsequence (precision, recall, F1) |
| Hits | hits | Matching characters |
| Substitutions | substs | Character substitutions |
| Deletions | deletions | Character deletions |
| Insertions | insertions | Character insertions |
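As a reminder of how the core rates relate to the edit distance, the sketch below computes Levenshtein distance, CER, WER and WACC in pure Python, using the standard definitions (distance divided by the reference length). It is an illustration only, not the library's code, which relies on python-Levenshtein; the function names are hypothetical, and the sketch assumes a non-empty reference.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings, or lists of words)."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def cer(reference, prediction):
    """Character Error Rate: character-level distance over reference length."""
    return levenshtein(reference, prediction) / len(reference)


def wer(reference, prediction):
    """Word Error Rate: word-level distance over the number of reference words."""
    ref_words = reference.split()
    return levenshtein(ref_words, prediction.split()) / len(ref_words)


reference = "Six semaines plus tard"
prediction = "Six semaiNEs plus tard"

print(levenshtein(reference, prediction))  # character-level distance
print(cer(reference, prediction))
print(wer(reference, prediction))
print(1 - wer(reference, prediction))      # WACC
```

Here two characters differ out of 22, and one word out of four, so the WER is 0.25 and the WACC 0.75.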
Custom error weights
k = Cthulhu(
["reference", "prediction"],
insertion_cost=0.5,
deletion_cost=0.5,
substitution_cost=1.0,
)
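The effect of these three costs can be pictured with a weighted edit distance, where each operation adds its own cost instead of a flat 1. The function below is a generic sketch of that technique; weighted_edit_distance is a hypothetical name, and whether the library combines the costs in exactly this way is an assumption.

```python
def weighted_edit_distance(ref, hyp, insertion_cost=1.0,
                           deletion_cost=1.0, substitution_cost=1.0):
    """Minimum total cost to turn ref into hyp with per-operation costs."""
    # Row 0: building hyp from an empty reference takes only insertions.
    previous = [j * insertion_cost for j in range(len(hyp) + 1)]
    for i, ref_item in enumerate(ref, start=1):
        # Column 0: emptying the reference takes only deletions.
        current = [i * deletion_cost]
        for j, hyp_item in enumerate(hyp, start=1):
            swap = 0.0 if ref_item == hyp_item else substitution_cost
            current.append(min(
                previous[j] + deletion_cost,
                current[j - 1] + insertion_cost,
                previous[j - 1] + swap,
            ))
        previous = current
    return previous[-1]


ref_words = "the cat sat".split()
hyp_words = "the cat sat down".split()

print(weighted_edit_distance(ref_words, hyp_words))                      # flat costs
print(weighted_edit_distance(ref_words, hyp_words, insertion_cost=0.5))  # cheaper insertions
```

With one extra word in the prediction, the distance drops from 1.0 to 0.5 once insertions cost half as much.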
Display options
k = Cthulhu(
["reference", "prediction"],
percent=True, # express rates as percentages (e.g. 17.2 instead of 0.172)
truncate=True, # truncate floats
round_digits=".001", # precision
)
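One plausible reading of these options is sketched below with the standard decimal module: round_digits is treated as a quantization step (".001" keeps three decimals), truncate switches from rounding to cutting off, and percent multiplies by 100 first. This is an assumption about the behaviour, not the library's code, and format_score is a hypothetical helper.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def format_score(value, percent=False, truncate=False, round_digits=".001"):
    """Scale and shorten a score; an assumed reading of the display options."""
    number = Decimal(str(value))
    if percent:
        number *= 100
    # ROUND_DOWN drops the extra digits; ROUND_HALF_UP rounds to nearest.
    rounding = ROUND_DOWN if truncate else ROUND_HALF_UP
    return float(number.quantize(Decimal(round_digits), rounding=rounding))


print(format_score(0.17289))                 # rounded to three decimals
print(format_score(0.17289, truncate=True))  # cut off after three decimals
print(format_score(0.17289, percent=True, truncate=True, round_digits=".1"))
```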
Text transforms
Apply normalisation steps before computing metrics. Each transform is scored individually and all together, letting you see the impact of each choice.
k = Cthulhu(
["Déjà 13 fois, Maxime !", "deja 13 fois maxime"],
apply_transforms="XP", # remove diacritics + punctuation
)
# k.scores.board contains:
# {
# "default": {...}, # raw scores
# "remove_diacritics": {...}, # after X only
# "remove_punctuation":{...}, # after P only
# "all_transforms": {...}, # after X + P combined
# "Length_reference": ...,
# "Total_diacritics_removed_from_reference": ...,
# ...
# }
Transform codes:
| Code | Name | Effect |
|---|---|---|
| D | Remove digits | "1871" → "" |
| U | Uppercase | "texte" → "TEXTE" |
| L | Lowercase | "TEXTE" → "texte" |
| P | Remove punctuation | "Bonjour !" → "Bonjour " |
| X | Remove diacritics | "étaient" → "etaient" |
Codes can be combined freely: "XP", "DLP", "XPLU", etc.
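The mapping from code letters to effects can be sketched with the standard library alone. This is not the package's implementation: the library removes diacritics with Unidecode, whereas the sketch swaps in unicodedata, and string.punctuation covers ASCII punctuation only. The names TRANSFORMS and apply_codes are hypothetical, and applying the codes in the order they are written is an assumption.

```python
import string
import unicodedata


def remove_digits(text):
    return "".join(ch for ch in text if not ch.isdigit())


def remove_punctuation(text):
    # Deletes ASCII punctuation only; surrounding spaces are left untouched.
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_diacritics(text):
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


TRANSFORMS = {
    "D": remove_digits,
    "U": str.upper,
    "L": str.lower,
    "P": remove_punctuation,
    "X": remove_diacritics,
}


def apply_codes(text, codes):
    """Apply each one-letter transform code in the order given."""
    for code in codes:
        text = TRANSFORMS[code](text)
    return text


print(repr(apply_codes("Déjà 13 fois, Maxime !", "XP")))
print(repr(apply_codes("Déjà 13 fois, Maxime !", "DLP")))
```

Printing with repr makes the spaces left behind by removed characters visible, as in the "Bonjour " example of the table.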
You can also use the transformation classes directly:
from cthulhu.preprocessing.transformation import (
ToCompose,
RemoveDiacritics,
RemovePunctuation,
ToLowerCase,
RemoveNonUsefulWords,
RemoveDigits,
RemoveSpecificWords,
SubRegex,
Strip,
)
# Apply a chain of transforms to any string or list of strings
result = ToCompose(
["Déjà 13 fois, Maxime !", ""],
[RemoveDiacritics(), RemovePunctuation(), ToLowerCase(), RemoveNonUsefulWords()]
)
print(result.reference) # "deja 13 fois maxime"
Using Scorer directly
For lower-level access, use the Scorer class without the Cthulhu facade:
from cthulhu.metrics.evaluation import Scorer
scorer = Scorer(
reference="Six semaines plus tard",
prediction="Six semaiNEs plus tard",
show_percent=True,
truncate_score=True,
round_digits=".001",
)
print(scorer.wer)
print(scorer.cer)
print(scorer.board)
Project structure
cthulhu/
├── cthulhu/
│ ├── Cthulhu.py # Main facade class
│ ├── metrics/
│ │ ├── _base_metrics.py # Encoding helpers, rounding utilities
│ │ └── evaluation.py # Scorer class
│ ├── parser/
│ │ ├── parser_xml.py # ALTO / PAGE-XML parser (stdlib only)
│ │ └── parser_text.py # Plain-text file reader
│ ├── preprocessing/
│ │ └── transformation.py # Transform classes
│ └── utils/
│ └── _utils.py # Logging, timing decorator
├── tests/
├── datatest/
├── requirements.txt
└── setup.py
Running tests
python -m pytest tests/ -v
19 tests, ~0.06 s.
Roadmap
- API-based HTR inference — send an image to an external model endpoint and compare the result against a ground-truth XML, without any local model
Authors & licence
Original work by Alix Chagué and Lucas Terriel (Inria). MIT Licence.