A tool that processes Old English texts and provides a toolkit for working with the text.
wyrdcraeft
Process Old English texts into structured JSON and generate morphology.
Why wyrdcraeft?
If you work with Old English (Anglo-Saxon) texts - editions, corpora, translation tooling, or digital humanities projects - you often need a single pipeline that turns raw or marked-up sources into a consistent, machine-readable form. wyrdcraeft provides that.
- Ingests plain text and TEI XML and converts them into a standard JSON schema that is aware of prose, verse, and dialogue.
- Restores diacritics in Old English texts that have no diacritic marks.
- Includes an Old English morphology generator based on established lexical and grammatical resources.
- Provides other minor utilities for working with Old English text.
Use it from the command line or from Python, and avoid ad-hoc scripts and format fragmentation.
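To make the diacritic restoration idea concrete, here is a minimal sketch of lexicon-based macron restoration. It is not wyrdcraeft's implementation or API: the `MACRON_LEXICON` table and the `restore_macrons` function are hypothetical names invented for illustration, and the tiny word list is an assumption. The sketch restores a macron only when a form has exactly one candidate and reports ambiguous forms instead of guessing.

```python
import re

# Hypothetical mini-lexicon (an assumption for illustration only):
# unmarked form -> possible spellings with macrons restored.
MACRON_LEXICON = {
    "stan": ["stān"],
    "ham": ["hām"],
    "god": ["god", "gōd"],  # ambiguous: "God" vs. "good"
    "cyning": ["cyning"],   # no long vowel, so nothing to restore
}

WORD_RE = re.compile(r"\w+")


def restore_macrons(text: str) -> tuple[str, list[str]]:
    """Return the text with unambiguous macrons restored, plus ambiguous words."""
    ambiguous: list[str] = []

    def replace(match: re.Match) -> str:
        word = match.group(0)
        candidates = MACRON_LEXICON.get(word.lower())
        if not candidates:
            return word  # unknown word: leave untouched
        if len(candidates) > 1:
            ambiguous.append(word)  # needs a separate disambiguation step
            return word
        restored = candidates[0]
        # Preserve an initial capital from the input.
        if word[0].isupper():
            restored = restored[0].upper() + restored[1:]
        return restored

    return WORD_RE.sub(replace, text), ambiguous


restored_text, needs_review = restore_macrons("Stan and ham and god")
print(restored_text)
print(needs_review)
```

Keeping ambiguous forms unchanged and returning them separately mirrors the split between restoration and disambiguation: the second step needs context that a plain lookup table does not have.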
Features
- Ingest Old English texts from text files and TEI XML.
- Convert to a standard JSON format via deterministic heuristics, TEI parsing, or LLM-based extraction.
- Handle both prose and verse (paragraphs, verse lines, dialogue, sections).
- Generate Old English morphological forms using a Python port of Ondřej Tichý's Perl-based generator (based on Bosworth & Toller, An Anglo-Saxon Dictionary, 1898, and Wright & Wright, Old English Grammar, 1908).
- Diacritic workflows: macron restoration and disambiguation tooling for normalized forms.
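As an illustration of the TEI-to-JSON step, the sketch below parses a small TEI fragment with Python's standard `xml.etree.ElementTree` and emits blocks that distinguish paragraphs from verse lines. The output layout (`kind`, `text`, `lines`, `n`) and the `tei_to_blocks` function are assumptions made for this example; they are not wyrdcraeft's actual schema or API, which is described in the project documentation.

```python
import json
import xml.etree.ElementTree as ET

# Standard TEI namespace, needed to match namespaced child elements.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="prose"><p>Her Cynewulf benam Sigebryht his rices.</p></div>
    <div type="verse">
      <lg>
        <l n="1">Hwæt! We Gardena in geardagum,</l>
        <l n="2">þeodcyninga, þrym gefrunon,</l>
      </lg>
    </div>
  </body></text>
</TEI>"""


def tei_to_blocks(tei_xml: str) -> list[dict]:
    """Convert TEI <p> and <lg>/<l> elements into a list of typed blocks."""
    root = ET.fromstring(tei_xml)
    blocks: list[dict] = []
    # root.iter() walks the tree in document order, so block order is preserved.
    for elem in root.iter():
        tag = elem.tag.split("}")[-1]  # strip the "{namespace}" prefix
        if tag == "p":
            text = "".join(elem.itertext()).strip()
            blocks.append({"kind": "paragraph", "text": text})
        elif tag == "lg":
            lines = [
                {"n": line.get("n"), "text": "".join(line.itertext()).strip()}
                for line in elem.findall("tei:l", TEI_NS)
            ]
            blocks.append({"kind": "verse", "lines": lines})
    return blocks


blocks = tei_to_blocks(SAMPLE_TEI)
print(json.dumps(blocks, ensure_ascii=False, indent=2))
```

Passing `ensure_ascii=False` keeps characters such as æ and þ readable in the JSON output rather than escaping them.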
Installation
Prerequisites: Python 3.11–3.13.
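If you are unsure whether your interpreter falls in that range, a quick check along these lines can help. This is a generic sketch using only the standard library; the `is_supported` helper is a hypothetical name and is not part of wyrdcraeft.

```python
import sys

# Supported range taken from the prerequisites above: Python 3.11 to 3.13.
SUPPORTED_MIN = (3, 11)
SUPPORTED_MAX = (3, 13)


def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version lies in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


status = "supported" if is_supported() else "outside the supported range"
print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```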
From PyPI with pip:
pip install wyrdcraeft
wyrdcraeft --help
With uv:
sh -c "$(curl -fsSL https://astral.sh/uv/install)"
uv tool install wyrdcraeft
wyrdcraeft --help
With pipx:
pipx install wyrdcraeft
wyrdcraeft --help
From source (development):
git clone https://github.com/cmalek/wyrdcraeft.git
cd wyrdcraeft
uv sync --dev
Documentation
Full documentation (installation, quickstart, CLI, Python client, configuration, FAQ): https://oe_json_extractor.readthedocs.io
Contributing, Licensing and Provenance
Contributing
Contribution guidelines and coding standards are described in the documentation (runbook).
Licensing and Provenance
Bosworth-Toller Old English Dictionary
The OCR-extracted text of the Bosworth-Toller Old English Dictionary used in this project is from the Germanic Lexicon Project. The scanning was done by Jason Burton, B. Dan Fairchild, Margaret Hoyt, Grace Mrowicki, Michael O'Keefe, Sarah Hartman, Finlay Logan, Sean Crist, Thomas McFadden, and David Harrison; that data is in the public domain.
Morphological Analyser of Old English
- The Old English morphology generator in wyrdcraeft is based on Ondřej Tichý's thesis, Morphological Analyser of Old English (2017).
- The upstream morphological generator Perl code and data are (c) Ondřej Tichý and are released under the CC BY 4.0 license. The modified Perl code itself, with Madeleine Thompson's changes, can be found at github:madeleineth/tichy_oe_generator.
- Changes made to the morphology generator in this repository by the maintainers of wyrdcraeft are released under the MIT license.
All other code
- All other code implemented directly by this project's maintainers is also released under the MIT license.