A toolkit for processing and working with Old English texts.

wyrdcraeft

Process Old English texts into structured JSON and generate morphology.

Why wyrdcraeft?

If you work with Old English (Anglo-Saxon) texts - editions, corpora, translation tooling, or digital humanities projects - you often need a single pipeline that turns raw or marked-up sources into a consistent, machine-readable form. wyrdcraeft provides that.

  • Ingests plain text and TEI XML and converts them into a standard JSON schema that is aware of prose, verse, and dialogue.
  • Restores diacritics in Old English texts that lack diacritic marks.
  • Includes an Old English morphology generator based on established lexical and grammatical resources.
  • Provides other minor utilities for working with Old English text.

Use it from the command line or from Python, and avoid ad-hoc scripts and format fragmentation.
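
To make the idea of "prose and verse aware" JSON concrete, here is a minimal sketch of a deterministic heuristic in plain Python. It does not use wyrdcraeft and does not reproduce its schema: the function names, the line-length threshold, and the "blocks" / "type" / "lines" keys are all assumptions made for illustration. See the project documentation for the real schema.

```python
import json
import re


def classify_block(block: str, max_verse_len: int = 60) -> str:
    """Guess whether a block is verse or prose (hypothetical heuristic).

    A block counts as verse when it has at least two lines and every
    line is short; anything else is treated as prose.
    """
    lines = [ln for ln in block.splitlines() if ln.strip()]
    if len(lines) >= 2 and all(len(ln) <= max_verse_len for ln in lines):
        return "verse"
    return "prose"


def to_document(text: str) -> dict:
    """Split text on blank lines and tag each block (illustrative schema)."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    return {
        "blocks": [
            {"type": classify_block(b), "lines": b.splitlines()}
            for b in blocks
        ]
    }


SAMPLE = (
    "Hwæt! We Gardena in geardagum,\n"
    "þeodcyninga, þrym gefrunon,\n"
    "hu ða æþelingas ellen fremedon.\n"
    "\n"
    "Her Cynewulf benam Sigebryht his rices.\n"
)

if __name__ == "__main__":
    print(json.dumps(to_document(SAMPLE), ensure_ascii=False, indent=2))
```

A heuristic this simple will misclassify short prose paragraphs and long verse lines, which is one reason a real pipeline would also draw on TEI markup where it is available.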

Features

  • Ingest Old English texts from text files and TEI XML.
  • Convert to a standard JSON format via deterministic heuristics, TEI parsing, or LLM-based extraction.
  • Handle both prose and verse (paragraphs, verse lines, dialogue, sections).
  • Generate Old English morphological forms using a Python port of Ondřej Tichý's Perl-based generator (based on Bosworth & Toller, An Anglo-Saxon Dictionary, 1898, and Wright & Wright, Old English Grammar, 1908).
  • Diacritic workflows: macron restoration and disambiguation tooling for normalized forms.
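
As a rough illustration of why macron restoration needs a disambiguation step, the sketch below builds a lookup from macron-less forms to their attested spellings. It is not wyrdcraeft's implementation or API: the function names and the four-word lexicon are hypothetical, and only the Python standard library is used.

```python
import unicodedata

MACRON = "\u0304"  # Unicode combining macron


def strip_macrons(word: str) -> str:
    """Return the word with all macrons removed (e.g. stān -> stan)."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, ""))


def build_index(lexicon):
    """Map each macron-less form to the set of lexicon forms it may stand for."""
    index = {}
    for form in lexicon:
        form = unicodedata.normalize("NFC", form)
        index.setdefault(strip_macrons(form), set()).add(form)
    return index


def restore(word: str, index) -> list:
    """Return candidate restorations, sorted; unknown words come back unchanged."""
    return sorted(index.get(word, {word}))


# Hypothetical toy lexicon: "god" (God) and "gōd" (good) collide once
# macrons are removed, so restoring "god" is ambiguous.
LEXICON = ["stān", "gōd", "god", "cyning"]
INDEX = build_index(LEXICON)

if __name__ == "__main__":
    for token in ["stan", "god", "cyning"]:
        print(token, "->", restore(token, INDEX))
```

A single candidate can be restored automatically. More than one candidate, as with "god", has to be resolved from context, and that is the job of the disambiguation tooling.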

Installation

Prerequisites: Python 3.11–3.13.
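
If you are unsure whether your interpreter falls inside that range, a small standard-library check like the following can tell you before you install. The function name is hypothetical, and the bounds are simply the 3.11 and 3.13 limits stated above.

```python
import sys

SUPPORTED_MIN = (3, 11)  # lowest supported (major, minor)
SUPPORTED_MAX = (3, 13)  # highest supported (major, minor)


def python_supported(version=None) -> bool:
    """Return True if the (major, minor) version is within the stated range.

    With no argument, the running interpreter's version is checked.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    print("supported" if python_supported() else "unsupported", sys.version.split()[0])
```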

From PyPI with pip:

pip install wyrdcraeft
wyrdcraeft --help
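
After a pip install, you can also confirm from Python that the distribution is visible in the current environment. This sketch relies only on the standard library's importlib.metadata, and the helper name is hypothetical. It will report "not installed" for pipx or uv tool installs, because those place the package in an isolated environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "wyrdcraeft"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"wyrdcraeft {found}" if found else "wyrdcraeft is not installed here")
```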

With uv:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool install wyrdcraeft
wyrdcraeft --help

With pipx:

pipx install wyrdcraeft
wyrdcraeft --help

From source (development):

git clone https://github.com/cmalek/wyrdcraeft.git
cd wyrdcraeft
uv sync --dev

Documentation

Full documentation (installation, quickstart, CLI, Python client, configuration, FAQ): https://oe_json_extractor.readthedocs.io

Contributing, Licensing and Provenance

Contributing

Contributing and coding standards are described in the documentation (runbook).

Licensing and Provenance

Bosworth-Toller Old English Dictionary

The OCR-extracted text of the Bosworth-Toller Old English Dictionary used in this project comes from the Germanic Lexicon Project. The scanning was done by Jason Burton, B. Dan Fairchild, Margaret Hoyt, Grace Mrowicki, Michael O'Keefe, Sarah Hartman, Finlay Logan, Sean Crist, Thomas McFadden, and David Harrison; that data is in the public domain.

Morphological Analyser of Old English

  • The Old English morphology generator in wyrdcraeft is based on Ondřej Tichý's thesis, Morphological Analyser of Old English (2017).
  • The upstream Perl code and data for the morphological generator are (c) Ondřej Tichý and are released under the CC BY 4.0 license. The modified Perl code, with Madeleine Thompson's changes, can be found at github:madeleineth/tichy_oe_generator.
  • Changes made to the morphology generator in this repository by the maintainers of wyrdcraeft are released under the MIT license.

All other code

  • All other code implemented directly by this project's maintainers is also released under the MIT license.
