Skip to main content

Utilty functions to work with TEI/XML-Documents

Project description

acdh-tei-pyutils

Github Workflow Tests Status PyPI version codecov

Utilty functions to work with TEI Documents

install

run pip install acdh-tei-pyutils

usage

some examples on how to use this package

parse an XML/TEI Document from and URL, string or file:

from acdh_tei_pyutils.tei import TeiReader

doc = TeiReader("https://raw.githubusercontent.com/acdh-oeaw/acdh-tei-pyutils/main/acdh_tei_pyutils/files/tei.xml")
print(doc.tree)
>>> <Element {http://www.tei-c.org/ns/1.0}TEI at 0x7ffb926f9c40>

doc = TeiReader("./acdh_tei_pyutils/files/tei.xml")
doc.tree
>>> <Element {http://www.tei-c.org/ns/1.0}TEI at 0x7ffb926f9c40>

write the current XML/TEI tree object to file

doc.tree_to_file("out.xml")
>>> 'out.xml'

see acdh_tei_pyutils/cli.py for further examples

command line scripts

Batch process a collection of XML/Documents by adding xml:id, xml:base next and prev attributes to the documents root element run:

add-attributes -g "/path/to/your/xmls/*.xml" -b "https://value/of-your/base.com"
add-attributes -g "../../xml/grundbuecher/gb-data/data/editions/*.xml" -b "https://id.acdh.oeaw.ac.at/grundbuecher"

Write mentions as listEvents into index-files:

mentions-to-indices -t "erwähnt in " -i "/path/to/your/xmls/indices/*.xml" -f "/path/to/your/xmls/editions/*.xml"

Write mentions as listEvents of index-files and copy enriched index entries into files

denormalize-indices -f "../../xml/schnitzler/schnitzler-tagebuch-data-public/editions/*.xml" -i "../../xml/schnitzler/schnitzler-tagebuch-data-public/indices/*.xml"
denormalize-indices -f "./data/*/*.xml" -i "./data/indices/*.xml" -m ".//*[@key]/@key" -x ".//tei:title[@level='a']/text()"
denormalize-indices -f "./data/*/*.xml" -i "./data/indices/*.xml" -m ".//*[@key]/@key" -x ".//tei:title[@level='a']/text()" -b pmb2121 -b pmb10815 -b pmb50

develop

  • project uses uv
  • linting/formatting uv run ruff check . uv run ruff format .
  • before commiting run flake8 to check linting and uv run coverage run -m pytest -v to run the tests

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

acdh_tei_pyutils-2.1.1.tar.gz (12.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

acdh_tei_pyutils-2.1.1-py3-none-any.whl (19.0 kB view details)

Uploaded Python 3

File details

Details for the file acdh_tei_pyutils-2.1.1.tar.gz.

File metadata

  • Download URL: acdh_tei_pyutils-2.1.1.tar.gz
  • Upload date:
  • Size: 12.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for acdh_tei_pyutils-2.1.1.tar.gz
Algorithm Hash digest
SHA256 02abc4baeec879e33ed41af302b4a8dc34e36d5ee382257f4df19e6eef9a6f5c
MD5 a72d1604bdb4c4ff346d1bc1d57c0c11
BLAKE2b-256 747aa5ef0192f5268a342582dffa6305dd77e28ecebb6c10132c5064a7404437

See more details on using hashes here.

File details

Details for the file acdh_tei_pyutils-2.1.1-py3-none-any.whl.

File metadata

  • Download URL: acdh_tei_pyutils-2.1.1-py3-none-any.whl
  • Upload date:
  • Size: 19.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for acdh_tei_pyutils-2.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 afa3c43bb23165de0d5766e261d1bded0b64f9d1ae6d7da33501ef127ebd3997
MD5 b829d053d9864e4fb9c427a1569ea22f
BLAKE2b-256 1369d085477033c7100eddc03cd11f3bfcef761acc8f5ba7178767110920c652

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page