A Python package to validate XML files against custom schema and Schematron files.

acdh-xml-validator

A Python package for validating XML files against RelaxNG and Schematron schemas. This module provides a Validator class that can validate XML documents using both RelaxNG (.rng) and Schematron (.sch) schemas, particularly useful for TEI (Text Encoding Initiative) XML documents.

Usage (CLI)

RNG and Schematron

uv run validate-all --files "data/editions/*.xml" --rng "schemata/rng.rng" --schematron "schemata/schematron.sch"

RNG

uv run validate-rng --files "data/editions/*.xml" --rng "schemata/rng.rng"

Schematron

uv run validate-schematron --files "data/editions/*.xml" --schematron "schemata/schematron.sch"
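
When the commands above are driven from a Python script (for example in a CI job), the argument list can be assembled programmatically. The following is a minimal sketch, not part of the package: the helper name `build_validate_command` is hypothetical, and it only reuses the entry points and flags shown above. It assumes, as the quoted patterns in the commands suggest, that the glob is passed to the tool unexpanded.

```python
import shlex


def build_validate_command(files, rng=None, schematron=None):
    """Return the argv list for the matching validate-* entry point.

    Hypothetical helper: it only mirrors the three commands documented
    above and does not import or call acdh-xml-validator itself.
    """
    if rng and schematron:
        entry_point = "validate-all"
    elif rng:
        entry_point = "validate-rng"
    elif schematron:
        entry_point = "validate-schematron"
    else:
        raise ValueError("pass rng, schematron, or both")

    # The glob pattern stays a single argument; no shell expands it here.
    argv = ["uv", "run", entry_point, "--files", files]
    if rng:
        argv += ["--rng", rng]
    if schematron:
        argv += ["--schematron", schematron]
    return argv


cmd = build_validate_command("data/editions/*.xml", rng="schemata/rng.rng")
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd)` as-is; passing a list rather than a string avoids shell expansion of the `*.xml` pattern.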

Usage (Python)

import glob
from acdh_xml_validator import Validator


validator = Validator(
    path_to_rng="schemata/rng.rng",
    path_to_schematon="schemata/schematron.sch"
)

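# Collect the files to check; sorted() keeps the report order stable
files = sorted(glob.glob("data/editions/*.xml"))

for path in files:
    # Validate one file against the configured schemas
    valid = validator.validate(path)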

Example output (here from a run over the files in test/xmls):

test/xmls/L00003.xml is not valid according to test/schemata/rng.rng schema
  - test/xmls/L00003.xml:120:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting element idno, got rs
  - test/xmls/L00003.xml:119:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting element dateline, got signed
  - test/xmls/L00003.xml:119:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMWRONG: Did not expect element signed there
  - test/xmls/L00003.xml:87:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMWRONG: Did not expect element p there
  - test/xmls/L00003.xml:119:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element div has extra content: closer
  - test/xmls/L00003.xml:79:0:ERROR:RELAXNGV:RELAXNG_ERR_CONTENTVALID: Element text failed to validate content
test/xmls/L00107.xml is not valid according to test/schemata/tillich-schematron.sch
  - The @ref attribute for rs type @bible must start with a captial letter or with a number
  - The @ref attribute for rs type @bible must start with a captial letter or with a number
  - The @ref attribute for rs type @bible must start with a captial letter or with a number
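
The RelaxNG lines in the report above follow a colon-separated layout (file, line, column, level, domain, error type, message), while the Schematron lines carry only a message. For post-processing, such as grouping errors by type, a small parser can split the RelaxNG lines into fields. This is a sketch under the assumption that the layout stays as shown in the sample; the names `RngError` and `parse_rng_error` are hypothetical and not part of the package.

```python
import re
from typing import NamedTuple, Optional


class RngError(NamedTuple):
    path: str
    line: int
    column: int
    level: str
    domain: str
    type_name: str
    message: str


# Matches report lines such as
# "  - file.xml:120:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting ..."
_RNG_LINE = re.compile(
    r"^\s*-\s*(?P<path>.+?):(?P<line>\d+):(?P<column>\d+):"
    r"(?P<level>[A-Z]+):(?P<domain>[A-Z0-9_]+):(?P<type_name>[A-Z0-9_]+):"
    r"\s*(?P<message>.*)$"
)


def parse_rng_error(text: str) -> Optional[RngError]:
    """Split one RelaxNG report line into fields.

    Returns None for lines that do not follow the layout, e.g. the
    per-file header lines or plain Schematron messages.
    """
    match = _RNG_LINE.match(text)
    if match is None:
        return None
    return RngError(
        path=match["path"],
        line=int(match["line"]),
        column=int(match["column"]),
        level=match["level"],
        domain=match["domain"],
        type_name=match["type_name"],
        message=match["message"],
    )


sample = (
    "  - test/xmls/L00003.xml:120:0:ERROR:RELAXNGV:"
    "RELAXNG_ERR_ELEMNAME: Expecting element idno, got rs"
)
error = parse_rng_error(sample)
print(error.type_name, error.line, error.message)
```

Returning `None` for non-matching lines lets the same function be applied to every line of a mixed report without special-casing the Schematron section.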

Develop

Install the package in editable mode and check that it imports:

uv pip install -e .
uv run python
>>> from acdh_xml_validator import hello
>>> hello()
'Hello you from acdh-xml-validator!'
