yaxmldiff is Yet Another XML Differ

yaxmldiff – Yet Another XML Diff Library

This library checks if two XML documents seem semantically equivalent. If not, it produces something similar to a unified diff.

Example:

>>> from yaxmldiff import compare_xml
>>> print(compare_xml("<same/>", "  <same /> <!--ignored-->"))
None
>>> print(compare_xml("<doc><a id='a'/></doc>", "<doc><a name='a'/></doc>"))
  <doc>
    <a
-     id="a"
+     name="a"
    />
  </doc>

compare_xml()

Compare two XML documents.

If the documents are given as strings, they are parsed first. Alternatively, either document can be passed as an already-parsed lxml.etree element.

Returns: None if both are equal, a diff otherwise.

Signature:

def compare_xml(
    left: str | Element,
    right: str | Element,
) -> str | None:
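Because compare_xml() returns None for equivalent documents and a diff string otherwise, it fits naturally into a test assertion. The following is a minimal sketch, not part of yaxmldiff: assert_xml_equal and its compare parameter are hypothetical names introduced here for illustration. The compare parameter only exists so the helper can be tried with a stand-in function; by default it uses compare_xml.

```python
from typing import Callable, Optional


def assert_xml_equal(
    left: str,
    right: str,
    compare: Optional[Callable[[str, str], Optional[str]]] = None,
) -> None:
    """Raise AssertionError carrying the diff if the documents differ.

    Hypothetical helper for illustration; yaxmldiff itself only
    provides compare_xml().
    """
    if compare is None:
        # Imported lazily so the helper can also be exercised
        # with a stand-in comparison function.
        from yaxmldiff import compare_xml as compare

    diff = compare(left, right)
    if diff is not None:
        # Put the diff into the message so the test report shows
        # exactly which parts of the document changed.
        raise AssertionError("XML documents differ:\n" + diff)
```

In a test, a call such as assert_xml_equal(expected_xml, actual_xml) would then fail with the diff as its message whenever the two documents are not equivalent.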

Examples

Example: equal documents

>>> print(compare_xml("<a/>", "<a/>"))
None

Example: different tag

>>> print(compare_xml("<a/>", "<b x='2'/>"))
- <a/>
+ <b .../>

Example: changed text

>>> print(compare_xml("<root><a/>foo</root>", "<root><a/>bar</root>"))
  <root>
    <a/>
-   foo
+   bar
  </root>

Example: nested changed text, collapses other nodes

>>> print(compare_xml(
...     "<root><uninteresting a='b'>foo</uninteresting><scope>a</scope></root>",
...     "<root><uninteresting a='b'>foo</uninteresting><scope>b</scope></root>",
... ))
  <root>
    <uninteresting ...>...</uninteresting>
    <scope>
-     a
+     b
    </scope>
  </root>

Example: inserted node

>>> print(compare_xml("<r><a/></r>", "<r><a/><b/></r>"))
  <r>
    <a/>
+   <b/>
  </r>

Example: changed attributes

>>> print(compare_xml(
...     "<a onlya='1' both='2' changed='3'/>",
...     "<a onlyb='1' both='2' changed='4'/>",
... ))
  <a both="2"
-   onlya="1"
-   changed="3"
+   changed="4"
+   onlyb="1"
  />

Example: can handle encoding declarations

>>> print(compare_xml(
...     "<?xml version='1.0' encoding='UTF-8'?><a/>",
...     "<a/>",
... ))
None

Example: comparison ignores surrounding space and newlines

>>> print(compare_xml("<a>b<c/></a>", "\n <a> \n b \n <c \n/> \n </a> \n "))
None

Example: pre-parse documents

>>> import lxml.etree
>>> print(compare_xml(lxml.etree.XML('<a parsed="yes"/>'), "<a parsed='no'/>"))
  <a
-   parsed="yes"
+   parsed="no"
  />
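Since the result is a plain string that resembles a unified diff, it can be post-processed with ordinary string tools. The sketch below counts removed and added lines. It assumes that the "-" and "+" markers always appear in the first column, as they do in the examples above; count_changes is a hypothetical helper, not part of yaxmldiff.

```python
from typing import Optional, Tuple


def count_changes(diff: Optional[str]) -> Tuple[int, int]:
    """Return (removed, added) line counts for a compare_xml() result.

    Assumes change markers sit in the first column of each line,
    which matches the sample output shown in this README.
    """
    if diff is None:
        # None means the documents were considered equivalent.
        return (0, 0)
    lines = diff.splitlines()
    removed = sum(1 for line in lines if line.startswith("-"))
    added = sum(1 for line in lines if line.startswith("+"))
    return (removed, added)


# The diff text from the "pre-parse documents" example, as a literal.
sample = '  <a\n-   parsed="yes"\n+   parsed="no"\n  />'
print(count_changes(sample))  # prints (1, 1)
print(count_changes(None))    # prints (0, 0)
```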

Related software

There are tons of XML diffing tools for Python.

Most closely related is lxml.doctestcompare. The lxml variant has lots of useful tools for doctests, such as ignoring subtrees with an <any> tag or content with an ... ellipsis. In contrast, yaxmldiff will compare two documents without further transformations. Another big difference is in the output. Whereas lxml will add inline annotations, yaxmldiff tries to emulate a unified diff, and will collapse uninteresting parts of the document.

Contributing

Use uv for virtualenv management. After installing uv, run uv sync --all-extras --dev to install dependencies.

Common development tasks are managed via the just task runner. Install it via your package manager. If in doubt, use pipx install rust-just. Once installed, run just or just qa for a complete QA pipeline with linters, type checking, and tests. Run just -l to get a list of all recipes.

License

Copyright 2021-2024 Lukas Atkinson

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Changelog

0.2.0 – 2024-09-29

  • minimum Python version is 3.8
  • (internal) packaging modernization

0.1.0 – 2021-06-13

  • initial release
