A lightweight library to compare XML documents with tolerance and ignore rules.

xmllens

Deep structural comparison for XML documents with per-path numeric tolerance and XPath-like targeting.

Overview

xmllens is a lightweight Python library for comparing two XML documents with fine-grained tolerance control.

It supports:

  • ✅ Global absolute (abs_tol) and relative (rel_tol) numeric tolerances
  • ✅ Per-path tolerance overrides via XPath-like expressions
  • ✅ Ignoring volatile or irrelevant XML elements
  • ✅ Detailed debug logs that explain why two XMLs differ

It’s ideal for comparing configuration files, XML-based API payloads, or serialized data models where small numeric drifts are expected.

Installation

pip install xmllens

Supported Path Patterns

xmllens implements a simplified subset of XPath syntax:

Pattern                 Description
/a/b/c                  Exact element path
/items/item[1]/price    Specific index
/items/*/price          Any element name
//price                 Recursive descent
/root/*                 Wildcard for any child element
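
To make the pattern semantics concrete, here is a minimal sketch that translates each pattern in the table into a regular expression and matches it against a concrete element path. The helper names `pattern_to_regex` and `path_matches` are hypothetical, and the way xmllens represents and matches paths internally is not documented here, so treat this as an illustration of the intended behaviour rather than the library's implementation.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate an XPath-like pattern into a regex (hypothetical helper)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("//", i):
            # Recursive descent: zero or more intermediate path segments.
            parts.append(r"/(?:[^/]+/)*")
            i += 2
        elif pattern[i] == "*":
            # Wildcard: exactly one path segment with any name.
            parts.append(r"[^/]+")
            i += 1
        else:
            # Everything else (names, '/', '[1]') is matched literally.
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def path_matches(pattern: str, path: str) -> bool:
    """Return True if the whole path matches the pattern."""
    return pattern_to_regex(pattern).fullmatch(path) is not None


print(path_matches("/a/b/c", "/a/b/c"))                              # True
print(path_matches("/items/*/price", "/items/item/price"))           # True
print(path_matches("/items/*/price", "/items/price"))                # False
print(path_matches("//price", "/items/item/price"))                  # True
print(path_matches("/root/*", "/root/x/y"))                          # False
print(path_matches("/items/item[1]/price", "/items/item[1]/price"))  # True
```

In this sketch `*` stands for exactly one path segment, while `//` may skip any number of segments, which is the distinction the table draws between "any element name" and "recursive descent".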

Full API

compare_xml(
    xml_a: str,
    xml_b: str,
    *,
    ignore_fields: list[str] = None,
    abs_tol: float = 0.0,
    rel_tol: float = 0.0,
    abs_tol_fields: dict[str, float] = None,
    rel_tol_fields: dict[str, float] = None,
    epsilon: float = 1e-12,
    show_debug: bool = False,
) -> bool

Parameter         Description
xml_a, xml_b      XML documents as strings
ignore_fields     XPath-like patterns to skip during comparison
abs_tol           Global absolute numeric tolerance
rel_tol           Global relative numeric tolerance
abs_tol_fields    Per-path absolute tolerances
rel_tol_fields    Per-path relative tolerances
epsilon           Small float to absorb FP rounding errors
show_debug        Enable detailed comparison logs
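
The need for `epsilon` is easiest to see with a value that sits exactly on the tolerance boundary. The sketch below is plain Python and does not call xmllens. The helper `within` is hypothetical, and adding `epsilon` to the threshold is an assumption about how such a guard is typically applied, not a statement about the library's internals.

```python
def within(diff: float, threshold: float, epsilon: float = 1e-12) -> bool:
    """Hypothetical check: accept diffs that exceed the threshold only by FP noise."""
    return diff <= threshold + epsilon


# 20.05 is not exactly representable in binary floating point, so the
# difference comes out slightly larger than 0.05.
diff = 20.05 - 20.0

print(diff > 0.05)                      # True
print(within(diff, 0.05))               # True
print(within(diff, 0.05, epsilon=0.0))  # False
```

Without the guard, a pair such as 20.0 and 20.05 would fail an `abs_tol=0.05` check purely because of rounding.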

Examples

from xmllens import compare_xml

xml1 = "<sensor><temp>21.5</temp><humidity>48.0</humidity></sensor>"
xml2 = "<sensor><temp>21.7</temp><humidity>48.5</humidity></sensor>"

# Global tolerances: abs_tol=0.05, rel_tol=0.01
res = compare_xml(xml1, xml2, abs_tol=0.05, rel_tol=0.01, show_debug=True)
print(res)  # False (humidity differs by more than the threshold)

Output (debug)

[NUMERIC COMPARE] /sensor/temp: 21.5 vs 21.7 | diff=0.200000 | abs_tol=0.05 | rel_tol=0.01 | threshold=0.217000
[MATCH NUMERIC] /sensor/temp: within tolerance
[NUMERIC COMPARE] /sensor/humidity: 48.0 vs 48.5 | diff=0.500000 | abs_tol=0.05 | rel_tol=0.01 | threshold=0.485000
[FAIL NUMERIC] /sensor/humidity  diff=0.500000 > threshold=0.485000
[FAIL IN ELEMENT] /sensor/humidity
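
The `threshold` values in the log above are consistent with taking the larger of the absolute tolerance and the relative tolerance scaled by the larger magnitude. The following sketch reproduces those numbers in plain Python. The formula is inferred from the debug output only, and the function names are hypothetical, so it should be read as an assumption about the rule rather than the library's code.

```python
def numeric_threshold(a: float, b: float, abs_tol: float = 0.0, rel_tol: float = 0.0) -> float:
    """Assumed rule: max(abs_tol, rel_tol * max(|a|, |b|))."""
    return max(abs_tol, rel_tol * max(abs(a), abs(b)))


def values_match(a: float, b: float, abs_tol: float = 0.0, rel_tol: float = 0.0) -> bool:
    """Two numbers match when their difference does not exceed the threshold."""
    return abs(a - b) <= numeric_threshold(a, b, abs_tol, rel_tol)


for name, a, b in [("temp", 21.5, 21.7), ("humidity", 48.0, 48.5)]:
    threshold = numeric_threshold(a, b, abs_tol=0.05, rel_tol=0.01)
    ok = values_match(a, b, abs_tol=0.05, rel_tol=0.01)
    print(f"{name}: diff={abs(a - b):.6f} threshold={threshold:.6f} match={ok}")
# temp: diff=0.200000 threshold=0.217000 match=True
# humidity: diff=0.500000 threshold=0.485000 match=False
```

Under this reading, the temperature passes because 1% of 21.7 is larger than the 0.2 difference, while the humidity difference of 0.5 exceeds 1% of 48.5.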

Simple Value Mismatch

xml1 = "<root><x>1</x></root>"
xml2 = "<root><x>2</x></root>"

result = compare_xml(xml1, xml2)
print(result)  # False

Tag Mismatch

xml1 = "<root><x>1</x></root>"
xml2 = "<root><y>1</y></root>"

result = compare_xml(xml1, xml2)
print(result)  # False

Global Tolerances

Absolute Tolerance

xml1 = "<sensor><temp>20.0</temp></sensor>"
xml2 = "<sensor><temp>20.05</temp></sensor>"

result = compare_xml(xml1, xml2, abs_tol=0.1)
print(result)  # True

Relative Tolerance

xml1 = "<sensor><humidity>100.0</humidity></sensor>"
xml2 = "<sensor><humidity>104.0</humidity></sensor>"

result = compare_xml(xml1, xml2, rel_tol=0.05)
print(result)  # True  (5% tolerance)

Per-Path Tolerances

Per-Path Absolute Tolerance

xml1 = "<root><a>1.0</a><b>2.0</b></root>"
xml2 = "<root><a>1.5</a><b>2.9</b></root>"

abs_tol_fields = {"/root/b": 1.0}

result = compare_xml(xml1, xml2, abs_tol=0.5, abs_tol_fields=abs_tol_fields)
print(result)  # True

Per-Path Relative Tolerance

xml1 = "<values><x>100</x><y>200</y></values>"
xml2 = "<values><x>110</x><y>210</y></values>"

rel_tol_fields = {"/values/x": 0.2}  # 20%

result = compare_xml(xml1, xml2, rel_tol=0.05, rel_tol_fields=rel_tol_fields)
print(result)  # True
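
To see why both elements pass in the example above, the sketch below works through the arithmetic. It assumes that a per-path entry replaces the global value for that path and that the threshold is `max(abs_tol, rel_tol * max(|a|, |b|))`. Both points are assumptions, and `effective_threshold` is a hypothetical helper that only handles exact paths, not wildcards.

```python
def effective_threshold(path, a, b, abs_tol=0.0, rel_tol=0.0,
                        abs_tol_fields=None, rel_tol_fields=None):
    """Hypothetical: per-path values override the global tolerances."""
    abs_t = (abs_tol_fields or {}).get(path, abs_tol)
    rel_t = (rel_tol_fields or {}).get(path, rel_tol)
    return max(abs_t, rel_t * max(abs(a), abs(b)))


overrides = {"/values/x": 0.2}  # 20% for x only

# x: 100 vs 110, diff 10, threshold 0.2 * 110 = 22
tx = effective_threshold("/values/x", 100, 110, rel_tol=0.05, rel_tol_fields=overrides)
# y: 200 vs 210, diff 10, threshold 0.05 * 210 = 10.5 (global rel_tol applies)
ty = effective_threshold("/values/y", 200, 210, rel_tol=0.05, rel_tol_fields=overrides)

print(abs(110 - 100) <= tx)  # True
print(abs(210 - 200) <= ty)  # True
```

Note that `y` passes on the global 5% tolerance alone, so the override only matters for `x`, whose 10-unit difference would otherwise exceed 5% of 110.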

Ignoring Fields

Simple Ignore Path

xml1 = "<root><id>1</id><timestamp>now</timestamp></root>"
xml2 = "<root><id>1</id><timestamp>later</timestamp></root>"

ignore_fields = ["/root/timestamp"]

result = compare_xml(xml1, xml2, ignore_fields=ignore_fields)
print(result)  # True
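
Conceptually, ignoring a path means that the element and everything beneath it are skipped during the walk. The sketch below shows that idea with the standard library's `xml.etree.ElementTree` rather than xmllens. `leaf_values` is a hypothetical helper that only handles exact paths, and repeated sibling tags would share one path in this simplified form.

```python
import xml.etree.ElementTree as ET


def leaf_values(xml: str, ignore: set) -> dict:
    """Map each leaf element path to its trimmed text, skipping ignored paths."""
    out = {}

    def walk(elem, path):
        if path in ignore:
            return  # skip this element and its whole subtree
        children = list(elem)
        if not children:
            out[path] = (elem.text or "").strip()
        for child in children:
            walk(child, f"{path}/{child.tag}")

    root = ET.fromstring(xml)
    walk(root, f"/{root.tag}")
    return out


xml1 = "<root><id>1</id><timestamp>now</timestamp></root>"
xml2 = "<root><id>1</id><timestamp>later</timestamp></root>"

print(leaf_values(xml1, set()) == leaf_values(xml2, set()))  # False
print(leaf_values(xml1, {"/root/timestamp"}) == leaf_values(xml2, {"/root/timestamp"}))  # True
```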

More Examples

Ignore multiple fields with different patterns:

  • Exact path: /user/profile/updated_at

  • Wildcard: /devices/*/debug

  • Recursive: //trace

xml1 = """
<data>
    <user>
        <id>7</id>
        <profile><updated_at>2025-10-14T10:00:00Z</updated_at><age>30</age></profile>
    </user>
    <devices>
        <device><id>d1</id><debug>alpha</debug><temp>20.0</temp></device>
        <device><id>d2</id><debug>beta</debug><temp>20.1</temp></device>
    </devices>
    <sessions>
        <session><events><event><meta><trace>abc</trace></meta><value>10.0</value></event></events></session>
        <session><events><event><meta><trace>def</trace></meta><value>10.5</value></event></events></session>
    </sessions>
</data>
"""

xml2 = """
<data>
    <user>
        <id>7</id>
        <profile><updated_at>2025-10-15T10:00:05Z</updated_at><age>30</age></profile>
    </user>
    <devices>
        <device><id>d1</id><debug>changed</debug><temp>20.05</temp></device>
        <device><id>d2</id><debug>changed</debug><temp>20.18</temp></device>
    </devices>
    <sessions>
        <session><events><event><meta><trace>xyz</trace></meta><value>10.01</value></event></events></session>
        <session><events><event><meta><trace>uvw</trace></meta><value>10.52</value></event></events></session>
    </sessions>
</data>
"""

ignore_fields = [
    "/data/user/profile/updated_at",
    "/data/devices/*/debug",
    "//trace",
]

result = compare_xml(
    xml1, xml2,
    ignore_fields=ignore_fields,
    abs_tol=0.05,
    rel_tol=0.02
)
print(result)  # True

Combining absolute and relative tolerances for different fields:

xml1 = """
<station>
    <id>ST-42</id>
    <location>Paris</location>
    <version>1.0</version>
    <metrics>
        <temperature>21.5</temperature>
        <humidity>48.0</humidity>
        <pressure>1013.2</pressure>
        <wind_speed>5.4</wind_speed>
    </metrics>
    <status><battery_level>96.0</battery_level></status>
</station>
"""

xml2 = """
<station>
    <id>ST-42</id>
    <location>Paris</location>
    <version>1.03</version>
    <metrics>
        <temperature>21.6</temperature>
        <humidity>49.3</humidity>
        <pressure>1013.5</pressure>
        <wind_speed>5.6</wind_speed>
    </metrics>
    <status><battery_level>94.8</battery_level></status>
</station>
"""

abs_tol_fields = {
    "/station/version": 0.1,
    "/station/metrics/humidity": 2.0,
    "/station/status/battery_level": 2.0,
}

rel_tol_fields = {
    "/station/metrics/wind_speed": 0.05,
}

result = compare_xml(
    xml1, xml2,
    abs_tol=0.05,
    rel_tol=0.01,
    abs_tol_fields=abs_tol_fields,
    rel_tol_fields=rel_tol_fields
)
print(result)  # True

Tips

  • Elements are compared in order.

  • Attributes are compared strictly.

  • Whitespace is trimmed before comparison.

  • To ignore volatile elements (timestamps, UUIDs, etc.), use ignore_fields.
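
The first three tips can be illustrated with a short structural comparison built on the standard library's `xml.etree.ElementTree`. This is a sketch of the described behaviour (ordered children, strict attributes, trimmed text) and not xmllens code. The names `same_structure` and `same_xml` are hypothetical, and tail text between elements is not compared here.

```python
import xml.etree.ElementTree as ET


def same_structure(a: ET.Element, b: ET.Element) -> bool:
    """Compare two elements: same tag, same attributes, same trimmed text,
    and the same children in the same order."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(same_structure(x, y) for x, y in zip(a, b))


def same_xml(xml_a: str, xml_b: str) -> bool:
    return same_structure(ET.fromstring(xml_a), ET.fromstring(xml_b))


# Whitespace around text is trimmed, so these are equal.
print(same_xml("<r><a>1</a><b>2</b></r>", "<r>\n  <a> 1 </a>\n  <b>2</b>\n</r>"))  # True
# Child order matters.
print(same_xml("<r><a>1</a><b>2</b></r>", "<r><b>2</b><a>1</a></r>"))  # False
# Attributes must match exactly.
print(same_xml('<r id="1"/>', '<r id="2"/>'))  # False
```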

License

Apache License 2.0. © 2025 Mohamed Tahri

Contributions welcome 🤝
