Compare xml files with svg output.

XmlXdiff

XmlXdiff was inspired by X-Diff.

This is not a bulletproof library (yet). It's more of a playground for getting in touch with comparing tree structures and presenting the result in a charming way.

dependencies

  • PySide2
  • svgwrite
  • lxml

installation

pip install XmlXdiff

first step

from XmlXdiff.XReport import DrawXmlDiff

_xml1 = """<root><deleted>with content</deleted><unchanged/><changed name="test1" /></root>"""
_xml2 = """<root><unchanged/><changed name="test2" /><added/></root>"""

with open("test1.xml", "w") as f:
    f.write(_xml1)

with open("test2.xml", "w") as f:
    f.write(_xml2)

x = DrawXmlDiff("test1.xml", "test2.xml")
x.saveSvg('xdiff.svg')

status quo

[Figure: XmlXdiff example]

implementation

Each xml element is identified by its xpath and a hash calculated from selected relevant information.
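
To make the identification idea concrete, here is a minimal sketch that pairs every element with a positional xpath and a hash. It is not XmlXdiff's actual code: it uses the standard library instead of lxml, the helper names `iter_with_xpath` and `element_hash` are hypothetical, and the choice of "relevant information" (tag, sorted attributes, stripped text) is an assumption.

```python
import hashlib
import xml.etree.ElementTree as ET


def iter_with_xpath(element, path=None):
    """Yield (xpath, element) pairs with positional indexes, e.g. /root/a[2]."""
    if path is None:
        path = "/" + element.tag
    yield path, element
    seen = {}  # counts siblings per tag to build the [n] index
    for child in element:
        seen[child.tag] = seen.get(child.tag, 0) + 1
        yield from iter_with_xpath(child, f"{path}/{child.tag}[{seen[child.tag]}]")


def element_hash(element):
    """Hash the tag, sorted attributes and stripped text of one element.

    Assumption: children are not part of the hash in this sketch.
    """
    parts = [element.tag]
    parts += [f"{key}={value}" for key, value in sorted(element.attrib.items())]
    parts.append((element.text or "").strip())
    return hashlib.sha1("\x00".join(parts).encode("utf-8")).hexdigest()


root = ET.fromstring('<root><unchanged/><changed name="test1"/></root>')
for xpath, element in iter_with_xpath(root):
    print(xpath, element_hash(element)[:8])
```

Two elements with the same xpath and the same hash can then be treated as identical, while a matching hash at a different xpath hints at a moved element.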

  1. mark all xml elements as changed
  2. mark unchanged xml elements
  3. mark moved xml elements
  4. mark xml elements identified by tag name and attribute names
  5. mark xml elements identified by attributes values and element text
  6. mark xml elements identified by tag name
  7. mark xml elements with xpath that do not exist in the other xml tree as added/deleted
  8. mark xml elements that have no child xml elements that are marked as changed as verified
  9. all xml elements that are still marked as changed have to be investigated

The selected order may change in the future. This is still under investigation.
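
A few of the passes listed above (steps 1, 2, 3 and 7) can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: `index_tree` and `classify` are hypothetical names, the hash covers only tag, attributes and text, and the fuzzy matching passes (steps 4 to 6) and the verification pass (step 8) are left out.

```python
import hashlib
import xml.etree.ElementTree as ET


def index_tree(root):
    """Map positional xpath -> hash of the element's tag, attributes and text."""
    index = {}

    def walk(element, path):
        payload = repr((element.tag, sorted(element.attrib.items()),
                        (element.text or "").strip()))
        index[path] = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        seen = {}
        for child in element:
            seen[child.tag] = seen.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    walk(root, "/" + root.tag)
    return index


def classify(xml_old, xml_new):
    old = index_tree(ET.fromstring(xml_old))
    new = index_tree(ET.fromstring(xml_new))

    # Step 1: start with everything marked as changed.
    marks_old = dict.fromkeys(old, "changed")
    marks_new = dict.fromkeys(new, "changed")

    # Step 2: same xpath and same hash on both sides -> unchanged.
    for path in old.keys() & new.keys():
        if old[path] == new[path]:
            marks_old[path] = marks_new[path] = "unchanged"

    # Step 3: same hash at a different xpath -> moved.
    open_new = {}
    for path, digest in new.items():
        if marks_new[path] == "changed":
            open_new.setdefault(digest, []).append(path)
    for path, digest in old.items():
        if marks_old[path] == "changed" and open_new.get(digest):
            partner = open_new[digest].pop(0)
            marks_old[path] = marks_new[partner] = "moved"

    # Step 7: xpath missing in the other tree -> deleted / added.
    for path in old:
        if marks_old[path] == "changed" and path not in new:
            marks_old[path] = "deleted"
    for path in new:
        if marks_new[path] == "changed" and path not in old:
            marks_new[path] = "added"

    # Step 9: whatever is still "changed" needs further investigation.
    return marks_old, marks_new


xml1 = '<root><deleted>with content</deleted><unchanged/><changed name="test1" /></root>'
xml2 = '<root><unchanged/><changed name="test2" /><added/></root>'
marks_old, marks_new = classify(xml1, xml2)
for path, mark in marks_old.items():
    print("old", path, mark)
for path, mark in marks_new.items():
    print("new", path, mark)
```

Starting from "changed" and narrowing down means that anything no pass can explain stays flagged, which matches the last step of the list.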

performance

test1: delta_t=0.0469s xml_elements=63
test2: delta_t=0.0156s xml_elements=5
test3: delta_t=0.0000s xml_elements=4
test4: delta_t=0.0313s xml_elements=32
test5: delta_t=0.0312s xml_elements=34
test6: delta_t=0.0312s xml_elements=34
test7: delta_t=0.0156s xml_elements=8
test8: delta_t=0.0780s xml_elements=67
test9: delta_t=5.1238s xml_elements=6144
test11: delta_t=0.0409s xml_elements=34
test12: delta_t=0.0312s xml_elements=45
test13: delta_t=0.0469s xml_elements=75

coverage

Name                               Stmts   Miss  Cover
------------------------------------------------------
lib\XmlXdiff\XDiffer.py              169     50    70%
lib\XmlXdiff\XHash.py                 88     19    78%
lib\XmlXdiff\XPath.py                 57     18    68%
lib\XmlXdiff\XReport\XRender.py       60     44    27%
lib\XmlXdiff\XReport\__init__.py     279     93    67%
lib\XmlXdiff\XTypes.py               156     98    37%
lib\XmlXdiff\__init__.py               3      2    33%
------------------------------------------------------
TOTAL                                812    324    60%

open issues

  • xdiff cost rating for matching couples
  • performance analysis and improvements (different hash algorithms, ...)
  • rework xml element identification (readability and performance issues)
  • improve the interface if there are users

release notes

v0.2.2:

  • search areas are split into segments between unchanged xml nodes
  • added/deleted/verified to be added
  • overlapping search areas possible now (merge proposals)

documentation

Tests
