Compare xml files with svg output.
XmlXdiff was inspired by X-Diff.
Since version 0.3.2 the distance-cost algorithm has been replaced by parent-identification. This might be the wrong decision, but for huge xml documents (see test 9) both performance and the quality of the results improved.
This is not a bulletproof library (yet). It is more of a playground for getting in touch with comparing tree structures and presenting the result in a charming way.
pip install XmlXdiff
    # file example
    from diffx import main

    _xml1 = './simple/xml1.xml'
    _xml2 = './simple/xml2.xml'

    main.compare_xml(_xml1, _xml2)
    main.save('./simple/diffx_file.svg')
Each xml element is identified by its xpath and a hash calculated from selected relevant information. The comparison starts with the identification of huge xml blocks (changed/moved). Parent elements are identified by tag, text-pre, text-post, attribute-names and attribute-values. Parent xml blocks can contain further parent xml blocks.
    <tag attribute-name="attribute-value" ...>
        text-pre
        <... children ...>
        text-post
    </tag>
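The parent identification described above can be illustrated with a small sketch. This is not the code diffx uses; the function name `parent_hash`, the choice of SHA-1, and the separator byte are assumptions made for illustration. It relies only on the standard library, where `Element.text` holds text-pre and the `tail` of the last child holds text-post.

```python
import hashlib
import xml.etree.ElementTree as ET


def parent_hash(element):
    """Hash an element from its own data only, ignoring its children.

    Hypothetical helper: uses tag, text-pre, text-post, attribute names
    and attribute values, as listed in the description above.
    """
    text_pre = (element.text or "").strip()
    # ElementTree stores the text after the last child as that child's tail.
    children = list(element)
    text_post = (children[-1].tail or "").strip() if children else ""
    parts = [element.tag, text_pre, text_post]
    for name in sorted(element.attrib):
        parts.append(name)
        parts.append(element.attrib[name])
    joined = "\x00".join(parts)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


# Same parent data, different child content: the hashes are equal.
a = ET.fromstring('<tag x="1">pre<child>one</child>post</tag>')
b = ET.fromstring('<tag x="1">pre<child>two</child>post</tag>')
# Different attribute value: the hash differs.
c = ET.fromstring('<tag x="2">pre<child>one</child>post</tag>')
```

Because the children do not contribute to the hash, a parent block can still be recognised when only its content changed, which matches the idea of identifying parents without children context.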
- mark all xml elements as changed
- iterate over the parent blocks, from those with the most children to those with the fewest
  - mark unchanged xml elements of the current parent
  - mark moved xml elements of the current parent
  - mark xml elements identified by tag name and attribute names of the current parent
  - mark xml elements identified by attribute values and element text of the current parent
  - mark xml elements identified by tag name of the current parent
  - mark xml elements whose xpath does not exist in the other xml tree as added/deleted of the current parent
- repeat the per-parent steps until all xml elements are identified
All xml elements that are still marked as changed have to be investigated.
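The marking order can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the diffx implementation: the helpers `index_by_xpath`, `signature` and `mark` are hypothetical, only the first tree is marked (so "added" would be the same pass run on the second tree), and the partial matches by tag name or attribute names are left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def index_by_xpath(root):
    """Map a positional xpath such as /a/b[2] to its element."""
    result = {}

    def walk(element, path):
        result[path] = element
        seen = Counter()
        for child in element:
            seen[child.tag] += 1
            walk(child, "{}/{}[{}]".format(path, child.tag, seen[child.tag]))

    walk(root, "/" + root.tag)
    return result


def signature(element):
    """Full-content signature: tag, attributes, text and all descendants."""
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        (element.text or "").strip(),
        tuple((signature(c), (c.tail or "").strip()) for c in element),
    )


def mark(left_root, right_root):
    left = index_by_xpath(left_root)
    right = index_by_xpath(right_root)
    # Everything starts as "changed".
    marks = {path: "changed" for path in left}
    right_signatures = {signature(e) for e in right.values()}
    # Visit the blocks with the most elements first.
    order = sorted(left, key=lambda p: sum(1 for _ in left[p].iter()),
                   reverse=True)
    for path in order:
        if marks[path] != "changed":
            continue  # already settled as part of a larger block
        sig = signature(left[path])
        if path in right and signature(right[path]) == sig:
            state = "unchanged"
        elif sig in right_signatures:
            state = "moved"
        elif path not in right:
            marks[path] = "deleted"  # children may still be found as moved
            continue
        else:
            continue  # stays "changed" and has to be investigated
        # An unchanged or moved block settles its whole subtree.
        for sub in left:
            if sub == path or sub.startswith(path + "/"):
                marks[sub] = state
    return marks


left_tree = ET.fromstring("<root><a><x>1</x></a><b/><c>same</c><d/></root>")
right_tree = ET.fromstring("<root><a/><b><x>1</x></b><c>same</c></root>")
marks = mark(left_tree, right_tree)
```

In this example `/root/c[1]` ends up unchanged, `/root/a[1]/x[1]` is recognised as moved, `/root/d[1]` is deleted, and the remaining elements stay marked as changed.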
test1: delta_t=0.0699s xml_elements=63
test2: delta_t=0.0104s xml_elements=5
test3: delta_t=0.0154s xml_elements=10
test4: delta_t=0.0240s xml_elements=32
test5: delta_t=0.0258s xml_elements=34
test6: delta_t=0.0290s xml_elements=34
test7: delta_t=0.0124s xml_elements=8
test8: delta_t=0.1027s xml_elements=67
test9: delta_t=4.2290s xml_elements=6144
test11: delta_t=0.0298s xml_elements=34
test12: delta_t=0.0288s xml_elements=45
test13: delta_t=0.0442s xml_elements=75
Name                                      Stmts   Miss Cover
------------------------------------------------------------
lib\diffx\__init__.py                        21      4   81%
lib\diffx\base.py                           107      2   98%
lib\diffx\differ.py                         170     19   89%
lib\diffx\hash.py                            71      0  100%
lib\diffx\svg\__init__.py                     0      0  100%
lib\diffx\svg\coloured_text.py               21      0  100%
lib\diffx\svg\coloured_without_text.py       12      5   58%
lib\diffx\svg\compact.py                    340     34   90%
lib\diffx\svg\render_text.py                 76      2   97%
lib\diffx\xpath.py                           54      3   94%
------------------------------------------------------------
TOTAL                                       872     69   92%
- performance analysis and improvements (different hash algorithms, ...)
- if there are some users, improve interface
- investigation of merge interfaces
- XmlXdiff renamed to diffx
- ui improved, diffx.main added as entry point
- code refactored - pythonic, pep8
- text block introduced
- performance improved
- source code clean up
- diff text without spaces
- static code quality tools introduced
- implemented parent-identification without children context
- split segments replaced by parent-identification (no dependency on the number or content of children)
- color scheme changed
- coverage improved
- search areas are split into segments between unchanged xml nodes
- added/deleted/verified to be added
- overlapping search areas possible now (merge proposals)