Parsed XML allows you to use XML objects in the Zope 2 environment.
Parsed XML allows you to use XML objects in the Zope environment. You can create XML documents in Zope and leverage Zope to format, query, and manipulate XML.
Parsed XML consists of a DOM storage, a builder that uses PyExpat to parse XML into the DOM, and a management proxy product that provides Zope management features for a DOM tree. It also includes a system to create paths to nodes in Zope URLs (NodePath).
The Parsed XML product parses XML into a Zopish DOM tree. The elements of this tree support persistence, acquisition, and so on. The document and subnodes are editable and manageable through management proxy objects, and the underlying DOM tree can be directly manipulated via DTML, Python, and so on.
We’re implementing a lean, mean DOM tree for pure DOM access, and a tree of proxy shells to handle management and take care of conveniences like publishing and security. The ManageableNodes are the proxy objects. These are what you see in the management interface, and the top one is the object that gets put in the ZODB. Note that only the top proxy object is persistent; the others are transient. The Nodes are pure DOM objects. From a ManageableNode, the DOM Node is retrieved with the getDOMObj() call.
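The split between proxy shells and pure DOM nodes can be sketched as follows. This is only an illustration of the pattern, not Parsed XML's implementation: the class name ManageableNodeSketch and its children() helper are hypothetical, and the standard library's xml.dom.minidom stands in for the Parsed XML DOM storage. Only the getDOMObj() name comes from the text above.

```python
from xml.dom import minidom


class ManageableNodeSketch:
    """Hypothetical transient proxy shell around a plain DOM node."""

    def __init__(self, dom_node):
        self._node = dom_node

    def getDOMObj(self):
        # Hand back the underlying pure DOM node.
        return self._node

    def children(self):
        # Wrap child nodes on demand; these wrappers are created fresh
        # on every call and are not stored anywhere.
        return [ManageableNodeSketch(child) for child in self._node.childNodes]


# minidom is a stand-in for the Parsed XML DOM storage (assumption).
document = minidom.parseString("<doc><a/><b/></doc>")
top_proxy = ManageableNodeSketch(document)

root = top_proxy.getDOMObj().documentElement
child_names = [
    proxy.getDOMObj().nodeName for proxy in ManageableNodeSketch(root).children()
]
```

In this sketch only top_proxy would need to be stored; every other wrapper can be rebuilt from the DOM tree whenever it is needed.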
The DOM tree created by Zope aims to comply with the DOM level 2 standard. This allows you to access your XML in DTML or External Methods using a standard and powerful API.
We are currently supporting the DOM level 2 Core and Traversal specifications.
The DOM tree is not built with the XML-SIG’s DOM package, because it requires significantly different node classes.
DOM attributes are made available according to the Python language mapping for the IDL interfaces described by the DOM recommendation; see the mapping
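As a rough illustration of that mapping, IDL attributes are read as plain Python attributes while IDL methods are called as methods. The sketch below uses the standard library's xml.dom.minidom as a stand-in, on the assumption that Parsed XML nodes expose the same standard names; the sample document is made up for the example.

```python
from xml.dom import minidom

# minidom is used here only as a stand-in DOM implementation (assumption).
document = minidom.parseString('<doc><item id="first">hello</item></doc>')
root = document.documentElement
item = root.firstChild

# IDL attributes are read as ordinary Python attributes.
tag = item.tagName
parent_tag = item.parentNode.tagName
owner_is_document = item.ownerDocument is document

# IDL methods are called as ordinary Python methods.
item_id = item.getAttribute("id")
text = item.firstChild.data
```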
Parsed XML implements a NodePath system to create references to XML nodes (most commonly elements).
Currently, traversal uses an element’s index within its parent as a URL key. For example:
This URL traverses from an XML Document object with id myDoc to its first sub-element, then to that element’s second sub-element, and finally to an acquired method with id myMethod.
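The index-based lookup described above can be sketched in plain Python. This is not the NodePath implementation: the helper names child_elements and resolve_index_path are hypothetical, the indices are assumed to be zero-based and to count element children only, and xml.dom.minidom stands in for the Parsed XML DOM.

```python
from xml.dom import minidom, Node


def child_elements(node):
    # Assumption: only element children are counted, not text or comments.
    return [c for c in node.childNodes if c.nodeType == Node.ELEMENT_NODE]


def resolve_index_path(document, indices):
    # Walk down from the document, picking one sub-element per index.
    node = document
    for index in indices:
        node = child_elements(node)[index]
    return node


document = minidom.parseString("<doc><a/><b><c/></b></doc>")

# First sub-element of the document, then that element's second sub-element.
target = resolve_index_path(document, [0, 1])
```

An index that is out of range raises IndexError in this sketch; how Parsed XML reports a bad path key is not covered here.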
DOM methods can also be used in URLs, for example:
XML Documents and subnodes are editable via the management interface. Documents and subtrees can be replaced by uploading XML files.
Security is handled at the document level. DOM attributes and methods are protected by the “Access contents information” permission. Subnodes will acquire security settings from the document.
We like to think that Parsed XML provides a flexible platform for using a DOM storage and extending that storage to do interesting things. See README.DOMProxy for an explanation of how we’re using this for Parsed XML.
We’ve included a comprehensive unit test suite to make testing for DOM compliance easier. See tests/README for details.
If you want to submit changes to Parsed XML, please use the test suite to make sure that your changes don’t break anything.
There are bugs in how multiple node references reflect the hierarchy above the node:
A reference to a subnode of a DOM document won’t reflect some hierarchy changes made on other references to the same node.
If two references to a node are created, and one is then reparented, the other reference won’t reflect the new parent. The parentNode attribute will be incorrect, for example, as well as the ownerDocument and ownerElement attributes.
A reference to a subnode of a DOM document can’t be properly stored as a persistent attribute of a ZODB object; it will lose hierarchy information about its parent as well.
Entity reference handling is not complete:
- Entity references do not have child nodes that mirror the child nodes of the referenced entity; they do not have child nodes at all.
- TreeWalker.expandEntityReferences has no effect, because of the above bug.
Traversal support for visibility and roots is not complete.
- Martijn Faassen <email@example.com>
- Patrick Decat <firstname.lastname@example.org>
- Tim Heap
The Zope Corporation Parsed XML team:
- Martijn Faassen <email@example.com>
Much test and implementation help was provided by
- Chris McDonough <firstname.lastname@example.org>
- Shane Hathaway <email@example.com>
- Guido van Rossum <firstname.lastname@example.org>
Parsed XML also contains code from versions of the original XMLDocument, written by Amos Latteier and Fourthought Inc.
ParsedXML contains software derived from works by Fourthought Inc; see LICENSE.Fourthought for their license.
- Update imports and syntax to work with Zope 2.12 and Python 2.6.
- Allow unicode characters in qualified names.
- Zope 2.8 transaction compatibility fixes.
Switched the test suite over to use ZopeTestCase.
Updated UI so that the XML edit screen is the first screen seen, not the DOM screen.
The UI is now UTF-8 by default, the same as the default encoding of the XML. Encodings other than UTF-8 declared in the XML declaration are supported by the upload, but the XML text will be converted to unicode internally and will display as UTF-8, the default encoding.
You can also use the encoding in the XML declaration in the edit entry, but you are unlikely to get it right, as the interactions between various encodings become rather complex. If you have XML text in a non-UTF-8 encoding, upload it as a file rather than copying and pasting it.
Switched over to new-style security declarations and added a few.
- Cleaned out tests so they all pass. This was done (unfortunately) by disabling some failing tests.
- Some encoding issues should be more sane now.
- Updated license to ZPL 2.0.
- Get rid of unused HTML writing support.
- Removed old Printer.py module (PrettyPrinter works better and has more functionality).
- Get rid of obsolete delegation through StrIO module to get StringIO.
- Bugfix/performance improvement. Do not rely on getPersistentDocument() but instead use acquisition parent. This fixes a memory leak triggered when doing document.documentElement, and also likely improves performance when accessing the DOM through the ManageableDOM wrappers.
- The Zope Find tab should no longer give an error when searching with ids.
- Use the One True Way to import expat now (‘from xml.parsers import expat’).
- All element nodes now have a _element_id id. This id is guaranteed to be unique in the document, though which id an element has may change in a reparse.
- NodePath system for creating various paths to nodes. This can be based on ‘child’ (child node index), ‘element_id’, or ‘robust’, which is not very robust as yet in some ways, but should be resistant to quite a few changes to the document.
- added pretty printing feature. A pretty print button renders the document in pretty printed form, but does not save this changed version (you can do so yourself).
- Removed ParsedXML’s Expat, introduced dependency on PyXML’s pyexpat instead (or just compile your own). This gets rid of lots of install hassles, especially on Windows. Just install PyXML.
- An ongoing attempt to bring sanity to the unit test story.
- Avoid XML-garbling bug in Printer by using PrettyPrinter.
- various bugfixes in the DOM.
- pyexpat.c is now in sync with Python’s and PyXML’s version.
- Tests are now more conformant with Zope unit testing guidelines.
- Should work with Python 2.1/Zope 2.4, but not all the way there yet (parser segfault..)
- new version of PyExpat
- Access to DOM from Zope environment sped up by new DOMProxy implementation
- Fixed an ExpatBuilder bug that caused a DOM reference to be leaked when parsing occurred at the document. There is still a leak for fragment parsing that we’re looking into.
- ZCatalog support added
- ZCache support added
- Version numbers make more sense :)
- The value returned by get_size() is cached, which will often speed up the management view of an instance’s container.
- Problems with Attr Node manipulation not being reflected by the getAttribute methods of their Elements and vice versa fixed.
- Erroneous position information for parse error output on subnodes fixed.
- Default attributes are noticed by the parser and printer, and the relevant DOM methods work.
- ManageableDOM Nodes can find the persistent Document wrapper when it has been installed in a Zope ObjectManager. This object, rather than a newly created ManageableDocument wrapper, is returned when available by OwnerDocument calls. This allows Zopish navigation and discovery out of the Document, helps shorten acquisition paths, and fixes some bugs with manipulation at the Document.
- ManageableDOM’s usage of namespaces for parsing is now optional and settable.
- The DOM 2 Traversal interface has been fleshed out, although support for visibility and roots is not complete.
- Yet a few more DOM bugs fixed.
- Fixed a distribution error that was causing build problems under Solaris.
- Many bugs found and fixed as we hammered out new DOM tests, especially in namespace usage, attribute printing, and attribute children.
- Several speedups throughout the code.
- ManageableDOM refactored into several base classes to make extension easier.