Convert Quark Xpress Tags to XML

Project description

Convert Quark Xpress tagged text (Tags, xtags) to XML.

This module is intended as a pre-processing step in the conversion of Quark Xpress tagged text to (semantic) HTML; as such it does not attempt to convert every single tag to XML, but only those that are relevant to the production of semantic, HTML5-compliant HTML.

This means that paragraph and character style sheet definitions are ignored; only the applied style sheet names are kept. It also means that the module imports character attributes like <i> and <b> but, by default, ignores tags that relate only to print typesetting (tracking, kerning, baseline shift, etc.), though these can be turned on.

The module doesn't actually produce XML but an element tree, in case you'd like to do further processing on the tree itself before serialising it with lxml.etree.tostring(). The serialised XML can then be turned into HTML with e.g. BeautifulSoup for post-processing (for example, mapping Quark paragraph style sheets to CSS classes, character tags and style sheets to semantic HTML tags, rolling up indented quotes into <blockquote>, etc.).
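As a rough illustration of processing the tree before serialising it, the sketch below adds CSS classes based on paragraph style names. It is not the module's own code: the element name "paragraph", the attribute name "style", the style names and the helper add_css_classes are all assumptions made for this example, and the standard library's xml.etree.ElementTree stands in for lxml so that it runs without extra dependencies (lxml elements offer the same iter(), get() and set() methods).

```python
# Sketch only: the element and attribute names used here ("paragraph",
# "style") are assumptions for illustration, not the module's documented output.
import xml.etree.ElementTree as ET

# Hypothetical mapping from Quark paragraph style sheet names to CSS classes.
STYLE_TO_CLASS = {
    "Body Text": "body",
    "Pull Quote": "pullquote",
}


def add_css_classes(root, mapping):
    """Set a class attribute on every element whose style name is in mapping."""
    for element in root.iter():
        style = element.get("style")
        if style in mapping:
            element.set("class", mapping[style])
    return root


# Stand-in for the element tree that the conversion step would return.
sample = ET.fromstring(
    "<article>"
    '<paragraph style="Body Text">Plain <i>italic</i> text.</paragraph>'
    '<paragraph style="Pull Quote">A quotation.</paragraph>'
    "</article>"
)

add_css_classes(sample, STYLE_TO_CLASS)

# Serialise to UTF-8 bytes once the tree has been adjusted.
serialised_xml = ET.tostring(sample, encoding="utf-8")
```

Working on the tree rather than on the serialised string keeps this kind of mapping independent of how the XML is eventually written out.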

Outputs UTF-8.

Usage:

>>> from quark_tagged_text import get_encoding, to_xml
>>> from lxml.etree import tostring
>>> source_file = 'article.xtg'  # placeholder: path to your tagged text file
>>> encoding = get_encoding(source_file)
>>> with open(source_file, encoding=encoding) as tagged_text:
...     element_tree = to_xml(tagged_text)
...
>>> serialised_xml = tostring(element_tree, encoding='utf-8')

You can also call to_xml with a css=True argument. This will attempt to convert some character styles into inline CSS (works with fonts, small caps, uppercase, strikethrough).

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

Quark_Xpress_Tags_to_xml-1.1-py3-none-any.whl (11.0 kB)

Uploaded Python 3

File details

Details for the file Quark_Xpress_Tags_to_xml-1.1-py3-none-any.whl.

File hashes

Hashes for Quark_Xpress_Tags_to_xml-1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 d5a76be9ce9c33c4cc8ca2d6a2eece1a997ecfd275085e8bc0020c4aa8a58b65
MD5 799e5f9056e16c8ad88f2151420c8504
BLAKE2b-256 c488d34f7f9c6adf2b8ec8bd1743fcb9f7f52d0b027856c171a5e512c227a7c8
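
To check a downloaded wheel against the SHA256 digest listed above, something like the following could be used. This is a minimal sketch using the standard library's hashlib; the helper names sha256_of_file and wheel_matches are made up for this example, and the path you pass in is wherever you saved the file.

```python
import hashlib

# SHA256 digest copied from the table above.
EXPECTED_SHA256 = "d5a76be9ce9c33c4cc8ca2d6a2eece1a997ecfd275085e8bc0020c4aa8a58b65"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wheel_matches(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected
```

Calling wheel_matches('Quark_Xpress_Tags_to_xml-1.1-py3-none-any.whl') from the download directory should return True for an intact file.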
