xml2rfc generates RFCs and IETF drafts from XML document source that follows the DTD in RFC2629.
The IETF uses a specific format for the standards and other documents it publishes as RFCs, and for the draft documents produced while developing documents for publication. A number of tools exist to help format drafts and RFCs according to the existing rules, and this tool, xml2rfc, is one of them. It takes as input an XML file containing the text and meta-information such as author names, and transforms it into suitably formatted output. The input XML file should follow the DTD given in RFC2629 (or its unofficial successor).
The current incarnation of xml2rfc provides output in the following formats: paginated and unpaginated ASCII text, HTML, nroff, and expanded XML. Only the paginated text format is currently (January 2013) acceptable for draft submissions to the IETF.
To install a system-wide version of xml2rfc, download and unpack the xml2rfc distribution package, then cd into the resulting package directory and run:
$ python setup.py install
Alternatively, if you have the ‘pip’ command (‘Pip Installs Packages’) installed, you can run pip to download and install the package:
$ pip install xml2rfc
If you want to perform a local installation for a specific user, you have a couple of options. You may use python’s default location of user site-packages by specifying the flag --user. These locations are:
- UNIX: $HOME/.local/lib/python<ver>/site-packages
- OSX: $HOME/Library/Python/<ver>/lib/python/site-packages
- Windows: %APPDATA%/Python/Python<ver>/site-packages
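To see which user site-packages directory a given interpreter will actually use, Python's standard `site` module can be queried directly. The following is a minimal sketch using only the standard library; the helper name `user_install_locations` is illustrative and not part of xml2rfc.

```python
import site


def user_install_locations():
    """Return the per-user base directory and site-packages directory."""
    # site.getuserbase() and site.getusersitepackages() are standard
    # library calls; their values depend on platform and Python version.
    return {
        "user_base": site.getuserbase(),
        "user_site": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for name, path in sorted(user_install_locations().items()):
        print("%s: %s" % (name, path))
```

The printed `user_site` value should correspond to one of the locations listed above for your platform.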
You can additionally combine the flag --install-scripts with --user to specify a directory on your PATH to install the xml2rfc executable to. For example, the following command:
$ python setup.py install --user --install-scripts=$HOME/bin
will install the xml2rfc library and data to your local site-packages directory, and an executable python script xml2rfc to $HOME/bin.
The option --prefix allows you to specify the base path for all installation files. The setup.py script will exit with an error if your PYTHONPATH is not correctly configured to contain the library path the script tries to install to.
The command is used as follows:
$ python setup.py install --prefix=<path>
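Before installing with --prefix, it can help to check whether the target library directory is already listed in PYTHONPATH. The sketch below assumes the common Unix layout `<prefix>/lib/python<ver>/site-packages`; the real path may differ by platform and distribution, and both helper names are hypothetical.

```python
import os
import sys


def prefix_site_packages(prefix, version=None):
    """Guess the library directory under an installation prefix.

    Assumption: Unix-style layout <prefix>/lib/python<ver>/site-packages.
    """
    if version is None:
        version = "%d.%d" % sys.version_info[:2]
    return os.path.join(prefix, "lib", "python" + version, "site-packages")


def is_on_pythonpath(path, pythonpath=None):
    """Return True if path is one of the entries in PYTHONPATH."""
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    wanted = os.path.normpath(path)
    entries = [os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p]
    return wanted in entries


if __name__ == "__main__":
    target = prefix_site_packages(os.path.expanduser("~/opt"))
    print(target, is_on_pythonpath(target))
```

If the check reports False, add the directory to PYTHONPATH before running the install command.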
For further fine-tuning of the installation behavior, you can get a list of all available options by running:
$ python setup.py install --help
xml2rfc accepts a single XML document as input and outputs to one or more conversion formats.
Basic Usage: xml2rfc SOURCE [options] FORMATS...
The following parameters affect how xml2rfc behaves; none of them are required.
Short        Long                   Description
-C           --clear-cache          purge the cache and exit
-h           --help                 show the help message and exit
-n           --no-dtd               disable DTD validation step
-N           --no-network           don’t use the network to resolve references
-q           --quiet                don’t print anything
-v           --verbose              print extra information
-V           --version              display the version number and exit
-b BASENAME  --basename=BASENAME    specify the base name for output files
-c CACHE     --cache=CACHE          specify an alternate cache directory to write to
-D DATE      --date=DATE            run as if today’s date is DATE (format: yyyy-mm-dd)
-d DTD       --dtd=DTD              specify an alternate dtd file
-o FILENAME  --out=FILENAME         specify an output filename
One or more of the following output formats may be specified. The destination file is named according to the filename given with --out (or the base name given with --basename). If neither is given, xml2rfc creates the file(s) “output.format”. If no format is specified, xml2rfc defaults to paginated text (--text).
Command   Description
--raw     outputs to a text file, unpaginated
--text    outputs to a text file with proper page breaks
--nroff   outputs to an nroff file
--html    outputs to an html file
--exp     outputs to an XML file with all references expanded
- xml2rfc draft.xml
- xml2rfc draft.xml --dtd=alt.dtd --basename=draft-1.0 --text --nroff --html
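When xml2rfc is driven from a script, the command line can be assembled programmatically. The sketch below only uses switches documented in this README; `build_command` and `run_xml2rfc` are hypothetical helper names, not part of the xml2rfc package, and actually running the command assumes the xml2rfc executable is on your PATH.

```python
import subprocess

# Output format switches documented in this README.
KNOWN_FORMATS = ("raw", "text", "nroff", "html", "exp")


def build_command(source, formats=(), basename=None, dtd=None):
    """Build an argv list following 'xml2rfc SOURCE [options] FORMATS...'."""
    cmd = ["xml2rfc", source]
    if dtd is not None:
        cmd.append("--dtd=" + dtd)
    if basename is not None:
        cmd.append("--basename=" + basename)
    for fmt in formats:
        if fmt not in KNOWN_FORMATS:
            raise ValueError("unknown output format: %r" % (fmt,))
        cmd.append("--" + fmt)
    return cmd


def run_xml2rfc(source, **kwargs):
    """Run xml2rfc and return its exit status (needs xml2rfc on PATH)."""
    return subprocess.call(build_command(source, **kwargs))


if __name__ == "__main__":
    # Only prints the command; call run_xml2rfc() to execute it.
    print(" ".join(build_command("draft.xml", ["text", "nroff", "html"],
                                 basename="draft-1.0", dtd="alt.dtd")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.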
Version 2.6.2 (19 Jun 2017)
- Refactored the input file reading to accept files with Mac line endings, using python’s Universal Newline support. This should make xml2rfc deal correctly with input files following DOS, MAC and Linux line-ending conventions.
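As an illustration of what universal newline handling does (this is a sketch, not xml2rfc's actual reading code), opening a file in text mode with `newline=None` makes Python translate DOS, classic Mac and Unix line endings to a single form. The function name `read_lines_universal` is made up for this example.

```python
import io
import os
import tempfile


def read_lines_universal(path):
    """Read a text file, treating CRLF, CR and LF all as line endings."""
    # newline=None enables universal newlines: every line ending is
    # translated to a plain line feed while reading.
    with io.open(path, "r", encoding="utf-8", newline=None) as f:
        return f.read().splitlines()


if __name__ == "__main__":
    # Write a small file that mixes all three line-ending conventions.
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        os.write(fd, b"<rfc>\r\n<front/>\r</rfc>\n")
        os.close(fd)
        print(read_lines_universal(path))
    finally:
        os.remove(path)
```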
Version 2.6.1 (03 Jun 2017)
- Initialised the widow and orphan limit settings from PIs. Did some related refactoring.
- Added an option to show the known PIs, and their default values. Also commented out PIs for which there are no implementations from the internal PI list, and did some refactoring of the option parser setup.
- Changed a number of numeric constants related to page breaking which occurred inline in the code, so that appropriate settings on the writer are used instead: self.page_end_blank_lines, self.orphan_limit, self.widow_limit. Some refactoring.
- Restored support for the quiet= argument to writers, as this is used by other tools that invoke writers, and backwards compatibility is desired.
- Added a mkrelease script.
- Limited the changelog on the pypi page to the 2 latest releases.
Version 2.6.0 (31 May 2017)
The implementation of the ‘authorship’ PI in the original TCL tool would suppress the Author’s Address section when set to “no”, while in the current implementation it removed author information on the first page. Changed to the original semantics. Also author organisation handling on the first page changed to use the submissionType setting to trigger the behaviour described in issue #311. Fixes issue #311 without overlaying this on the ‘authorship’ PI.
Added a check for the ‘needLines’ PI within lists.
Fixed a bug in the code for the ‘sectionorphans’ PI. Added a PI ‘tocpagebreak’ to force a page break before the ToC. This, together with the fix for #311 and needLines within lists, lets xml2rfc produce rfc7754.txt correctly from suitable xml without postprocessing.
Tweaked the eref output in text mode to avoid generating extraneous space characters. Fixes issue #329.
Merged in  from email@example.com: Changed to use the emph character in spanex so that the same thing happens in both html and text if an unknown attribute is given. Fixes issue #297.
Merged in  from firstname.lastname@example.org, with tweaks: Added code to emit sections in two sections, numbered and un-numbered, separately. Then emit the numbered appendixes, the index, the unnumbered appendixes, cref items, authors at the end of the document. Fixes issue #310.
Merged in  from email@example.com: If you have an xref or similar element in an annotation in a reference, any text that follows the xref is absent from the output HTML file. Text files emit correctly. Fixed the html generation.
Merged in  from firstname.lastname@example.org: The HTML rendering for <xref> elements was inconsistent with the text rendering. Fixed this by doing something completely different than is called for in the bug report:
We follow the layout of what the V3 HTML document says to do. This means that we use the child text of the xref when it exists to the exclusion of any generated text. When the child text does not exist then we use the synthesized text string as the text for the anchor element. In all cases the anchor element is emitted with an href of the target. Fixes issue #293.
Merged in  from email@example.com: Added true and false as legal values for the attribute numbered on a section.xml Fixes issue #313
Eliminated redundant PI parsing, now that each element carries the local PI settings.
Merged in patch from firstname.lastname@example.org, see ticket #307: Fixed a problem where if there are no authors, references in HTML are badly formatted. Fixes issue #307 and #309.
Merged in  from email@example.com, with some tweaks to make things work under python 3.x: Don’t split special terms with embedded forward slash on the slash character. Fixes issue #288. Also added code to deal with an extra tab in the middle of a sentence.
Changed the handling of PIs such that each element in the parsed xml tree holds the PI state at that point of the xml document. This provides the ability to use different PI settings at different points in the document. This only makes sense for some PIs, though. The following PIs will now be honoured if changed inside the document, in order to provide more flexibility: ‘multiple-initials’, ‘artworkdelimiter’, ‘compact’, ‘subcompact’, ‘text-list-symbols’, ‘colonspace’.
Honour the way double initials are given in the XML, with or without interleaved spaces. See issue #303, which says of multiple initials ‘… Expectation was that it would exactly match the initials attribute in the XML’
Merged in  from firstname.lastname@example.org: Enabled the multiple-initial PI again. The code now also looks for the PI as the first element of the author element, to apply for that author entry only, with a default of ‘no’.
Merged in  from email@example.com: Added handling for absent author initials for the html generator.
Merged in  from firstname.lastname@example.org: This commit provides support for multiple author initials. Fixes issue #303. Also fixes the issue of extra commas showing up when there are no initials, just a surname.
Merged in  from email@example.com: Changed to emit html not xhtml. Addresses issues #263 and #279.
Updated additional test masters needed to make the tox tests pass, and changed the html encoding and decoding to use utf-8, to work with the unicode and utf-8 tests.
Removed python 2.6 from tox testing (a previous commit added python 3.5).
Don’t let the value of ‘title’ be None, make it an empty string if that happens. Fixes issue #328
Someone might want to set hangIndent to zero. Test the value against None explicitly to permit this to succeed.
Added a --utf8 switch to xml2rfc. In nroff mode, the output will contain utf-8 characters, not [u8FD9] escapes; use groff with the -Kutf8 switch to process the resulting nroff.
Removed all references to xml.resource.org; it is not useful for fallback purposes any more.