
Xml2rfc generates RFCs and IETF drafts from document source in XML according to the IETF xml2rfc v2 and v3 vocabularies.

Project description


The IETF uses a specific format for the standards and other documents it publishes as RFCs, and for the draft documents that are produced when developing documents for publication. A number of different tools exist to facilitate the formatting of drafts and RFCs according to the existing rules, and this tool, xml2rfc, is one of them. It takes as input an XML file that contains the text and meta-information such as author names, and transforms it into suitably formatted output. The input XML file should follow the grammars in RFC 7749 (for v2 documents) or RFC 7991 (for v3 documents). Note that the grammar for v3 is still being refined, and changes will eventually be captured in the bis draft for RFC 7991. Changes not yet captured can be seen in the xml2rfc source file v3.rng.

xml2rfc provides a variety of output formats. See the command line help for a full list of formats. It also provides conversion from v2 to v3, and can run the preptool on its input.


Installation of the python package is done as usual with ‘pip install xml2rfc’, using appropriate switches and/or sudo.

Installation of support libraries for the PDF-formatter

In order to generate PDFs, xml2rfc uses the WeasyPrint module, which depends on external libraries that must be installed as native packages on your platform, separately from the xml2rfc install.

First, install the Cairo, Pango, and GDK-PixBuf library files on your system. See installation instructions on the WeasyPrint Docs:

(Python 3 is not needed if your system Python is 2.7, though).

(On some OS X systems with System Integrity Protection active, you may need to create a symlink from your home directory to the library installation directory (often /opt/local/lib):

ln -s /opt/local/lib ~/lib

in order for weasyprint to find the installed cairo and pango libraries. Whether this is needed depends on whether you used MacPorts or Homebrew to install cairo and pango, and on the Homebrew or MacPorts version.)

Next, install the pycairo and weasyprint Python modules using pip. Depending on your system, you may need to use ‘sudo’ or install in user-specific directories, using the --user switch. On OS X in particular, you may also need to install a newer version of setuptools using --user before weasyprint can be installed. If you install with the --user switch, you may need to also set PYTHONPATH, e.g.,


for Python 2.7.

The basic pip commands (modify as needed according to the text above) are:

pip install 'pycairo>=1.18' 'weasyprint<=0.42.3'

With these installed and available to xml2rfc, the --pdf switch will be enabled.
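Before relying on the PDF switch, it can help to confirm that the Python modules named above are importable. The following is a minimal sketch and not part of xml2rfc: the helper name missing_modules is hypothetical, and it assumes the import names weasyprint (for WeasyPrint) and cairo (for pycairo). It only checks that the modules can be located; it does not verify that the native Cairo, Pango, and GDK-PixBuf libraries actually load.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located.

    Hypothetical helper for illustration; xml2rfc may detect PDF
    support differently.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: "weasyprint" for WeasyPrint, "cairo" for pycairo.
    missing = missing_modules(["weasyprint", "cairo"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("PDF-related Python modules were found.")
```

If a module is reported as not importable after a user-specific install, the PYTHONPATH note above is the first thing to check.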

For PDF output, you also need to install the Noto font set. Download the full set from, and install as appropriate for your platform.


xml2rfc accepts a single XML document as input and outputs to one or more conversion formats.

Basic Usage: xml2rfc SOURCE [options] FORMATS...

Run xml2rfc --help for a full listing of command-line options.
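When xml2rfc is driven from a script, the usage pattern above maps directly onto an argument list. The sketch below is an illustration under stated assumptions: build_xml2rfc_command is a hypothetical helper, draft-example.xml is a placeholder file name, and --pdf is the only format switch used because it is the one named in this description. Check the command-line help for the switches your installed version accepts.

```python
import shlex


def build_xml2rfc_command(source, format_switches, options=()):
    """Assemble an argv list shaped like: xml2rfc SOURCE [options] FORMATS...

    Hypothetical helper; it only builds the list and does not run anything.
    """
    return ["xml2rfc", source, *options, *format_switches]


# "draft-example.xml" is a placeholder file name.
command = build_xml2rfc_command("draft-example.xml", ["--pdf"])
print(shlex.join(command))

# To actually run it (requires xml2rfc to be installed and on PATH):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when the source path contains spaces.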


Version 2.23.0 (27 Jun 2019)

This release adds v2v3 support for conversion of v2 artwork with both text and external image sources to v3 <artset> format, and provides improved cache handling. It also contains a long list of bugfixes. Here is an excerpt from the commit log:

  • Fixed an issue where cache clearing did not consider custom cache locations.

  • Added retry on connection error for external includes. This fixes an issue which has been appearing more often recently, where the first connection to fetch a reference file has failed to provide the right redirect.

  • Added inclusion of metadata.js in the html renderer, for future handling of dynamic metadata (updated-by and obsoleted-by information, for instance). Added a default instance of the metadata javascript file to the distribution, and added a command-line option to specify an alternative version of the javascript metadata script.

  • Added <script> to the list of elements treated as blocks for HTML output formatting purposes.

  • Show ‘US’ as “United States of America” (official name rather than short name according to ISO 3166-1)

  • Changed the default RFC base URL in, and the extension used in

  • Added code to the v2v3 converter to create an <artset> for legacy artwork with both a ‘src’ attribute and text content.

  • Changed <reference> rendering when part of a <referencegroup> to not include the DOI.

  • Fixed a crash that could occur during index building with multiple levels of <references>.

  • Tweaked the text format <artwork> placeholder (when no text format artwork is present) to look at both ‘src’ and ‘originalSrc’ for a URL for alternative artwork.

  • Changed the handling of pilcrow links at end of paragraphs and similar to follow immediately after the content, without wrappable space, to avoid the appearance of having double blank lines when the pilcrow would be wrapped to sit by itself on a line.

  • Added an HTML div to hold artset anchor, fixing an issue where artset anchors would not always be present.

  • Refined the preptool warnings regarding artset/artwork anchor handling.

  • Eliminated toc update work when tocInclude is false.

  • Only apply validation of cache entries as references if they are <reference> entries.

  • Refined the reference URL cache handling for URLs with query arguments, to avoid cache collisions.

  • Eliminated an incorrect check for page break after section header at end of document. Fixes issue #409.

  • Handled authors without address elements for v3 text. Fixes issue #408.

  • Dealt better with <workgroup> without content.

  • Changed the schema to require at least one instance of <artwork> within <artset>. Fixes issue #405.

  • Added a default rendering (code point number) for code points without unicode code point names. Fixes issues #401 and #402.
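The cache-collision item in the list above concerns reference URLs that differ only in their query arguments. The sketch below illustrates the general problem and one possible remedy; it is not xml2rfc's actual cache code. The function names, the key format, and the example.org URLs are all assumptions made for illustration.

```python
from urllib.parse import quote, urlsplit


def path_only_key(url):
    """Naive cache key: the last path segment, ignoring any query string."""
    return urlsplit(url).path.rsplit("/", 1)[-1]


def query_aware_key(url):
    """Cache key that also encodes the query string, if there is one."""
    parts = urlsplit(url)
    name = parts.path.rsplit("/", 1)[-1]
    if not parts.query:
        return name
    # Percent-encode the query so the key is safe to use as a file name.
    return name + "_" + quote(parts.query, safe="")


# Hypothetical URLs that differ only in their query arguments.
a = "https://example.org/refs/lookup.xml?doc=rfc7749"
b = "https://example.org/refs/lookup.xml?doc=rfc7991"

print(path_only_key(a) == path_only_key(b))      # True: the two entries collide
print(query_aware_key(a) == query_aware_key(b))  # False: the entries stay distinct
```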

Version 2.22.3 (08 Apr 2019)

This release brings further tweaks and improvements to the v3 text format output in the area of Table of Contents formatting, and fixes a number of bugs.

  • Tweaked the handling of ToC page numbers in the v3 text format.

  • Tweaked the xml inserted by the preptool for the ToC to give ToC indentation and spacing that better match the legacy text format (and also look better).

  • Added a rewrite of <svg> viewBox values to the simplest acceptable format, to make sure it will be understood by our pdf generation libs.

  • Added a test case for <xref section=…>

  • Tweaked the section label for fragment <xref> rendering to say ‘Appendix’ instead of ‘Section’ for appendix references.

  • Added a pre-rfc1272 reference to elements.xml to test the author initials handling for early RFCs.

  • Tweaked the get_initials() code for use on <reference> authors. Refactored part of the text.render_reference() code to support get_initials() properly.

  • Added special initials handling for RFCs 1272 and below, to apply the single initials handling enforced at that time.
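The single-initial rule for early RFCs mentioned in the list above can be sketched as follows. This is an illustration only, not the get_initials() code from xml2rfc: the function name format_initials, its signature, and the space-separated output style are assumptions. Only the cutoff (RFC 1272 and below) comes from the release note.

```python
SINGLE_INITIAL_CUTOFF = 1272  # RFCs numbered at or below this keep one initial


def format_initials(initials, rfc_number=None):
    """Normalize an initials string, keeping only the first initial for early RFCs.

    Hypothetical helper; the output style ("J. B.") is an assumption.
    """
    parts = initials.replace(".", " ").split()
    if rfc_number is not None and rfc_number <= SINGLE_INITIAL_CUTOFF:
        parts = parts[:1]
    return " ".join(part + "." for part in parts)


print(format_initials("J. B.", rfc_number=791))   # J.
print(format_initials("J. B.", rfc_number=8174))  # J. B.
```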

Version 2.22.2 (27 Mar 2019)

This release fixes a couple of issues discovered by users who are now starting to exercise the v3 processing chain. Thanks for reporting bugs! From the commit log:

  • Fixed an issue with xref rendering where the ‘pageno’ parameter had been given a non-numeric default value. Fixes issue #399.

  • Removed an unnecessary docker volume binding.

  • Added slightly more verbosity when converting v2v3, and tweaked an import statement.

  • Fixed an issue with <references> sections where duplicate name entries could be created during the prep phase, leading to schema validation failure. Fixes issue #398.
