
Pretty-print XML and HTML files in a light, YAML-like, readable format


Unxml

Simplify and "flatten" XML files into a YAML-like readable format.

This is a Rust clone of the original unxml F# tool.

See it in action: a gallery of real-world XML documents, schemas, stylesheets, and Schematron rules rendered with unxml, with original-vs-rendered size comparisons.

Installation

Using uv (Easiest)

Install the published wheel from PyPI as a standalone tool:

uv tool install unxml-rs

This puts the unxml command on your PATH. To try it without installing anything:

uvx --from unxml-rs unxml <xml_file>

Pre-built Binaries (Recommended)

Download the latest release for your platform from the GitHub Releases page:

  • Linux (x86_64): unxml-linux-x86_64.tar.gz
  • Windows (x86_64): unxml-windows-x86_64.zip
  • macOS (Intel): unxml-macos-x86_64.tar.gz
  • macOS (Apple Silicon): unxml-macos-arm64.tar.gz

Extract the archive and place the unxml binary in your PATH.

From Source

git clone https://github.com/vivainio/unxml-rs
cd unxml-rs
cargo install --path .

Using Cargo

cargo install unxml

Usage

unxml <xml_file>

By default, every file gets the generic XML render. Pass --auto to pick the processing mode from each file's extension:

Extension      Mode applied
.xsl, .xslt    --xslt
.sch           --schematron
.xsd           --xsd

An explicit mode flag (--xslt, --schematron, --xsd, --special) always overrides autodetection.
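
The precedence described above (explicit flag first, then --auto, then the generic render) can be sketched in a few lines of Python. This is an illustration only, not the tool's actual Rust code: the function name pick_mode is hypothetical, and the case-insensitive extension match is an assumption.

```python
from pathlib import Path

# Extension-to-mode table copied from the README.
AUTO_MODES = {
    ".xsl": "--xslt",
    ".xslt": "--xslt",
    ".sch": "--schematron",
    ".xsd": "--xsd",
}

def pick_mode(path, explicit=None, auto=False):
    """Return the mode flag to apply, or None for the generic XML render.

    pick_mode is a hypothetical helper; lower-casing the suffix is an
    assumption, not documented behaviour.
    """
    if explicit is not None:      # an explicit flag always wins
        return explicit
    if auto:                      # --auto consults the file extension
        return AUTO_MODES.get(Path(path).suffix.lower())
    return None                   # default: generic XML render
```

For example, pick_mode("schema.xsd", auto=True) selects --xsd, while pick_mode("schema.xsd", explicit="--xslt", auto=True) keeps --xslt.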

Each mode rewrites its vocabulary into a terser pseudocode. The full set of transformations, with side-by-side samples, is documented per format.

Syntax-highlighted output (--bat)

unxml --bat some.xsd      # implies --auto (detects --xsd), pipes through `bat -l unxml`

--bat renders the output through bat using the bundled unxml grammar (see editor/) for paged, colourised display. If bat is not installed it falls back to plain stdout.
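
The fallback can be pictured as the following Python sketch. It only illustrates the "use bat if present, otherwise plain stdout" idea; the function name show and its which parameter are hypothetical, and it assumes the unxml grammar is already registered with bat.

```python
import shutil
import subprocess
import sys

def show(text, which=shutil.which):
    """Page text through bat when it is installed, else write to stdout.

    Returns True if bat was used. The which parameter exists only so the
    lookup can be stubbed out; it is not part of unxml.
    """
    bat = which("bat")
    if bat is None:
        # bat is not on PATH: fall back to plain stdout.
        sys.stdout.write(text)
        return False
    # Same invocation the README mentions: bat -l unxml
    subprocess.run([bat, "-l", "unxml"], input=text, text=True, check=False)
    return True
```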

Hiding noisy namespace prefixes (--hide-ns)

Vocabularies like UBL bury the signal under repeated prefixes (cbc:, cac:). --hide-ns drops the named prefixes from element names — and their xmlns: declarations — so the output reads as bare local names:

unxml --hide-ns cbc,cac invoice.xml   # repeatable and comma-separated

Signal-carrying prefixes you don't list (e.g. ext:, bim:) are kept, so an extension subtree still stands out.
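
As a rough model of what --hide-ns does to names, consider the Python sketch below. All three helper names are hypothetical and the real implementation may differ; the sketch only mirrors the behaviour described above (listed prefixes and their xmlns: declarations disappear, everything else is kept).

```python
def parse_hide_ns(values):
    """Flatten repeated, comma-separated --hide-ns values into a set."""
    return {p.strip() for v in values for p in v.split(",") if p.strip()}

def display_name(qname, hidden):
    """Drop the prefix from 'prefix:local' when the prefix is hidden."""
    prefix, sep, local = qname.partition(":")
    if sep and prefix in hidden:
        return local
    return qname

def keep_attribute(name, hidden):
    """Drop xmlns:<prefix> declarations for hidden prefixes only."""
    return not (name.startswith("xmlns:") and name[len("xmlns:"):] in hidden)
```

With hidden = parse_hide_ns(["cbc,cac"]), the name cbc:ID is shown as ID, while ext:UBLExtensions is left as it is.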

Under --auto/--bat, unxml also sniffs the document type and hides a sensible set automatically. Currently it recognises UBL instance documents (an unprefixed root such as <Invoice> in a UBL namespace) and hides whichever prefixes are bound to the Common Basic/Aggregate Components namespaces. A stylesheet or schema that merely references UBL (e.g. an xsl:stylesheet translating to UBL) is left untouched, since there the prefixes are real syntax.
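
One way such sniffing could work is sketched below in Python. This is an assumption-laden illustration, not unxml's actual detection logic: the function name is hypothetical, the namespace URIs are the standard UBL 2.x ones, and "unprefixed root" is approximated as "the default namespace equals the root element's namespace".

```python
import io
import xml.etree.ElementTree as ET

UBL_PREFIX = "urn:oasis:names:specification:ubl:schema:xsd:"
COMMON_NS = {
    UBL_PREFIX + "CommonBasicComponents-2",
    UBL_PREFIX + "CommonAggregateComponents-2",
}

def sniff_hidden_prefixes(xml_text):
    """Return the prefixes to hide, or an empty set for non-UBL documents."""
    bindings = {}  # prefix -> namespace URI declared on the root element
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            bindings[prefix] = uri
            continue
        # The first "start" event is the root; its tag looks like "{uri}local".
        tag = payload.tag
        root_ns = tag[1:].partition("}")[0] if tag.startswith("{") else ""
        is_ubl_root = root_ns.startswith(UBL_PREFIX) and bindings.get("") == root_ns
        if not is_ubl_root:
            return set()  # e.g. an xsl:stylesheet that merely references UBL
        return {p for p, uri in bindings.items() if p and uri in COMMON_NS}
    return set()
```

Because the check looks at the root element's own namespace, a stylesheet whose root is xsl:stylesheet yields an empty set even if it declares cbc: and cac:.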

Introduction

This command line application was developed for comparing XML files (e.g. database/application state dumps). It takes an XML file and converts it to a YAML-like syntax that is easier to read and compare.

Example

Given this XML input:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.1</TargetFramework>
    <PackAsTool>true</PackAsTool>
    <Description>Unxml 'pretty-prints' xml files in light, yamly, readable format</Description>
    <PackageVersion>1.0.0</PackageVersion>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="FileSystemHelper.fs"/>
    <Compile Include="MutableCol.fs"/>
    <Compile Include="Program.fs" />
  </ItemGroup>
</Project>

The output would be:

Project
  [Sdk]: Microsoft.NET.Sdk
  PropertyGroup
    OutputType = Exe
    TargetFramework = netcoreapp2.1
    PackAsTool = true
    Description = Unxml 'pretty-prints' xml files in light, yamly, readable format
    PackageVersion = 1.0.0
  ItemGroup
    Compile
      [Include]: FileSystemHelper.fs
    Compile
      [Include]: MutableCol.fs
    Compile
      [Include]: Program.fs

Key Features

  • Attributes in Square Brackets: Element attributes are displayed as [attribute]: value
  • Text Content with Equals: Element text content is shown as ElementName = text content
  • Hierarchical Indentation: Nested elements are properly indented
  • Clean Format: Easy to read and compare, great for diffing
  • Inline mixed content: Prose interleaved with short inline elements stays on one readable line
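
The first three rules are simple enough to model in a few lines. The Python sketch below is an illustration using the standard library's ElementTree, not the tool's quick-xml based implementation; the function name flatten is hypothetical. It ignores mixed content, comments, and namespace prefixes (ElementTree expands those to {uri}local).

```python
import xml.etree.ElementTree as ET

def flatten(elem, depth=0):
    """Yield YAML-like lines: attributes in brackets, text after '='."""
    pad = "  " * depth
    text = (elem.text or "").strip()
    # Element text goes on the element's own line, after an equals sign.
    yield f"{pad}{elem.tag} = {text}" if text else f"{pad}{elem.tag}"
    # Attributes are indented one level below the element.
    for name, value in elem.attrib.items():
        yield f"{pad}  [{name}]: {value}"
    # Children recurse one level deeper.
    for child in elem:
        yield from flatten(child, depth + 1)
```

Calling "\n".join(flatten(ET.fromstring(xml_text))) on the Project sample above gives text in the same shape as the documented output.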

Mixed content (prose with inline spans)

Document-style XML interleaves text with small inline elements — a paragraph containing a <command> or a <link>. Flattening every run onto its own line makes such prose hard to read, so unxml keeps it inline as one line of verbatim XML:

<para>The <command>widget</command> daemon keeps its
  <link href="recovery.html">recoverable</link> state in one database.</para>

renders as:

para = The <command>widget</command> daemon keeps its <link href="recovery.html">recoverable</link> state in one database.

An element flows inline when its whole subtree is inline-safe — text interleaved with elements that are themselves inline-safe. A leaf with significant (multi-line) text, such as <programlisting> or <screen>, is not inline-safe, so its parent stays in the flattened block form and the listing keeps its line breaks. Nested inline markup (e.g. <emphasis> wrapping a <command>) collapses all the way up. This applies to the generic XML render; the --xslt/--xsd/--wsdl/--schematron modes use their own formatting.
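
The inline-safe rule can be sketched as a small recursive check. The Python below is a hedged approximation, not unxml's real logic: the helper names are hypothetical, "significant (multi-line) text" is assumed to mean a leaf whose trimmed text contains a line break, and whitespace is collapsed naively (including inside attribute values).

```python
import xml.etree.ElementTree as ET

def is_inline_safe(elem):
    """A leaf is safe if its text fits on one line; a parent is safe
    if every child is safe."""
    if len(elem) == 0:
        return "\n" not in (elem.text or "").strip()
    return all(is_inline_safe(child) for child in elem)

def has_mixed_content(elem):
    """True when non-whitespace text sits next to child elements."""
    if len(elem) == 0:
        return False
    runs = [elem.text or ""] + [child.tail or "" for child in elem]
    return any(run.strip() for run in runs)

def render_inline(elem):
    """Return 'name = verbatim inner XML' on one line, or None when the
    element should stay in the flattened block form."""
    if not (has_mixed_content(elem) and is_inline_safe(elem)):
        return None
    # ET.tostring includes each child's tail, so prose between elements is kept.
    inner = (elem.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in elem
    )
    return f"{elem.tag} = {' '.join(inner.split())}"
```

Applied to the para sample above, render_inline returns the single line shown in the README; a parent containing a multi-line programlisting returns None.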

Technical Details

  • Built with Rust for performance and safety
  • Uses quick-xml for fast XML parsing
  • Uses clap for command-line argument parsing
  • Proper error handling with anyhow

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Creating Releases

The version lives in the git tag, not in Cargo.toml (which stays at the 0.0.0-dev placeholder; the release workflow injects the real version with cargo set-version). Do not bump Cargo.toml or create tags by hand.

To cut a release, let gh create the tag:

gh release create vX.Y.Z --title "Release vX.Y.Z" --notes "…"

The pushed tag triggers the GitHub Actions workflow, which builds binaries and the PyPI wheel for all platforms and attaches them to the release.

The CI workflow runs on every push to ensure code quality with formatting checks, linting, and tests.

