Fast XML flattening library with Python bindings

fast-xml-flattener

Flatten nested XML into CSV, JSON, Parquet, or Python dicts — in milliseconds, not seconds.

fast-xml-flattener is a Rust-powered Python library that converts XML documents into flat, analysis-ready representations. It uses a zero-copy streaming parser and builds output structures in a single tree walk, with no intermediate serde_json::Value or DOM allocation. The result: throughput that leaves pure-Python parsers far behind.


Why fast-xml-flattener?

XML → flat dict (median of 7 runs, CPython 3.13)

Library                      0.5 MB   5.4 MB   27 MB
fast-xml-flattener           11 ms    225 ms   1 089 ms
lxml + manual flatten        27 ms    407 ms   2 108 ms
xmltodict + manual flatten   63 ms    997 ms   4 952 ms

XML → flat JSON string (median of 7 runs)

Library                  0.5 MB   5.4 MB     27 MB
fast-xml-flattener       13 ms    164 ms     884 ms
xmltodict + json.dumps   93 ms    1 147 ms   5 374 ms

Measured on a Dell Vostro (Intel Core i7-1260P, 64 GB RAM) running Linux with CPython 3.13. The input is synthetic XML with nested records (id, user, address, order fields). See benches/benchmark.py.

In these runs fast-xml-flattener is roughly 4–7× faster than xmltodict and 1.8–2.5× faster than lxml. The speedup ratio is largest on the 0.5 MB input, while the absolute time saved grows with document size. The advantage comes from the single-pass Rust parser, which allocates no DOM. The GIL is held only for dict-returning functions (to_dict, to_flatten_dict); all other outputs release it entirely, so string, CSV, and Parquet conversions can run in parallel from thread pools.
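The speedup figures can be rechecked directly from the two tables. The snippet below is only arithmetic on the published medians; the variable and function names are illustrative and not part of the library.

```python
# Median timings in milliseconds, copied from the benchmark tables above.
# Columns: 0.5 MB, 5.4 MB, 27 MB inputs.
flat_dict_ms = {
    "fast-xml-flattener": [11, 225, 1089],
    "lxml + manual flatten": [27, 407, 2108],
    "xmltodict + manual flatten": [63, 997, 4952],
}
flat_json_ms = {
    "fast-xml-flattener": [13, 164, 884],
    "xmltodict + json.dumps": [93, 1147, 5374],
}


def speedups(table, baseline="fast-xml-flattener"):
    """Return {library: [ratio per input size]} relative to the baseline row."""
    base = table[baseline]
    return {
        name: [round(t / b, 2) for t, b in zip(times, base)]
        for name, times in table.items()
        if name != baseline
    }


dict_ratios = speedups(flat_dict_ms)
json_ratios = speedups(flat_json_ms)
for name, ratios in {**dict_ratios, **json_ratios}.items():
    print(f"{name}: {ratios}")
```

Each printed list holds one ratio per input size, so it is easy to see how the speedup changes between the 0.5 MB and 27 MB documents.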


Features

  • Convert nested XML into JSON, flattened JSON, a native Python dict, a flattened dict, CSV, or Parquet
  • Dot-notation object access — navigate parsed XML like obj.user.address.city with XmlObject
  • Single-pass streaming parser — no DOM, no intermediate Value allocation
  • GIL-free for string/CSV/Parquet outputs — safe to use from thread pools
  • xmltodict-compatible semantics: @attr, #text, auto-list for repeated tags
  • Namespace stripping, CDATA, entity references, comments — all handled correctly
  • Supports Python 3.10+
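Because the string, CSV, and Parquet outputs release the GIL, several documents can be converted concurrently from a thread pool. The following is a minimal sketch that takes the converter as a plain callable; `convert_many` and its parameters are hypothetical helpers, not part of the library.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def convert_many(
    documents: Iterable[str],
    convert: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Apply `convert` to every document on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(convert, documents))
```

With the library installed, the intended call would look like `convert_many(xml_docs, fxf.to_flatten_json)`. A converter that holds the GIL still works here, but its calls run one at a time.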

Output Formats

Function                                    Returns     Description
to_json(xml)                                str         1:1 JSON preserving XML structure (@attr, #text)
to_flatten_json(xml, separator=".")         str         Flat JSON with dot-notation keys (user.address.city)
to_dict(xml)                                dict        1:1 nested Python dict, built directly in Rust with no JSON round-trip
to_flatten_dict(xml, separator=".")         dict        Flat Python dict with dot-notation keys
to_csv(xml, include_attrs=True)             str         Tabular CSV, one row per XML record
to_parquet(xml, path, include_attrs=True)   None        Columnar Parquet file for big-data workflows
to_object(xml)                              XmlObject   Dot-notation Python object with attribute and text access
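The @attr, #text, and auto-list conventions used by the 1:1 outputs can be illustrated with the standard library. This is a pure-Python sketch of the conventions only, not the library's Rust implementation; `element_to_value` is a hypothetical helper, and details such as whitespace, empty elements, and namespaces may differ from the real output.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert an element to a str (pure-text leaf) or a dict of its contents."""
    # Attributes become keys with an @ prefix.
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_value(child)
        if child.tag in node:
            # Repeated tag: promote the existing entry to a list and append.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        # No attributes and no children: a pure-text leaf.
        return text
    if text:
        node["#text"] = text
    return node


doc = (
    '<catalog><book id="1"><title>A</title></book>'
    '<book id="2"><title>B</title></book></catalog>'
)
result = element_to_value(ET.fromstring(doc))
print(result)
# {'book': [{'@id': '1', 'title': 'A'}, {'@id': '2', 'title': 'B'}]}
```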

Installation

pip install fast-xml-flattener

Quick Start

import fast_xml_flattener as fxf

xml = """
<root>
  <user>
    <id>1</id>
    <name>Alice</name>
    <address>
      <city>Warsaw</city>
      <zip>00-001</zip>
    </address>
  </user>
</root>
"""

# 1:1 JSON string — preserves nesting
result = fxf.to_json(xml)
# '{"user": {"id": "1", "name": "Alice", "address": {"city": "Warsaw", "zip": "00-001"}}}'

# Flattened JSON string with dot-notation keys
flat = fxf.to_flatten_json(xml)
# '{"user.id": "1", "user.name": "Alice", "user.address.city": "Warsaw", "user.address.zip": "00-001"}'

# Native Python dict (1:1 nested) — no JSON round-trip
d = fxf.to_dict(xml)
print(d["user"]["name"])             # Alice
print(d["user"]["address"]["city"])  # Warsaw

# Flattened native Python dict
fd = fxf.to_flatten_dict(xml, separator=".")
print(fd["user.address.city"])       # Warsaw

# CSV: one row per <user> element (named csv_text to avoid shadowing the stdlib csv module)
csv_text = fxf.to_csv(xml, include_attrs=True)

# Parquet — ready for pandas / Spark / DuckDB
fxf.to_parquet(xml, path="output.parquet", include_attrs=True)

# Dot-notation object access
obj = fxf.to_object(xml)
print(obj.root.user.name)              # Alice
print(obj.root.user.address.city)      # Warsaw
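The flattened outputs join nested keys with the separator. Below is a minimal pure-Python sketch of that idea, applied to the nested dict from the example. `flatten` is a hypothetical helper, and keying list items by their index is an assumption made for this sketch, not documented library behaviour.

```python
def flatten(value, separator=".", prefix=""):
    """Flatten nested dicts and lists into a single dict with joined keys."""
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # Assumption: list items are keyed by their index.
        items = ((str(i), v) for i, v in enumerate(value))
    else:
        flat[prefix] = value
        return flat
    for key, child in items:
        path = f"{prefix}{separator}{key}" if prefix else key
        flat.update(flatten(child, separator, path))
    return flat


nested = {
    "user": {
        "id": "1",
        "name": "Alice",
        "address": {"city": "Warsaw", "zip": "00-001"},
    }
}
flat = flatten(nested)
print(flat["user.address.city"])  # Warsaw
```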

XmlObject — dot-notation access

to_object() parses XML and returns an XmlObject that wraps the result of to_dict(). XML parsing is done in Rust; the object layer adds minimal Python overhead.

xml = '''
<catalog>
  <book id="1" lang="en">
    <title>Clean Code</title>
    <author>Robert C. Martin</author>
  </book>
  <book id="2" lang="pl">
    <title>Czysty Kod</title>
    <author>Robert C. Martin</author>
  </book>
</catalog>
'''

obj = fxf.to_object(xml)

# Navigate nested structure with dot notation
books = obj.catalog.book          # list of XmlObject (repeated tag)
print(books[0].title)             # Clean Code
print(books[1].title)             # Czysty Kod

# Access XML attributes via _attrs (no @ prefix)
print(books[0]._attrs)            # {"id": "1", "lang": "en"}
print(books[0]._attrs["lang"])    # en

# Text content is exposed via _text on elements that have both text and attributes.
# title is a pure-text leaf here, so books[0].title is already a plain str.
print(books[0].title)             # Clean Code

# Get the underlying raw dict via .raw
print(books[0].raw)               # {"@id": "1", "@lang": "en", "title": "Clean Code", ...}

Property / access   Returns                              Description
obj.child_tag       XmlObject, list[XmlObject], or str   Child element; a list when the tag repeats, a str for pure-text leaves
obj._attrs          dict[str, str]                       XML attributes of this element (keys without the @ prefix)
obj._text           str | None                           Text content (#text) of this element
obj.raw             dict | str                           Underlying value from to_dict(); a str for pure-text leaves
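These access rules can be pictured as a thin wrapper over the nested dict. The class below is an illustrative sketch, not the library's XmlObject; `DotView` and `_wrap` are hypothetical names.

```python
class DotView:
    """Dot-notation view over a dict that uses @attr / #text keys."""

    def __init__(self, raw):
        self.raw = raw  # underlying dict

    @property
    def _attrs(self):
        # Attribute keys are stored with an @ prefix; expose them without it.
        return {k[1:]: v for k, v in self.raw.items() if k.startswith("@")}

    @property
    def _text(self):
        return self.raw.get("#text")

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, i.e. for child tags.
        try:
            value = self.__dict__["raw"][name]
        except KeyError:
            raise AttributeError(name) from None
        return _wrap(value)


def _wrap(value):
    if isinstance(value, dict):
        return DotView(value)
    if isinstance(value, list):
        return [_wrap(item) for item in value]
    return value  # pure-text leaf stays a str


book = DotView({"@id": "1", "@lang": "en", "title": "Clean Code"})
print(book.title)   # Clean Code
print(book._attrs)  # {'id': '1', 'lang': 'en'}
```

Implementing child access in `__getattr__` rather than `__getattribute__` keeps `raw`, `_attrs`, and `_text` on the normal lookup path, so a child tag only resolves when no real attribute of that name exists.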

Loading Parquet with pandas

import pandas as pd

df = pd.read_parquet("output.parquet")
print(df.head())

Using with DuckDB

import duckdb

duckdb.sql("SELECT * FROM 'output.parquet'").show()

Development

Requirements

  • Python 3.10+ (3.13 recommended for development)
  • Rust (stable)
  • maturin

Setup with pyenv (recommended)

# Install pyenv: https://github.com/pyenv/pyenv
pyenv install 3.13
pyenv local 3.13

# Create and activate a virtual environment (requires the pyenv-virtualenv plugin)
pyenv virtualenv 3.13 xml-flattener
pyenv activate xml-flattener

# Install uv and dev dependencies
pip install uv
uv pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Setup without pyenv

python -m venv venv
source venv/bin/activate
pip install uv
uv pip install -e ".[dev]"
pre-commit install

Build

uv run maturin develop   # development build
maturin build --release  # release wheel

Tests

uv run pytest            # Python integration tests (52 cases)
cargo test               # Rust unit tests (25 cases)
uv run ruff check .      # linting
cargo clippy --all-targets -- -D warnings  # Rust linting

Releasing

Releases are fully automated. Include one of these tags anywhere in the commit message (or in the PR title when squash-merging) to trigger a release:

Tag       Bump                    Example
[fix]     patch (0.1.0 → 0.1.1)   fix null value in CSV output [fix]
[minor]   minor (0.1.0 → 0.2.0)   add streaming API [minor]
[major]   major (0.1.0 → 1.0.0)   redesign public API [major]
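The tag-to-bump mapping in the table can be sketched as follows. This is an illustrative reimplementation, assuming plain semantic-versioning rules and assuming that the largest bump wins when several tags appear in one message. The actual release workflow is not shown in this README, and `bump_version` is a hypothetical name.

```python
import re
from typing import Optional

# Checked in order, so the largest bump wins if several tags are present (assumption).
_TAGS = (("[major]", 0), ("[minor]", 1), ("[fix]", 2))


def bump_version(version: str, commit_message: str) -> Optional[str]:
    """Return the bumped X.Y.Z version, or None if the message has no release tag."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not an X.Y.Z version: {version!r}")
    parts = [int(p) for p in match.groups()]
    for tag, index in _TAGS:
        if tag in commit_message:
            parts[index] += 1
            # Reset every component to the right of the bumped one.
            parts[index + 1:] = [0] * (2 - index)
            return ".".join(map(str, parts))
    return None


print(bump_version("0.1.0", "fix null value in CSV output [fix]"))  # 0.1.1
print(bump_version("0.1.0", "add streaming API [minor]"))           # 0.2.0
print(bump_version("0.1.0", "redesign public API [major]"))         # 1.0.0
```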

The release pipeline then:

  1. Bumps version in Cargo.toml and pyproject.toml
  2. Prepends an entry to CHANGELOG.md
  3. Commits (chore: bump version to X.Y.Z) and creates a vX.Y.Z git tag
  4. Builds wheels for Linux x86_64/aarch64, macOS universal2, Windows x86_64
  5. Publishes to PyPI via OIDC trusted publishing (no secrets needed)
  6. Creates a GitHub Release with the changelog entry and wheel artifacts

One-time PyPI setup (trusted publishing)

  1. Go to PyPI → Your projects → fast-xml-flattener → Publishing → Add a publisher
  2. Set: GitHub owner andree0, repo fast-xml-flattener, workflow release.yml, environment pypi
  3. On GitHub: Settings → Environments → New environment named pypi

No API tokens or secrets are required — OIDC handles authentication.

License

MIT


Download files

Source Distribution

  • fast_xml_flattener-0.1.6.tar.gz (39.1 kB): source

Built Distributions

  • fast_xml_flattener-0.1.6-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB): PyPy, manylinux (glibc 2.17+), x86-64
  • fast_xml_flattener-0.1.6-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.14t, manylinux (glibc 2.17+), ARM64
  • fast_xml_flattener-0.1.6-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB): CPython 3.14, manylinux (glibc 2.17+), x86-64
  • fast_xml_flattener-0.1.6-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.14, manylinux (glibc 2.17+), ARM64
  • fast_xml_flattener-0.1.6-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.13t, manylinux (glibc 2.17+), ARM64
  • fast_xml_flattener-0.1.6-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB): CPython 3.13, manylinux (glibc 2.17+), x86-64
  • fast_xml_flattener-0.1.6-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.13, manylinux (glibc 2.17+), ARM64
  • fast_xml_flattener-0.1.6-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (2.5 MB): CPython 3.13; macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64
  • fast_xml_flattener-0.1.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB): CPython 3.12, manylinux (glibc 2.17+), x86-64
  • fast_xml_flattener-0.1.6-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.12, manylinux (glibc 2.17+), ARM64
  • fast_xml_flattener-0.1.6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • fast_xml_flattener-0.1.6-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.11, manylinux (glibc 2.17+), ARM64
  • fast_xml_flattener-0.1.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • fast_xml_flattener-0.1.6-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.10, manylinux (glibc 2.17+), ARM64

File details

Details for the file fast_xml_flattener-0.1.6.tar.gz.

File metadata

  • Download URL: fast_xml_flattener-0.1.6.tar.gz
  • Upload date:
  • Size: 39.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.13.1

File hashes

Hashes for fast_xml_flattener-0.1.6.tar.gz

Algorithm     Hash digest
SHA256        46e7d1aa15a924ad3812ba8a2d5d59fce1a669d3e495215a1c154f005c2c52ce
MD5           b54dc1b5e5e5b1cc83ea53b2143a2ea3
BLAKE2b-256   cebb21a2410c1d846a1a997cb56eff1bbf3f51223c91b15df1502336da95fc43
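
A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch; `sha256_of` is a hypothetical helper, and the path is wherever the file was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


EXPECTED = "46e7d1aa15a924ad3812ba8a2d5d59fce1a669d3e495215a1c154f005c2c52ce"
# Example check, once the archive is in the current directory:
# assert sha256_of("fast_xml_flattener-0.1.6.tar.gz") == EXPECTED
```

Reading in chunks keeps memory use constant regardless of file size.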

