Makes working with XML feel like you are working with JSON

xmltodict

xmltodict is a Python module that makes working with XML feel like you are working with JSON, as in this "spec":

>>> import json
>>> import xmltodict
>>> print(json.dumps(xmltodict.parse("""
...  <mydocument has="an attribute">
...    <and>
...      <many>elements</many>
...      <many>more elements</many>
...    </and>
...    <plus a="complex">
...      element as well
...    </plus>
...  </mydocument>
...  """), indent=4))
{
    "mydocument": {
        "@has": "an attribute",
        "and": {
            "many": [
                "elements",
                "more elements"
            ]
        },
        "plus": {
            "@a": "complex",
            "#text": "element as well"
        }
    }
}
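Once parsed, the result can be navigated like any nested dict. The following is a minimal sketch of that access pattern, reusing the document from the "spec" above; the variable names are only illustrative.

```python
import xmltodict

# Sample document modeled on the "spec" above.
XML = """
<mydocument has="an attribute">
  <and>
    <many>elements</many>
    <many>more elements</many>
  </and>
  <plus a="complex">
    element as well
  </plus>
</mydocument>
"""

doc = xmltodict.parse(XML)
root = doc["mydocument"]

attribute = root["@has"]        # attributes are keyed with the "@" prefix
repeated = root["and"]["many"]  # repeated sibling tags become a list
text = root["plus"]["#text"]    # text next to attributes lives under "#text"

print(attribute, repeated, text)
```

Note that a tag appearing only once parses to a plain value rather than a one-item list, so code that walks such data may need to handle both shapes.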

Namespace support

By default, xmltodict does no XML namespace processing (it just treats namespace declarations as regular node attributes), but passing process_namespaces=True will make it expand namespaces for you:

>>> xml = """
... <root xmlns="http://defaultns.com/"
...       xmlns:a="http://a.com/"
...       xmlns:b="http://b.com/">
...   <x>1</x>
...   <a:y>2</a:y>
...   <b:z>3</b:z>
... </root>
... """
>>> xmltodict.parse(xml, process_namespaces=True) == {
...     'http://defaultns.com/:root': {
...         'http://defaultns.com/:x': '1',
...         'http://a.com/:y': '2',
...         'http://b.com/:z': '3',
...     }
... }
True

It also lets you collapse certain namespaces to shorthand prefixes, or skip them altogether:

>>> namespaces = {
...     'http://defaultns.com/': None, # skip this namespace
...     'http://a.com/': 'ns_a', # collapse "http://a.com/" -> "ns_a"
... }
>>> xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces) == {
...     'root': {
...         'x': '1',
...         'ns_a:y': '2',
...         'http://b.com/:z': '3',
...     },
... }
True
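As a further illustration, here is a small sketch that collapses namespaces on a feed-like document and then reads values by their shortened keys. The sample XML and the "dc" prefix are assumptions made for the example, not part of xmltodict.

```python
import xmltodict

# Hypothetical feed-like document with a default and a prefixed namespace.
XML = """
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Example</title>
  <dc:creator>Jane</dc:creator>
</feed>
"""

NAMESPACES = {
    "http://www.w3.org/2005/Atom": None,       # skip the default namespace
    "http://purl.org/dc/elements/1.1/": "dc",  # collapse to the "dc" prefix
}

doc = xmltodict.parse(XML, process_namespaces=True, namespaces=NAMESPACES)

title = doc["feed"]["title"]
creator = doc["feed"]["dc:creator"]
print(title, creator)
```

Collapsing to short prefixes keeps lookups readable while still distinguishing tags that share a local name across namespaces.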

Streaming mode

xmltodict is very fast (Expat-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like Discogs or Wikipedia:

>>> from gzip import GzipFile
>>> import xmltodict
>>> def handle_artist(_, artist):
...     print(artist['name'])
...     return True
>>> 
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
...     item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
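The same pattern can be tried without a large dump. This sketch streams a tiny in-memory document, assuming a hypothetical two-artist sample, and collects names through the callback instead of printing them.

```python
import io

import xmltodict

# Hypothetical stand-in for a large dump: artists sit at depth 2.
XML = b"""<artists>
  <artist><name>A Perfect Circle</name></artist>
  <artist><name>King Crimson</name></artist>
</artists>"""

names = []


def collect_name(path, artist):
    # Called once per element at item_depth; path describes its ancestors.
    names.append(artist["name"])
    return True  # a true return value tells the parser to keep going


# In streaming mode items are handed to the callback as they complete,
# so the whole document is never held in memory as one dict.
xmltodict.parse(io.BytesIO(XML), item_depth=2, item_callback=collect_name)
print(names)
```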

It can also be used from the command line to pipe objects to a script like this:

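import sys, marshal
while True:
    try:
        # Each record is a marshalled (path, item) pair read from binary stdin.
        _, article = marshal.load(sys.stdin.buffer)
    except EOFError:
        break  # end of input
    print(article['title'])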

$ bunzip2 -c enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...

Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:

$ bunzip2 -c enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | gzip > enwiki.dicts.gz

And you reuse the dicts with every script that needs them:

$ gunzip -c enwiki.dicts.gz | script1.py
$ gunzip -c enwiki.dicts.gz | script2.py
...
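A consuming script mostly needs a loop that reads marshalled records until the stream ends. Below is a minimal sketch of such a reader; the helper name iter_items and the sample records are hypothetical, and an in-memory buffer stands in for the cached file.

```python
import io
import marshal


def iter_items(stream):
    """Yield (path, item) pairs from a binary stream of marshalled records."""
    while True:
        try:
            yield marshal.load(stream)
        except EOFError:
            return  # clean end of stream


# Made-up records in the (path, item) shape, written back to back.
records = [
    ([("mediawiki", None), ("page", None)], {"title": "Anarchism"}),
    ([("mediawiki", None), ("page", None)], {"title": "Autism"}),
]
buffer = io.BytesIO()
for record in records:
    marshal.dump(record, buffer)
buffer.seek(0)

titles = [item["title"] for _, item in iter_items(buffer)]
print(titles)
```

In a real script the stream would be sys.stdin.buffer or an opened gzip file rather than the BytesIO used here.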

Roundtripping

You can also convert in the other direction, using the unparse() function:

>>> mydict = {
...     'response': {
...         'status': 'good',
...         'last_updated': '2014-02-16T23:10:12Z',
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<response>
	<status>good</status>
	<last_updated>2014-02-16T23:10:12Z</last_updated>
</response>
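To make the round trip concrete, this short sketch serializes a dict and parses the result back, then compares the two. It assumes all leaf values are strings, since parsing returns text values as strings.

```python
import xmltodict

original = {
    "response": {
        "status": "good",
        "last_updated": "2014-02-16T23:10:12Z",
    }
}

xml = xmltodict.unparse(original)  # dict -> XML string
restored = xmltodict.parse(xml)    # XML string -> dict

print(restored == original)
```

A dict holding non-string leaves, such as integers, would come back with those values as strings, so the comparison would no longer hold.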

Text values for nodes can be specified with the cdata_key key in the Python dict, while node attributes can be specified by prefixing the key name with attr_prefix. The default value for attr_prefix is @ and the default value for cdata_key is #text.

>>> import xmltodict
>>> 
>>> mydict = {
...     'text': {
...         '@color':'red',
...         '@stroke':'2',
...         '#text':'This is a test'
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<text color="red" stroke="2">This is a test</text>
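Both markers can be changed. The sketch below first serializes with the defaults, then parses the result with a custom attr_prefix and cdata_key; the replacement markers "_" and "value" are arbitrary choices for illustration.

```python
import xmltodict

doc = {"text": {"@color": "red", "#text": "This is a test"}}

# full_document=False leaves out the XML declaration line.
xml = xmltodict.unparse(doc, full_document=False)
print(xml)

# Parse the same XML using different markers for attributes and text.
custom = xmltodict.parse(xml, attr_prefix="_", cdata_key="value")
print(custom)
```

Custom markers can help when "@" or "#" in key names is awkward for downstream code, as long as the same markers are used consistently in both directions.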

Lists that are specified under a key in a dictionary use the key as a tag for each item. But if a list does not have a parent key, for example if a list exists inside another list, it does not have a tag to use and the items are converted to a string, as shown in the example below. To give tags to nested lists, use the expand_iter keyword argument to provide a tag, as demonstrated below. Note that using expand_iter will break roundtripping.

>>> mydict = {
...     "line": {
...         "points": [
...             [1, 5],
...             [2, 6],
...         ]
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<line>
	<points>[1, 5]</points>
	<points>[2, 6]</points>
</line>
>>> print(xmltodict.unparse(mydict, pretty=True, expand_iter="coord"))
<?xml version="1.0" encoding="utf-8"?>
<line>
	<points>
		<coord>1</coord>
		<coord>5</coord>
	</points>
	<points>
		<coord>2</coord>
		<coord>6</coord>
	</points>
</line>

Ok, how do I get it?

Using PyPI

You just need to

$ pip install xmltodict

Using conda

For installing xmltodict using Anaconda/Miniconda (conda) from the conda-forge channel all you need to do is:

$ conda install -c conda-forge xmltodict

RPM-based distro (Fedora, RHEL, …)

There is an official Fedora package for xmltodict.

$ sudo yum install python-xmltodict

Arch Linux

There is an official Arch Linux package for xmltodict.

$ sudo pacman -S python-xmltodict

Debian-based distro (Debian, Ubuntu, …)

There is an official Debian package for xmltodict.

$ sudo apt install python-xmltodict

FreeBSD

There is an official FreeBSD port for xmltodict.

$ pkg install py36-xmltodict

openSUSE/SLE (SLE 15, Leap 15, Tumbleweed)

There is an official openSUSE package for xmltodict.

# Python2
$ zypper in python2-xmltodict

# Python3
$ zypper in python3-xmltodict

Release history

Version	Release date
0.14.2 (this version)	Oct 16, 2024
0.14.1	Oct 9, 2024
0.14.0 (yanked)	Oct 8, 2024
0.13.0	May 8, 2022
0.12.0	Feb 11, 2019
0.11.0	Apr 27, 2017
0.10.2	Jun 2, 2016
0.10.1	Feb 23, 2016
0.10.0	Feb 23, 2016
0.9.2	Feb 4, 2015
0.9.1	Jan 18, 2015
0.9.0	Apr 17, 2014
0.8.7	Mar 27, 2014
0.8.6	Feb 16, 2014
0.8.5	Feb 3, 2014
0.8.4	Feb 3, 2014
0.8.3	Oct 21, 2013
0.8.2	Oct 21, 2013
0.8.1	Oct 12, 2013
0.7.0	Aug 25, 2013
0.6.0	Aug 19, 2013
0.5.1	Jul 15, 2013
0.5.0	May 25, 2013
0.4.6	Mar 2, 2013
0.4.4	Jan 24, 2013
0.4.3	Jan 11, 2013
0.4.2	Jan 4, 2013
0.4.1	Dec 20, 2012
0.4	Dec 13, 2012
0.3	Nov 14, 2012
0.2	Aug 28, 2012

Download files

Download the file for your platform.

Source Distribution

xmltodict-0.14.2.tar.gz (51.9 kB)

Uploaded Oct 16, 2024 Source

Built Distribution

xmltodict-0.14.2-py2.py3-none-any.whl (10.0 kB)

Uploaded Oct 16, 2024 Python 2, Python 3

File details

Details for the file xmltodict-0.14.2.tar.gz.

File metadata

Download URL: xmltodict-0.14.2.tar.gz
Upload date: Oct 16, 2024
Size: 51.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for xmltodict-0.14.2.tar.gz
Algorithm	Hash digest
SHA256	`201e7c28bb210e374999d1dde6382923ab0ed1a8a5faeece48ab525b7810a553`
MD5	`6e0d94bf858b3c2ff3daeed487eedc2a`
BLAKE2b-256	`500551dcca9a9bf5e1bce52582683ce50980bcadbc4fa5143b9f2b19ab99958f`

File details

Details for the file xmltodict-0.14.2-py2.py3-none-any.whl.

File metadata

Download URL: xmltodict-0.14.2-py2.py3-none-any.whl
Upload date: Oct 16, 2024
Size: 10.0 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for xmltodict-0.14.2-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`20cc7d723ed729276e808f26fb6b3599f786cbc37e06c65e192ba77c40f20aac`
MD5	`f745d92f448a40001945254dc818118e`
BLAKE2b-256	`d645fc303eb433e8a2a271739c98e953728422fa61a3c1f36077a49e395c972e`
