Skip to main content

Makes working with XML feel like you are working with JSON

Project description

# xmltodict

`xmltodict` is a Python module that makes working with XML feel like you are working with [JSON](http://docs.python.org/library/json.html), as in this ["spec"](http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html):

[![Build Status](https://secure.travis-ci.org/martinblech/xmltodict.png)](http://travis-ci.org/martinblech/xmltodict)

```python
>>> doc = xmltodict.parse("""
... <mydocument has="an attribute">
... <and>
... <many>elements</many>
... <many>more elements</many>
... </and>
... <plus a="complex">
... element as well
... </plus>
... </mydocument>
... """)
>>>
>>> doc['mydocument']['@has']
u'an attribute'
>>> doc['mydocument']['and']['many']
[u'elements', u'more elements']
>>> doc['mydocument']['plus']['@a']
u'complex'
>>> doc['mydocument']['plus']['#text']
u'element as well'
```

## Namespace support

By default, `xmltodict` does no XML namespace processing (it just treats namespace declarations as regular node attributes), but passing `process_namespaces=True` will make it expand namespaces for you:

```python
>>> xml = """
... <root xmlns="http://defaultns.com/"
... xmlns:a="http://a.com/"
... xmlns:b="http://b.com/">
... <x>1</x>
... <a:y>2</a:y>
... <b:z>3</b:z>
... </root>
... """
>>> assert xmltodict.parse(xml, process_namespaces=True) == {
... 'http://defaultns.com/:root': {
... 'http://defaultns.com/:x': '1',
... 'http://a.com/:y': '2',
... 'http://b.com/:z': '3',
... }
... }
True
```

It also lets you collapse certain namespaces to shorthand prefixes, or skip them altogether:

```python
>>> namespaces = {
... 'http://defaultns.com/': None, # skip this namespace
... 'http://a.com/': 'ns_a', # collapse "http://a.com/" -> "ns_a"
... }
>>> assert xmltodict.parse(xml, namespaces=namespaces) == {
... 'root': {
... 'x': '1',
... 'ns_a:y': '2',
... 'http://b.com/:z': '3',
... },
... }
True
```

## Streaming mode

`xmltodict` is very fast ([Expat](http://docs.python.org/library/pyexpat.html)-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like [Discogs](http://discogs.com/data/) or [Wikipedia](http://dumps.wikimedia.org/):

```python
>>> def handle_artist(_, artist):
... print artist['name']
... return True
>>>
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
... item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
```

It can also be used from the command line to pipe objects to a script like this:

```python
import sys, marshal
while True:
_, article = marshal.load(sys.stdin)
print article['title']
```

```sh
$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...
```

Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:

```sh
$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | gzip > enwiki.dicts.gz
```

And you reuse the dicts with every script that needs them:

```sh
$ cat enwiki.dicts.gz | gunzip | script1.py
$ cat enwiki.dicts.gz | gunzip | script2.py
...
```

## Roundtripping

You can also convert in the other direction, using the `unparse()` method:

```python
>>> mydict = {
... 'page': {
... 'title': 'King Crimson',
... 'ns': 0,
... 'revision': {
... 'id': 547909091,
... }
... }
... }
>>> print unparse(mydict)
<?xml version="1.0" encoding="utf-8"?>
<page><ns>0</ns><revision><id>547909091</id></revision><title>King Crimson</title></page>
```

## Ok, how do I get it?

You just need to

```sh
$ pip install xmltodict
```

There is an [official Fedora package for xmltodict](https://admin.fedoraproject.org/pkgdb/acls/name/python-xmltodict). If you are on Fedora or RHEL, you can do:

```sh
$ sudo yum install python-xmltodict
```

## Donate

If you love `xmltodict`, consider supporting the author [on Gittip](https://www.gittip.com/martinblech/).

Release history Release notifications

History Node

0.11.0

History Node

0.10.2

History Node

0.10.1

History Node

0.10.0

History Node

0.9.2

History Node

0.9.1

History Node

0.9.0

History Node

0.8.7

History Node

0.8.6

This version
History Node

0.8.5

History Node

0.8.4

History Node

0.8.3

History Node

0.8.2

History Node

0.8.1

History Node

0.7.0

History Node

0.6.0

History Node

0.5.1

History Node

0.5.0

History Node

0.4.6

History Node

0.4.4

History Node

0.4.3

History Node

0.4.2

History Node

0.4.1

History Node

0.4

History Node

0.3

History Node

0.2

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
xmltodict-0.8.5.tar.gz (34.0 kB) Copy SHA256 hash SHA256 Source None Feb 3, 2014

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging CloudAMQP CloudAMQP RabbitMQ AWS AWS Cloud computing Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page