# xmltodict

[![Build Status](https://secure.travis-ci.org/martinblech/xmltodict.png)](http://travis-ci.org/martinblech/xmltodict)

`xmltodict` is a Python module that makes working with XML feel like you are working with [JSON](http://docs.python.org/library/json.html), as in this ["spec"](http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html):

```python
>>> import xmltodict
>>> doc = xmltodict.parse("""
... <mydocument has="an attribute">
...   <and>
...     <many>elements</many>
...     <many>more elements</many>
...   </and>
...   <plus a="complex">
...     element as well
...   </plus>
... </mydocument>
... """)
>>>
>>> doc['mydocument']['@has']
'an attribute'
>>> doc['mydocument']['and']['many']
['elements', 'more elements']
>>> doc['mydocument']['plus']['@a']
'complex'
>>> doc['mydocument']['plus']['#text']
'element as well'
```

It's very fast ([Expat](http://docs.python.org/library/pyexpat.html)-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like [Discogs](http://discogs.com/data/) or [Wikipedia](http://dumps.wikimedia.org/):

```python
>>> from gzip import GzipFile
>>> def handle_artist(_, artist):
...     print(artist['name'])
...     return True  # return True to keep parsing
>>>
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
...     item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
```
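
To make the role of `item_depth` more concrete, here is a minimal, stdlib-only sketch of depth-based streaming on top of Expat. It is not xmltodict's implementation: `stream_items` is a hypothetical helper, its `path` is simplified to a list of tag names, and each item is a flat dict of child tags and their text (attributes and deeper nesting are not captured). The sample XML is made up.

```python
import xml.parsers.expat


def stream_items(xml_bytes, item_depth, item_callback):
    """Call item_callback(path, item) for every element closed at item_depth.

    Hypothetical helper: each item is a flat dict mapping child tag names
    to their stripped text; attributes and deeper nesting are not captured.
    """
    path = []   # tag names from the root down to the element being parsed
    item = {}   # children collected for the current element at item_depth
    text = []   # character data seen since the last start or end tag

    def start(name, attrs):
        path.append(name)
        if len(path) == item_depth:
            item.clear()
        del text[:]

    def chars(data):
        text.append(data)

    def end(name):
        if len(path) == item_depth + 1:
            # A direct child of the item just closed: record its text.
            item[name] = "".join(text).strip()
        elif len(path) == item_depth:
            # The item itself just closed: hand a copy to the callback.
            item_callback(list(path), dict(item))
        del text[:]
        path.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)


names = []
stream_items(
    b"<artists><artist><name>King Crimson</name></artist>"
    b"<artist><name>Chris Potter</name></artist></artists>",
    item_depth=2,
    item_callback=lambda path, artist: names.append(artist["name"]),
)
print(names)
```

Because each item is handed to the callback and then discarded, memory use depends on the size of a single item rather than on the size of the whole document.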

It can also be used from the command line to pipe objects to a script like this:

```python
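import sys, marshal
while True:
    try:
        # marshal needs a binary stream; each record is a (path, item) tuple
        _, article = marshal.load(sys.stdin.buffer)
    except EOFError:
        break  # end of input
    print(article['title'])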
```

```sh
$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...
```
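
The consuming side of the pipe can be tried without a real dump. The sketch below simulates the pipe with an in-memory buffer and assumes Python 3; `read_records` is a hypothetical helper and the two records are made-up stand-ins for the `(path, item)` tuples the script receives.

```python
import io
import marshal


def read_records(stream):
    """Yield marshalled objects from a binary stream until it is exhausted."""
    while True:
        try:
            yield marshal.load(stream)
        except EOFError:
            return


# Simulate the pipe with an in-memory buffer holding two made-up records.
buffer = io.BytesIO()
for record in [("path-1", {"title": "Anarchism"}), ("path-2", {"title": "Autism"})]:
    marshal.dump(record, buffer)
buffer.seek(0)

titles = [article["title"] for _, article in read_records(buffer)]
print(titles)
```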

Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:

```sh
$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | gzip > enwiki.dicts.gz
```

And you reuse the dicts with every script that needs them:

```sh
$ cat enwiki.dicts.gz | gunzip | script1.py
$ cat enwiki.dicts.gz | gunzip | script2.py
...
```
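
A script can also open the compressed cache itself instead of relying on `gunzip`. This is a minimal sketch under the assumption that the cache is a gzip stream of marshalled `(path, item)` tuples, as produced above; `iter_cached` is a hypothetical helper, and the small file it reads here is written by the example itself with made-up records.

```python
import gzip
import marshal
import os
import tempfile


def iter_cached(path):
    """Yield every marshalled object stored in a gzip-compressed cache file."""
    with gzip.open(path, "rb") as stream:
        while True:
            try:
                yield marshal.load(stream)
            except EOFError:
                return


# Build a tiny stand-in cache with made-up records, then read it back.
cache_path = os.path.join(tempfile.mkdtemp(), "sample.dicts.gz")
with gzip.open(cache_path, "wb") as stream:
    for title in ("Anarchism", "Autism"):
        marshal.dump(("hypothetical-path", {"title": title}), stream)

cached_titles = [item["title"] for _, item in iter_cached(cache_path)]
print(cached_titles)
```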

You can also convert in the other direction, using the `unparse()` function:

```python
>>> mydict = {
...     'page': {
...         'title': 'King Crimson',
...         'ns': 0,
...         'revision': {
...             'id': 547909091,
...         }
...     }
... }
>>> print(xmltodict.unparse(mydict))
<?xml version="1.0" encoding="utf-8"?>
<page><title>King Crimson</title><ns>0</ns><revision><id>547909091</id></revision></page>
```
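
For intuition about the dict-to-XML direction, the following stdlib-only sketch walks a nested dict and emits elements with `xml.sax.saxutils.XMLGenerator`. It is an illustration, not xmltodict's code: `dict_to_xml` is a hypothetical helper that handles only nested dicts, lists, and scalar values, and does not treat `@attribute` or `#text` keys specially.

```python
import io
from xml.sax.saxutils import XMLGenerator


def dict_to_xml(data):
    """Serialize a nested dict with a single root key to an XML string.

    Hypothetical helper: nested dicts become child elements, lists repeat
    the element, and any other value is written as text.
    """
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")

    def emit(tag, value):
        if isinstance(value, list):
            # A list repeats the same tag once per entry.
            for entry in value:
                emit(tag, entry)
            return
        gen.startElement(tag, {})
        if isinstance(value, dict):
            for child_tag, child_value in value.items():
                emit(child_tag, child_value)
        elif value is not None:
            gen.characters(str(value))
        gen.endElement(tag)

    # An XML document has exactly one root element.
    (root_tag, root_value), = data.items()
    gen.startDocument()
    emit(root_tag, root_value)
    gen.endDocument()
    return out.getvalue()


xml_text = dict_to_xml({"page": {"title": "King Crimson", "ns": 0}})
print(xml_text)
```

Child elements come out in the dict's iteration order, which is insertion order on Python 3.7 and later.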

## Ok, how do I get it?

You just need to run:

```sh
$ pip install xmltodict
```

There is an [official Fedora package for xmltodict](https://admin.fedoraproject.org/pkgdb/acls/name/python-xmltodict). If you are on Fedora or RHEL, you can do:

```sh
$ sudo yum install python-xmltodict
```

## Donate

If you love `xmltodict`, consider supporting the author [on Gittip](https://www.gittip.com/martinblech/).
