
WXR-parser is a very simple parser for WordPress eXtended RSS (WXR) files, written in Python.

Feed it a WXR file and it returns data about posts, categories, tags and comments in a dictionary, so you can use this data in your own projects.


WXR-parser has been tested under Python 2.7 and 3.4.


The recommended way to install it is with pip, which also handles any dependencies:

pip install wxr-parser

If you install it manually, you should also install lxml.


Use the parser as follows:

import wxr_parser

# parse a file
parsed_data = wxr_parser.parse('path_to_your_wxr.xml')

You can also parse a string containing WXR data.


WXR-parser returns a standard Python dictionary with the following keys:

  • site: data about the website. Not fully implemented.
  • categories: categories data (described below)
  • tags: tags data (described below)
  • posts: posts data (described below)
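
As a rough sketch of how the top-level keys might be consumed, the snippet below counts the entries under each key. The `sample` dictionary is a hand-written stand-in shaped like the structure described here, and `summarize` is a hypothetical helper, not part of WXR-parser.

```python
def summarize(parsed_data):
    """Return the number of entries under each documented top-level key."""
    return {
        'categories': len(parsed_data.get('categories', {})),
        'tags': len(parsed_data.get('tags', {})),
        'posts': len(parsed_data.get('posts', [])),
    }


# Hand-written stand-in for the dictionary returned by wxr_parser.parse()
sample = {
    'site': {},
    'categories': {
        'uncategorized': {'slug': 'uncategorized', 'title': 'Uncategorized'},
    },
    'tags': {},
    'posts': [{'id': 1, 'title': 'Hello world!'}],
}

summary = summarize(sample)
print(summary)
```

Using `.get()` with an empty default keeps the helper from failing if a key is absent, which may matter given that the site data is not fully implemented.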


The categories key holds a dictionary of parsed categories, with category nicenames as keys:

parsed_data['categories']

# output
{
    'a-category': {'slug': 'a-category',
                   'title': 'A category'},
    'another-category': {'slug': 'another-category',
                         'title': 'Another category'},
    'uncategorized': {'slug': 'uncategorized',
                      'title': 'Uncategorized'}
}


The tags key holds a dictionary of parsed tags, with tag nicenames as keys:

parsed_data['tags']

# output
{
    'another-tag': {'slug': 'another-tag',
                    'title': 'another tag'},
    'arbitrary-tag': {'slug': 'arbitrary-tag',
                      'title': 'arbitrary tag'},
    'some-tag': {'slug': 'some-tag',
                 'title': 'Some tag'}
}


The posts key holds a list of dictionaries, each corresponding to a parsed post:

# get the first parsed post
parsed_data['posts'][0]

# output
{
    'categories': ['uncategorized'],
    'comment_status': 'open',
    'comments': [{'author': 'Mr WordPress',
                  'author_IP': None,
                  'author_url': '',
                  'content': 'Hi, this is a comment.<br />To delete a comment, just log in and view the post&#039;s comments. There you will have the option to edit or delete them.',
                  'date': datetime.datetime(2012, 7, 1, 18, 32, 32),
                  'id': 1}],
    'content': u'Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!',
    'creator': 'admin',
    'guid': u'',
    'id': 1,
    'link': u'',
    'password': None,
    'ping_status': 'open',
    'pub_date': datetime.datetime(2012, 7, 1, 18, 32, 32),
    'slug': u'hello-world',
    'status': 'publish',
    'tags': [],
    'title': u'Hello world!'
}
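
To illustrate working with the posts list, here is a minimal sketch that keeps only published posts and resolves each category nicename to its title. It runs on hand-written sample data mirroring the documented output. `published_posts` is a hypothetical helper, not part of WXR-parser, and it assumes every post carries the keys shown in the example, including a non-empty `pub_date`.

```python
import datetime


def published_posts(parsed_data):
    """Return (title, category titles, comment count) for published posts, oldest first."""
    categories = parsed_data['categories']
    rows = []
    for post in sorted(parsed_data['posts'], key=lambda p: p['pub_date']):
        if post['status'] != 'publish':
            continue
        # Fall back to the raw nicename if a category is missing from the lookup
        titles = [categories.get(slug, {}).get('title', slug)
                  for slug in post['categories']]
        rows.append((post['title'], titles, len(post['comments'])))
    return rows


# Hand-written stand-in for the dictionary returned by wxr_parser.parse()
sample = {
    'categories': {
        'uncategorized': {'slug': 'uncategorized', 'title': 'Uncategorized'},
    },
    'posts': [
        {'title': 'Draft', 'status': 'draft', 'categories': [],
         'comments': [],
         'pub_date': datetime.datetime(2012, 7, 2, 9, 0, 0)},
        {'title': 'Hello world!', 'status': 'publish',
         'categories': ['uncategorized'], 'comments': [{'id': 1}],
         'pub_date': datetime.datetime(2012, 7, 1, 18, 32, 32)},
    ],
}

rows = published_posts(sample)
for title, cats, n_comments in rows:
    print(title, cats, n_comments)
```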


0.1 - 12/10/2014

Initial release.


Contributions and feedback are welcome. You can fork the project and send me a link to your forked repo so I can merge it.

Feel free to email me at <>.


The project is licensed under the BSD license.

Download Files


File Name                        File Type   Upload Date
wxr-parser-0.1.tar.gz (6.2 kB)   Source      Oct 12, 2014
