Tools to manipulate and extract data from Wikipedia dumps


This module contains code for manipulating wikipedia dumps available from


This module is published on `PyPI`_ and can be installed with ``easy_install``.

For example::

    easy_install wikidump

Alternatively, you can use pip::

    pip install wikidump

I highly recommend using `virtualenv`_ to isolate the install environment.

For those on Ubuntu systems, a built package is available in a `PPA`_.
Please go to the PPA for details on how to install from it.

.. _PyPI:
.. _virtualenv:
.. _PPA:


Upon first importing the module, a file 'wikidump.cfg' will be created.
Modify the paths in this file to point to your data.

- scratch : where indices are stored (must be writable)
- xml_dumps : where the XML dumps are located (can be read-only)
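
Before running anything against a large dump, it can help to confirm that both
paths meet the requirements above. The following is only a minimal sketch: it
assumes ``wikidump.cfg`` is an INI-style file readable by Python's
``configparser``, and the section name ``paths`` is hypothetical. Only the keys
``scratch`` and ``xml_dumps`` come from the description above, so check the
generated file for the real layout.

```python
import configparser
import os
import tempfile

# Hypothetical contents of wikidump.cfg. The INI layout and the section name
# "paths" are assumptions; only the keys scratch and xml_dumps are documented.
SAMPLE_CFG = """\
[paths]
scratch = {scratch}
xml_dumps = {xml_dumps}
"""


def check_paths(cfg_text, section="paths"):
    """Parse config text and report whether each path meets its requirement."""
    # Interpolation is disabled so that a literal "%" in a path is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    scratch = parser.get(section, "scratch")
    xml_dumps = parser.get(section, "xml_dumps")
    return {
        # scratch holds the indices, so it must be an existing writable directory
        "scratch_writable": os.path.isdir(scratch) and os.access(scratch, os.W_OK),
        # xml_dumps only needs to be readable
        "xml_dumps_readable": os.path.isdir(xml_dumps) and os.access(xml_dumps, os.R_OK),
    }


if __name__ == "__main__":
    # Temporary directories stand in for real data locations.
    with tempfile.TemporaryDirectory() as scratch, tempfile.TemporaryDirectory() as dumps:
        print(check_paths(SAMPLE_CFG.format(scratch=scratch, xml_dumps=dumps)))
```

To check a real configuration, pass the text of your own ``wikidump.cfg`` to
``check_paths`` and adjust ``section`` to match the file.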


In addition to python modules, wikidump also comes with a command-line
tool to quickly access wikidump functionality. Run ``wikidump help``
for a list of options.
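
If you want to call the tool from a script, one possible approach is to invoke
it through ``subprocess``. This is a sketch only: the helper name
``wikidump_help`` is made up for illustration, and the only command used is the
``wikidump help`` invocation mentioned above. It returns ``None`` when the
executable is not on the ``PATH``.

```python
import shutil
import subprocess


def wikidump_help():
    """Return the text printed by `wikidump help`, or None if it is not installed."""
    exe = shutil.which("wikidump")
    if exe is None:
        # The command-line tool is not on PATH (for example, inactive virtualenv).
        return None
    result = subprocess.run([exe, "help"], capture_output=True, text=True)
    # Combine both streams, since the tool's choice of stream is not documented here.
    return result.stdout + result.stderr


if __name__ == "__main__":
    text = wikidump_help()
    print(text if text is not None else "wikidump is not installed")
```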


- `Distribute`_
- `Buildout`_
- `modern-package-template`_

.. _Buildout:
.. _Distribute:
.. _`modern-package-template`:




*Release date: 04-Aug-2010*

* Initial release of wikidump module


*Release date: 10-Apr-2013*

* Rewrote CLI
Download Files


=====================  =======  =========  ============
File name              Size     File type  Upload date
=====================  =======  =========  ============
wikidump-0.1.3.tar.gz  17.3 kB  Source     Apr 10, 2013
=====================  =======  =========  ============
