
Tools to manipulate and extract data from Wikipedia dumps

Project description



This module contains code for manipulating Wikipedia dumps available from


This module is published on PyPI and can be installed with easy_install. For example:

easy_install wikidump

Alternatively, you can use pip:

pip install wikidump

I highly recommend using virtualenv to isolate the install environment.

For those on Ubuntu systems, a built package is available in a PPA. Please go to the PPA for details on how to install from it.


Upon first importing the module, a file ‘wikidump.cfg’ will be created. Modify the paths in this file to point to your data.

  • scratch : where indices are stored (must be writeable)
  • xml_dumps : where the XML dumps are located (can be read-only)
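
The layout of wikidump.cfg is not shown here, so the following is only a minimal sketch of how the two requirements above could be checked before running anything. It assumes an INI-style file with a [paths] section holding the scratch and xml_dumps options; both the section name and the check_paths helper are hypothetical and not part of wikidump.

```python
import configparser
import os


def check_paths(cfg_path, section="paths"):
    """Return a list of problems found with the configured directories.

    Assumes an INI-style file; the section name is a guess, not a
    documented part of wikidump.
    """
    # Disable interpolation so a literal '%' in a path is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(cfg_path):
        return [f"could not read {cfg_path}"]
    if not parser.has_section(section):
        return [f"missing section [{section}]"]

    # (option name, required access mode, human-readable label)
    requirements = [
        ("scratch", os.W_OK, "writeable"),   # indices are written here
        ("xml_dumps", os.R_OK, "readable"),  # dumps only need to be read
    ]

    problems = []
    for option, mode, label in requirements:
        path = parser.get(section, option, fallback=None)
        if path is None:
            problems.append(f"{option} is not set")
        elif not os.path.isdir(path):
            problems.append(f"{option}: {path} is not a directory")
        elif not os.access(path, mode):
            problems.append(f"{option}: {path} is not {label}")
    return problems
```

Calling check_paths with the location of your wikidump.cfg returns an empty list when both directories exist and have the required access, and a list of messages otherwise.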


In addition to the Python modules, wikidump comes with a command-line tool for quick access to its functionality. Run wikidump help for a list of options.



Release date: 04-Aug-2010

  • Initial release of wikidump module


Release date: 10-Apr-2013

  • Rewrote CLI


Files for wikidump, version 0.1.3

  • wikidump-0.1.3.tar.gz (17.3 kB), source distribution
