Tools to manipulate and extract data from Wikipedia dumps
wikidump
Introduction
This module contains code for manipulating Wikipedia dumps, which are available from http://download.wikimedia.org/backup-index.html
Configuration
Upon first importing the module, a file ‘wikidump.cfg’ will be created. Modify the paths in this file to point to your data.
scratch : where indices are stored (must be writable)
xml_dumps : where the XML dumps are located (can be read-only)
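As a way to sanity-check the two paths after editing them, here is a minimal sketch using only the Python standard library. It does not use wikidump's own code. It assumes the file is INI-style and that both options live in a section called "paths"; that section name, and the helper names load_paths and write_example_config, are hypothetical and chosen purely for illustration, so adjust them to match the wikidump.cfg that was actually generated.

```python
import configparser
import os


def write_example_config(cfg_path, scratch, xml_dumps, section="paths"):
    """Write a sample config file (the section name is an assumption)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {"scratch": scratch, "xml_dumps": xml_dumps}
    with open(cfg_path, "w") as fh:
        parser.write(fh)


def load_paths(cfg_path, section="paths"):
    """Return (scratch, xml_dumps) after checking the required access."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(cfg_path):
        raise FileNotFoundError(cfg_path)
    scratch = parser.get(section, "scratch")
    xml_dumps = parser.get(section, "xml_dumps")
    # scratch holds the indices, so it must be writable.
    if not os.access(scratch, os.W_OK):
        raise PermissionError("scratch is not writable: " + scratch)
    # xml_dumps only needs to be readable.
    if not os.access(xml_dumps, os.R_OK):
        raise PermissionError("xml_dumps is not readable: " + xml_dumps)
    return scratch, xml_dumps
```

Calling load_paths("wikidump.cfg") raises an error if either directory lacks the required access, and returns the two paths otherwise.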
Usage
In addition to its Python modules, wikidump comes with a command-line tool for quick access to its functionality. Run wikidump help for a list of options.
Credits
News
0.1
Release date: 04-Aug-2010
Initial release of wikidump module