Tools for getting data from MediaWiki websites

MediaWiki Tools

A high-level library containing a set of tools for filtering pages using the rich data available in MediaWikis, such as categories and infoboxes. It uses both web scraping and API methods (where available and feasible) to gather information.

Goals

  • Generate useful data (and datasets) from a wiki.
  • Work on any MediaWiki site (including fandom.com), with or without an API.
  • Get arbitrary subsets of pages based on categories and template parameters (todo).
  • Be very robust to variations and inconsistencies in user input.
  • Be efficient.
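
As a rough illustration of the API route mentioned above, the sketch below builds a request URL for the standard MediaWiki Action API module `list=categorymembers`, which lists the pages in one category. It is not taken from this library's source. The helper name `category_members_url` and the default `/w/api.php` path are assumptions made for the example (Wikipedia uses that path; other wikis may expose the API elsewhere), and no network request is made.

```python
from urllib.parse import urlencode


def category_members_url(host, category, limit=500, api_path="/w/api.php"):
    """Build a MediaWiki Action API URL listing the pages in one category.

    Hypothetical helper for illustration only; not part of mwtools.
    """
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",  # category titles carry the "Category:" prefix
        "cmlimit": limit,                   # number of members to return per request
        "format": "json",
    }
    # Assumption: the wiki serves its API at api_path on this host.
    return f"https://{host}{api_path}?{urlencode(params)}"


url = category_members_url("en.wikipedia.org", "Countries in Asia")
print(url)
```

Fetching that URL returns one batch of members. Longer categories need follow-up requests that pass along the continuation value from the previous response.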

Installation

Install it using pip.

pip install mediawiki-tools

Requires Python >3.8 because I like the walrus operator.

Usage

Check out the basic usage guide and detailed API documentation.

Example

Question: Which countries in Asia use English as a spoken language?

Answer:

# Assumes MediaWikiTools is exported at the top level of the mwtools package.
from mwtools import MediaWikiTools

wiki = MediaWikiTools('en.wikipedia.org')

wiki.get_set(['Countries in Asia', 
              'English-speaking countries and territories'], 
             'and')
# ['Philippines', 'Pakistan', 'Bahrain', 'Singapore', 'Brunei', 'India']

Question: Which countries in Asia or Europe use English as a spoken language?

Answer:

wiki.get_set(['Countries in Asia', 'Countries in Europe',
              'English-speaking countries and territories'], 
             ['or','and'])
# ['Philippines',
#  'United Kingdom',
#  'Brunei',
#  'Malta',
#  'India',
#  'Pakistan',
#  'Scotland',
#  'Republic of Ireland',
#  'Singapore',
#  'Bahrain']

Question: Which of these countries are not island nations?

Answer:

wiki.get_set(['Countries in Asia', 'Countries in Europe',
              'English-speaking countries and territories',
              'Island countries'], 
             ['or', 'and', 'not'])
# ['Pakistan', 'India']
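
The three examples read as if the operations are applied left to right, so ['or', 'and', 'not'] means ((Asia or Europe) and English-speaking) not Island. The sketch below shows that reading with plain Python sets. It is an interpretation inferred from the examples above, not the library's actual implementation. `combine_sets` is a hypothetical helper, and the four sets are small hand-written toy data rather than real category contents.

```python
def combine_sets(sets, operations):
    """Fold a list of sets left to right using 'and', 'or' and 'not'.

    Hypothetical helper for illustration only; not part of mwtools.
    """
    # A single operation name is applied between every pair of sets.
    if isinstance(operations, str):
        operations = [operations] * (len(sets) - 1)
    if len(operations) != len(sets) - 1:
        raise ValueError("need exactly one operation between each pair of sets")

    apply = {
        "and": set.intersection,  # pages in both
        "or": set.union,          # pages in either
        "not": set.difference,    # pages in the left set but not the right
    }
    result = set(sets[0])
    for op, other in zip(operations, sets[1:]):
        result = apply[op](result, set(other))
    return result


# Toy data, hand-written for this example.
asia = {"India", "Pakistan", "Singapore", "Japan"}
europe = {"Malta", "United Kingdom", "France"}
english = {"India", "Pakistan", "Singapore", "Malta", "United Kingdom"}
islands = {"Singapore", "Malta", "United Kingdom", "Japan"}

print(sorted(combine_sets([asia, english], "and")))
# ['India', 'Pakistan', 'Singapore']
print(sorted(combine_sets([asia, europe, english], ["or", "and"])))
# ['India', 'Malta', 'Pakistan', 'Singapore', 'United Kingdom']
print(sorted(combine_sets([asia, europe, english, islands], ["or", "and", "not"])))
# ['India', 'Pakistan']
```

Under this reading the order of the category list matters: moving 'Island countries' earlier in the list would change which pages are subtracted.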

