Various data and utilities for processing wikitext.

mwconstants

Utilities and constants for analyzing wikitext. The package contains three types of artifacts:

  • Data generating functions: Python functions that call various APIs to build useful data structures -- e.g., all Wikipedia language codes
  • Static data snapshots: Python variables that hold the most recent result of a data generating function
  • Utilities: Python functions for wikitext-related processing tasks -- e.g., mapping links to namespaces
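
The relationship between the first two artifact types can be sketched as follows. This is an illustration only, not the package's code: `extract_wikipedia_codes`, `SAMPLE_SITEMATRIX`, and `WIKIPEDIA_CODES_SNAPSHOT` are hypothetical names, and the sample dictionary merely imitates the general shape of a MediaWiki `sitematrix` API response (with a made-up `xx` entry) so that the example runs offline.

```python
# Hypothetical sketch of the "data generating function" / "static snapshot" pattern.
# None of the names below are part of mwconstants.

# Tiny hand-written stand-in for an API response (assumed shape, not real data).
SAMPLE_SITEMATRIX = {
    "sitematrix": {
        "count": 3,
        "0": {"code": "en", "site": [{"code": "wiki"}, {"code": "wiktionary"}]},
        "1": {"code": "fr", "site": [{"code": "wiki"}]},
        "2": {"code": "xx", "site": [{"code": "wikibooks"}]},  # made-up entry
        "specials": [],
    }
}


def extract_wikipedia_codes(response):
    """Data generating step: collect language codes that have a Wikipedia site."""
    codes = set()
    for key, entry in response["sitematrix"].items():
        if not key.isdigit():  # skip the "count" and "specials" keys
            continue
        if any(site.get("code") == "wiki" for site in entry.get("site", [])):
            codes.add(entry["code"])
    return sorted(codes)


# Static snapshot: the stored result of the most recent run of the function.
WIKIPEDIA_CODES_SNAPSHOT = extract_wikipedia_codes(SAMPLE_SITEMATRIX)
print(WIKIPEDIA_CODES_SNAPSHOT)  # ['en', 'fr']
```

Keeping the snapshot as a plain module-level variable means analyses can import it without network access, while the generating function can be re-run when the upstream data changes.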

Installation

You can install mwconstants with pip:

   $ pip install mwconstants

Basic Usage

from mwconstants import link_to_namespace, NON_WHITESPACE_LANGUAGES

print(link_to_namespace('Utilisateur:Isaac_(WMF)', lang='fr'))  # 'User'
print(sorted(NON_WHITESPACE_LANGUAGES))  # ['bo', 'bug', ..., 'zh-classical', 'zh-yue']
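
One plausible use of a constant like `NON_WHITESPACE_LANGUAGES` is choosing a tokenization strategy per language. The sketch below is an assumption about usage, not documented package behavior: `rough_token_count` is a hypothetical helper, and `STAND_IN_LANGUAGES` is a four-code stand-in (taken from the output shown above) so the example runs without the package installed.

```python
# Hypothetical stand-in for mwconstants.NON_WHITESPACE_LANGUAGES (partial, for illustration).
STAND_IN_LANGUAGES = frozenset({"bo", "bug", "zh-classical", "zh-yue"})


def rough_token_count(text, lang, non_whitespace=STAND_IN_LANGUAGES):
    """Count tokens by whitespace, or by character when whitespace is not a word delimiter."""
    if lang in non_whitespace:
        # Assumed fallback: treat every non-space character as one token.
        return sum(1 for ch in text if not ch.isspace())
    return len(text.split())


print(rough_token_count("the quick brown fox", "en"))  # 4
print(rough_token_count("你好 世界", "zh-yue"))  # 4
```

With the package installed, the real constant could presumably be passed as the `non_whitespace` argument instead of the stand-in.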

Modules

Each module generally contains the relevant constants, functions for generating those constants, and other utilities for working with them:

  • languages.py: functions for identifying languages associated with a given Wikimedia project
  • media.py: functions for identifying media in wikitext and parsing wikitext media syntax into its components
  • namespaces.py: functions for identifying namespace prefixes

Limitations

  • Links have many edge cases, especially around interwiki prefixes. For now, only the basics are covered: language-specific namespaces and interlanguage links.
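
The basic case, resolving a language-specific namespace prefix, can be pictured with a minimal sketch. It is not the package's implementation: `sketch_link_to_namespace` and `TOY_PREFIXES` are hypothetical, the mapping holds a single hand-written entry, and returning `'Main'` for unprefixed or unknown prefixes is an assumption.

```python
# Hypothetical toy mapping: language code -> {lowercased local prefix: canonical namespace}.
TOY_PREFIXES = {"fr": {"utilisateur": "User"}}


def sketch_link_to_namespace(link, lang):
    """Map a link target to a canonical namespace name using a toy prefix table."""
    prefix, sep, _title = link.partition(":")
    if not sep:
        return "Main"  # assumption: no colon means the main namespace
    key = prefix.strip().replace("_", " ").lower()
    # Unknown prefixes fall back to "Main"; interwiki handling is not modeled here.
    return TOY_PREFIXES.get(lang, {}).get(key, "Main")


print(sketch_link_to_namespace("Utilisateur:Isaac_(WMF)", "fr"))  # User
print(sketch_link_to_namespace("Paris", "fr"))  # Main
print(sketch_link_to_namespace("en:Paris", "fr"))  # Main
```

The last call shows why prefixes are tricky: this toy lookup cannot tell an interlanguage prefix such as `en:` from a title that simply contains a colon, so a real implementation needs additional prefix tables.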

Download files

Source Distribution

mwconstants-0.1.0.tar.gz (90.5 kB)

Uploaded Source

Built Distribution

mwconstants-0.1.0-py3-none-any.whl (93.9 kB)

Uploaded Python 3