OXL Wiki-Download Script for MediaWiki instances

MediaWiki Download Script

This is a very simple script that downloads the entire content of a MediaWiki instance via its API.

This script should not be used to download huge wikis, as it will not scale well!


Features

  • Downloading the content of all pages in their source-format
    • Only downloading the latest revision (rvlimit)
  • Optionally converting all source-files to Markdown-format (using pandoc)

We have not yet needed to download images as well. Feel free to open a PR!
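
For orientation, fetching only the latest revision of a single page through the MediaWiki Action API could look roughly like the sketch below. This is not the script's actual code: the helper names, the example `api.php` location, and the use of `formatversion=2` are assumptions made for illustration, and the API endpoint path differs between wiki installations.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_latest_revision_url(api_url: str, page_id: int) -> str:
    """Build a query URL for the newest revision of a single page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "pageids": page_id,
        "rvprop": "content",
        "rvslots": "main",
        "rvlimit": 1,  # only the latest revision
        "format": "json",
        "formatversion": 2,  # assumption: pages are returned as a list
    }
    return f"{api_url}?{urlencode(params)}"


def extract_wikitext(response: dict) -> str:
    """Pull the wikitext of the first revision out of a formatversion=2 response."""
    page = response["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]


def fetch_latest_wikitext(api_url: str, page_id: int) -> str:
    """Download and decode the latest revision (performs a network request)."""
    with urlopen(build_latest_revision_url(api_url, page_id)) as resp:
        return extract_wikitext(json.load(resp))
```

Keeping the URL building and the response parsing separate from the network call makes both easy to check against a saved API response without contacting a wiki.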


Usage

Install:

pip install download-mediawiki

# or clone & run directly
python3 src/download_mediawiki/

Arguments:

download-mediawiki --help
> usage: MediaWiki Download Script (© OXL IT Services, License: MIT) [-h] -u URL [-o OUT_DIR] [-r]
>                                                                    [-m]
> 
> options:
>   -h, --help            show this help message and exit
>   -u URL, --url URL     Base-URL of the MediaWiki instance
>   -o OUT_DIR, --out-dir OUT_DIR
>   -r, --replace         Replace/Update existing pages
>   -m, --convert-to-md   Convert all source-files to Markdown-format (pandoc executable
>                         required!)
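
The `-m` option relies on the pandoc executable. As a rough sketch of what such a conversion involves (not necessarily how the script does it), the following Python wraps pandoc's `mediawiki` reader and `markdown` writer for a single file. The helper names and the choice to write a `.md` file next to the source are assumptions for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source: Path) -> list[str]:
    """Build the pandoc call that turns one .mw file into a .md file beside it."""
    target = source.with_suffix(".md")  # assumption: output sits next to the source
    return ["pandoc", "--from", "mediawiki", "--to", "markdown",
            "--output", str(target), str(source)]


def convert_to_markdown(source: Path) -> Path:
    """Run pandoc on one file; raises if pandoc is missing or the call fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found in PATH")
    subprocess.run(pandoc_command(source), check=True)
    return source.with_suffix(".md")
```

Checking for the executable up front gives a clearer error than a failed subprocess call when pandoc is not installed.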

Result

Overview: dump/overview.json

# namespace-id => page-id => page-title
cat dump/overview.json 
> {
>   "0": {
>     "1": "Page Title"
>   },
>   "15": {}
> }
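
To work with the dump programmatically, the overview can be walked in Python. The sketch below is a hypothetical helper, not part of the package; it assumes each page is stored under its namespace id and page id with a `.mw` suffix, as in `dump/0/1.mw`.

```python
import json
from pathlib import Path


def list_pages(dump_dir: Path) -> list[tuple[str, Path]]:
    """Return (title, file path) pairs for every page listed in overview.json."""
    overview = json.loads((dump_dir / "overview.json").read_text(encoding="utf-8"))
    pages = []
    # overview maps namespace-id => page-id => page-title
    for namespace_id, entries in overview.items():
        for page_id, title in entries.items():
            pages.append((title, dump_dir / namespace_id / f"{page_id}.mw"))
    return pages
```

Calling `list_pages(Path("dump"))` on the example above would yield one entry for "Page Title"; empty namespaces such as `"15"` contribute nothing.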

Files: dump/<namespace-id>/<page-id>.mw

tree dump/
> dump/
> ├── 0
>    ├── 1.mw

Content

head dump/0/1.mw 
> # Main Page
> Welcome to the ''nftables'' HOWTO documentation page. Here you will find documentation on how to build, install, configure and use nftables.
> 
> If you have any suggestion to improve it, please send your comments to Netfilter users mailing list <netfilter@vger.kernel.org>.
> 
> 
> = [[News]] =
> 
> 
> = Introduction =

Download files

Source Distribution

download_mediawiki-1.0.tar.gz (6.2 kB)

Uploaded Source

Built Distribution

download_mediawiki-1.0-py3-none-any.whl (6.6 kB)

Uploaded Python 3

File details

Details for the file download_mediawiki-1.0.tar.gz.

File metadata

  • Download URL: download_mediawiki-1.0.tar.gz
  • Size: 6.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.2

File hashes

Hashes for download_mediawiki-1.0.tar.gz
Algorithm    Hash digest
SHA256       4bf34bdd5d4cf357869c09a2f027ea7046bff9d7cd63a6588a91c0c0441e5ae6
MD5          e654fa7fa2b2a2f0c8a7b8aaf52d3a1b
BLAKE2b-256  ec5ecaa58646c8f48ae4525ede79606b06895ff584960246dd58cb4b5cc18920

File details

Details for the file download_mediawiki-1.0-py3-none-any.whl.

File hashes

Hashes for download_mediawiki-1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       25ced110ad156b92f67b296262b8122e06b282f756b50e3000b6a71949cbd0e0
MD5          0977cf6f1dc264d19660b0343bb92f02
BLAKE2b-256  fa36aa5a2ceba2fcb29b5a7060a822ffd99e24b71b5ad857334db494461670fc
