
OXL Wiki-Download Script for MediaWiki instances


MediaWiki Download Script


This is a very simple script that downloads the entire contents of a MediaWiki instance via its API.

Do not use this script to download very large wikis, as it will not scale well!


Features

  • Downloading the content of all pages in their source-format
    • Only downloading the latest revision (rvlimit)
  • Optionally converting all source-files to Markdown-format (using pandoc)
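The latest-revision download can be illustrated with a minimal sketch against the standard MediaWiki `action=query` API. This is not the script's actual code: the function names `build_revision_query`, `extract_content`, and `fetch_latest_source` are hypothetical, and the sketch assumes the `formatversion=2` response layout.

```python
import json
import urllib.parse
import urllib.request


def build_revision_query(page_id: int) -> dict:
    """Build query parameters asking for the newest revision of one page."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "revisions",
        "pageids": str(page_id),
        "rvprop": "content",
        "rvslots": "main",
        "rvlimit": "1",  # only the latest revision
    }


def extract_content(response: dict) -> str:
    """Pull the page source out of a formatversion=2 query response."""
    page = response["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]


def fetch_latest_source(api_url: str, page_id: int) -> str:
    """Fetch the latest source of a page (performs a network request).

    api_url is assumed to point at the wiki's api.php endpoint.
    """
    query = urllib.parse.urlencode(build_revision_query(page_id))
    with urllib.request.urlopen(f"{api_url}?{query}") as resp:
        return extract_content(json.load(resp))
```

Splitting the request parameters from the response parsing keeps both parts checkable without network access.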

We have not yet needed to download images as well. Feel free to open a PR!


Usage

Install:

pip install download-mediawiki

# or clone & run directly
python3 src/download_mediawiki/

Arguments:

download-mediawiki --help
> usage: MediaWiki Download Script (© OXL IT Services, License: MIT) [-h] -u URL [-o OUT_DIR] [-r]
>                                                                    [-m]
> 
> options:
>   -h, --help            show this help message and exit
>   -u URL, --url URL     Base-URL of the MediaWiki instance
>   -o OUT_DIR, --out-dir OUT_DIR
>   -r, --replace         Replace/Update existing pages
>   -m, --convert-to-md   Convert all source-files to Markdown-format (pandoc executable
>                         required!)
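
The `-m` option depends on the pandoc executable. As a rough sketch of what such a conversion step could look like (the helper names and the `.md` output suffix are assumptions, not taken from the script), pandoc can be called with its `mediawiki` reader and `markdown` writer:

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source: Path) -> list:
    """Build the pandoc call converting one MediaWiki source file to Markdown."""
    target = source.with_suffix(".md")  # assumed output naming
    return [
        "pandoc",
        "--from", "mediawiki",
        "--to", "markdown",
        "--output", str(target),
        str(source),
    ]


def convert_to_markdown(source: Path) -> Path:
    """Run pandoc on one file; fails early if pandoc is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on PATH")
    subprocess.run(pandoc_command(source), check=True)
    return source.with_suffix(".md")
```

Checking for the executable up front gives a clearer error than a failed subprocess call.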

Result

Overview: dump/overview.json

# namespace-id => page-id => page-title
cat dump/overview.json 
> {
>   "0": {
>     "1": "Page Title"
>   },
>   "15": {}
> }

Files: dump/<namespace-id>/<page-id>.mw

tree dump/
> dump/
> ├── 0
>    ├── 1.mw
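
To work with a finished dump, the overview and the file layout shown above can be combined. The following is a minimal sketch, assuming the overview lives at `overview.json` inside the dump directory and each page is stored under its namespace ID and page ID; `iter_pages` is a hypothetical helper, not part of the package:

```python
import json
from pathlib import Path


def iter_pages(dump_dir: Path):
    """Yield (namespace_id, page_id, title, path) for every page in the overview."""
    overview = json.loads((dump_dir / "overview.json").read_text(encoding="utf-8"))
    for namespace_id, pages in overview.items():
        for page_id, title in pages.items():
            # JSON object keys are strings, so they can be used as path parts directly
            yield namespace_id, page_id, title, dump_dir / namespace_id / f"{page_id}.mw"
```

For example, `for ns, pid, title, path in iter_pages(Path("dump")): print(title, path)` lists each page title next to its source file.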

Content

head dump/0/1.mw 
> # Main Page
> Welcome to the ''nftables'' HOWTO documentation page. Here you will find documentation on how to build, install, configure and use nftables.
> 
> If you have any suggestion to improve it, please send your comments to Netfilter users mailing list <netfilter@vger.kernel.org>.
> 
> 
> = [[News]] =
> 
> 
> = Introduction =
