Make ZIM file from WikiHow articles

wikiHow

wikihow2zim is an openZIM scraper that creates offline versions of wikiHow websites, in any of the languages wikiHow supports.

⚡ The scraper has a known, very significant issue linked to throttling (https://github.com/openzim/wikihow/issues/150).

License: GPL v3

Usage

wikihow2zim works off a language version that you must provide via the --language argument. The list of supported languages is visible in the --help message.
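As a minimal sketch of how the --language argument fits into an invocation, the helper below assembles a command line for one language version, either directly or through the Docker image shown in the next section. The function name build_command, the language code "en", and the my_dir volume default are illustrative assumptions; the scraper may require further arguments, so check wikihow2zim --help before running the result.

```python
import subprocess  # only needed if you uncomment the run() call below
from typing import List

DOCKER_IMAGE = "ghcr.io/openzim/wikihow"


def build_command(language: str, use_docker: bool = False,
                  output_dir: str = "my_dir") -> List[str]:
    """Assemble a wikihow2zim command line for one language version.

    Hypothetical helper: it only sets --language. Other arguments the
    scraper may need are not covered here (see `wikihow2zim --help`).
    """
    cmd = ["wikihow2zim", "--language", language]
    if use_docker:
        # Mount a host directory (or named volume) on /output in the container.
        cmd = ["docker", "run", "-v", f"{output_dir}:/output", DOCKER_IMAGE] + cmd
    return cmd


# "en" is an assumed language code; the supported list is in --help.
command = build_command("en")
print(" ".join(command))

# Uncomment to actually launch the scraper:
# subprocess.run(command, check=True)
```

Keeping the command as a list rather than a single string avoids shell quoting problems when it is passed to subprocess.run.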

Docker

docker run -v my_dir:/output ghcr.io/openzim/wikihow wikihow2zim --help

Python

wikihow2zim is Python 3 (3.6+) software. If you are not using the Docker image, you are advised to run it in a virtual environment to avoid installing its dependencies system-wide.

python3 -m venv env
source env/bin/activate

# using published version
pip3 install wikihow2zim
wikihow2zim --help

# running from source
python wikihow2zim/ --help

Run deactivate to leave the virtual environment.

See requirements.txt for the list of Python dependencies.

Contributing

All contributions are welcome!

Please open an issue on GitHub and/or submit a pull request.

Guidelines

  • Don't take assigned issues. Comment on them if they go stale.
  • If your contribution is far from trivial, open an issue to discuss it first.
  • Ensure your code passes black formatting, isort, and flake8 (88-character line limit).

We have a pre-commit hook ready for you. Install it with: pip install pre-commit && pre-commit install
