Collection of python tools to re-use common code across scrapers

zimscraperlib

License: GPL v3

A collection of Python code to reuse across Python-based scrapers.

Usage

  • This library is meant to be installed from PyPI (zimscraperlib).
  • Pin it to a version range when you reference it, as the API is subject to frequent changes.
  • The API should remain stable only within a given minor version.

Example usage:

zimscraperlib>=1.1,<1.2
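
To illustrate which releases a minor-version range like the one above accepts, here is a minimal sketch. The helper names (parse_release, in_minor_range) are hypothetical and not part of zimscraperlib, and the comparison only handles plain dotted numeric versions written with consistent precision; it is not a full PEP 440 implementation. For real dependency resolution, rely on pip or the `packaging` library instead.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Turn a plain dotted release such as "1.1.3" into (1, 1, 3)."""
    return tuple(int(part) for part in version.split("."))


def in_minor_range(version: str, lower: str = "1.1", upper: str = "1.2") -> bool:
    """Return True if lower <= version < upper, comparing numeric segments.

    Assumption: versions are plain dotted integers (no pre-release or
    post-release suffixes), so tuple comparison is a reasonable stand-in.
    """
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


# Patch releases inside the 1.1 series match; 1.0.x and 1.2.x do not.
for candidate in ("1.0.9", "1.1", "1.1.7", "1.2.0"):
    print(candidate, in_minor_range(candidate))
```

Bounding the range at the next minor version lets patch releases through while keeping out releases whose API may have changed.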

See documentation at Read the Docs for details.

Dependencies

  • libmagic
  • wget
  • libzim (auto-installed, not available on Windows)
  • Pillow
  • FFmpeg
  • gifsicle (>=1.92)
  • libcairo (used for SVG conversion if you use the image manipulation features)
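
Before installing, it can help to check which of the system dependencies listed above are already present. The following is a rough sketch using only the Python standard library. The function name missing_dependencies is hypothetical, and the executable names (wget, ffmpeg, gifsicle) and library short names (magic, cairo) are assumptions about how these packages are exposed on your system. It does not check the gifsicle version, and `ctypes.util.find_library` may fail to locate libraries in some environments even when they are installed.

```python
import shutil
from ctypes.util import find_library

# Assumed command names for the tools listed above.
EXECUTABLES = ("wget", "ffmpeg", "gifsicle")
# Assumed short names for the shared libraries (libmagic, libcairo).
SHARED_LIBRARIES = ("magic", "cairo")


def missing_dependencies(
    executables=EXECUTABLES, libraries=SHARED_LIBRARIES
) -> list[str]:
    """Return the names of executables and libraries that could not be found."""
    # shutil.which returns None when the command is not on PATH.
    missing = [name for name in executables if shutil.which(name) is None]
    # find_library returns None when no matching shared library is located.
    missing += [f"lib{name}" for name in libraries if find_library(name) is None]
    return missing


missing = missing_dependencies()
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All checked dependencies were found.")
```

An empty result only means the names were found, not that the installed versions are suitable.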

macOS

brew install libmagic wget libtiff libjpeg webp little-cms2 ffmpeg gifsicle

Linux

sudo apt install libmagic1 wget ffmpeg \
    libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev \
    libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk \
    libharfbuzz-dev libfribidi-dev libxcb1-dev gifsicle

Alpine

apk add ffmpeg gifsicle libmagic wget libjpeg

Contribution

This project adheres to openZIM's Contribution Guidelines.

This project has implemented openZIM's Python bootstrap, conventions and policies v1.0.2.

pip install hatch
pip install ".[dev]"
pre-commit install
# For tests
invoke coverage

Users

Non-exhaustive list of scrapers using it (check status when updating API):

