Collection of Python tools to re-use common code across scrapers

zimscraperlib

License: GPL v3

Collection of Python code to re-use across Python-based scrapers

Usage

  • This library is meant to be installed from PyPI (zimscraperlib).
  • Pin it to a version range when you reference it, as the API is subject to frequent changes.
  • The API should remain stable only within the same minor version.

Example usage:

zimscraperlib>=1.1,<1.2
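
As an illustration of the pinning rule, the following minimal sketch builds a requirement string that stays within one minor series. The helper name minor_pin is hypothetical and not part of zimscraperlib; it assumes plain numeric MAJOR.MINOR[.PATCH] version strings.

```python
def minor_pin(package: str, version: str) -> str:
    """Return a requirement string limited to the minor series of `version`.

    Hypothetical helper for illustration only; assumes a numeric
    MAJOR.MINOR[.PATCH] version string.
    """
    # Keep only the major and minor components; any patch level is ignored.
    major, minor = (int(part) for part in version.split(".")[:2])
    # Lower bound is the minor series itself, upper bound is the next one.
    return f"{package}>={major}.{minor},<{major}.{minor + 1}"


print(minor_pin("zimscraperlib", "1.1"))  # zimscraperlib>=1.1,<1.2
```

The resulting string can be placed in a requirements file or in a setup script's dependency list.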

Dependencies

  • libmagic
  • wget
  • libzim (auto-installed, not available on Windows)
  • Pillow
  • FFmpeg
  • gifsicle (>=1.92)
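
A quick way to confirm that the command-line dependencies are reachable is to look them up on PATH. The sketch below is only an assumption about how one might check this. The executable names are taken from the list above, the function name missing_tools is hypothetical, and the check covers neither library dependencies such as libmagic nor the gifsicle version requirement.

```python
import shutil

# Executable names assumed from the dependency list; adjust if your
# platform installs them under different names.
REQUIRED_TOOLS = ("wget", "ffmpeg", "gifsicle")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external tools:", ", ".join(missing))
    else:
        print("All external tools found.")
```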

macOS

brew install libmagic wget libtiff libjpeg webp little-cms2 ffmpeg gifsicle

Linux

sudo apt install libmagic1 wget ffmpeg \
    libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev \
    libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk \
    libharfbuzz-dev libfribidi-dev libxcb1-dev gifsicle

Users

Non-exhaustive list of scrapers using it (check status when updating API):

Releasing

  • Update your dependencies: pip install -U setuptools wheel twine
  • Make sure CHANGELOG.md is up to date
  • Bump the version in src/zimscraperlib/VERSION
  • Build the packages: python ./setup.py sdist bdist_wheel
  • Upload to PyPI: twine upload dist/zimscraperlib-2.0.0*
  • Commit your changelog and version bump changes
  • Tag the version in git: git tag -a v2.0.0
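
The version number appears in both the upload and the tag commands, so it can help to generate them from a single value. This is a minimal sketch, not part of the project's tooling: the function name release_commands is hypothetical, and the commands are copied from the steps above with only the version substituted.

```python
from typing import List


def release_commands(version: str) -> List[str]:
    """Return the shell commands from the release steps for `version`.

    Hypothetical helper; the manual steps (changelog, version bump,
    commit) are not covered here.
    """
    return [
        "pip install -U setuptools wheel twine",
        "python ./setup.py sdist bdist_wheel",
        f"twine upload dist/zimscraperlib-{version}*",
        f"git tag -a v{version}",
    ]


for command in release_commands("2.0.0"):
    print(command)
```

Printing the commands rather than running them keeps the sketch safe to try; each one can be reviewed before it is executed by hand.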

Files for zimscraperlib, version 1.3.5

  • zimscraperlib-1.3.5-py3-none-any.whl (100.5 kB): Wheel, Python version py3
  • zimscraperlib-1.3.5.tar.gz (78.5 kB): Source
