Project description

scrapelib is a library for making requests to less-than-reliable websites.

This repository has moved to Codeberg; GitHub will remain as a read-only mirror.

Source: https://codeberg.org/jpt/scrapelib

Documentation: https://jamesturk.github.io/scrapelib/

Issues: https://codeberg.org/jpt/scrapelib/issues

Features

scrapelib originated as part of the Open States project, which scrapes the websites of all 50 state legislatures. As a result, it was designed with features that are useful when dealing with sites that have intermittent errors or require rate limiting.

Advantages of using scrapelib over using requests as-is:

  • HTTP(S) and FTP requests via an identical API
  • support for simple caching with pluggable cache backends
  • highly configurable request throttling
  • configurable retries for non-permanent site failures
  • all of the power of the superb requests library
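
To make "configurable retries for non-permanent site failures" concrete, here is a minimal standard-library sketch of the general retry-with-backoff idea. It is not scrapelib's implementation: the function fetch_with_retries, its parameters, and the doubling wait are assumptions chosen for illustration only.

```python
import time


def fetch_with_retries(fetch, retry_attempts=3, retry_wait_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the retries run out.

    Hypothetical helper for illustration. `fetch` is any zero-argument
    callable that raises OSError (for example ConnectionError) on a
    transient failure. The wait doubles after each failed attempt.
    """
    for attempt in range(retry_attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == retry_attempts:
                # Out of retries: let the caller see the last error.
                raise
            sleep(retry_wait_seconds * (2 ** attempt))
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets the backoff schedule be checked without actually waiting.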

Installation

scrapelib is on PyPI and can be installed with any standard Python package management tool, for example with pip install scrapelib.

Example Usage

  import scrapelib
  s = scrapelib.Scraper(requests_per_minute=10)

  # Grab Google front page
  s.get('http://google.com')

  # Will be throttled to 10 HTTP requests per minute
  while True:
      s.get('http://example.com')
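
To illustrate what a limit of 10 requests per minute means in practice, the sketch below spaces calls at least 60 / requests_per_minute seconds apart using only the standard library. The Throttle class and its clock and sleep parameters are hypothetical names introduced here; this is one simple way to throttle and is not a description of how scrapelib does it internally.

```python
import time


class Throttle:
    """Hypothetical helper: keeps calls at least one interval apart."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        # 10 requests per minute means one request every 6 seconds.
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.interval - now)
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling wait() before each request would then pause only when requests arrive faster than the configured rate.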

Download files

Source Distribution

scrapelib-2.4.1.tar.gz (19.9 kB)

Uploaded Source

Built Distribution

scrapelib-2.4.1-py3-none-any.whl (17.4 kB)

Uploaded Python 3

File details

Details for the file scrapelib-2.4.1.tar.gz.

File metadata

  • Download URL: scrapelib-2.4.1.tar.gz
  • Upload date:
  • Size: 19.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.13

File hashes

Hashes for scrapelib-2.4.1.tar.gz
Algorithm Hash digest
SHA256 48340199e92e860a423aeae09ef03cb9c99b78213eddedc9a31c1545b9bf2b6a
MD5 4dc517af6bd6368583f39070b551810d
BLAKE2b-256 9e5b4207c24a2daf172cddaa9d994e2ac5397313390be72f1ffac293ac9ad624

File details

Details for the file scrapelib-2.4.1-py3-none-any.whl.

File metadata

  • Download URL: scrapelib-2.4.1-py3-none-any.whl
  • Upload date:
  • Size: 17.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.13

File hashes

Hashes for scrapelib-2.4.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1332f8ab05ab3e23b3db3f62e33ffad68ac3d1f1ac9be9fdc9410d0a30a5bf59
MD5 a1b48f60b0741b19c01abeee4f521572
BLAKE2b-256 3d481f9e4e877f79d84b19c31b1753b0140d0ece723aa4b0506ec27e15845f8b
