scrapelib is a library for making requests to less-than-reliable websites.
Source: https://github.com/jamesturk/scrapelib
Documentation: https://jamesturk.github.io/scrapelib/
Issues: https://github.com/jamesturk/scrapelib/issues
Features
scrapelib originated as part of the Open States project to scrape the websites of all 50 state legislatures, and was therefore designed with features desirable when dealing with sites that have intermittent errors or require rate-limiting.
Advantages of using scrapelib over using requests as-is:
- HTTP(S) and FTP requests via an identical API
- support for simple caching with pluggable cache backends
- highly configurable request throttling
- configurable retries for non-permanent site failures
- all of the power of the superb requests library
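To make "configurable retries for non-permanent site failures" concrete, the following is a minimal, standard-library-only sketch of the general technique: retry a call that fails transiently, waiting longer after each failure. It is not scrapelib's implementation. The names `retry_call`, `TransientError`, `attempts`, and `wait_seconds` are hypothetical and exist only for this illustration, and the doubling wait is an assumption.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a non-permanent failure (e.g. a timeout)."""


def retry_call(func, attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Call func(); on TransientError, retry up to `attempts` more times.

    The wait doubles after each failed try (an assumption of this sketch).
    `sleep` is injectable so the sketch can run without real delays.
    """
    tries = 0
    while True:
        try:
            return func()
        except TransientError:
            if tries >= attempts:
                raise  # out of retries: surface the last failure
            sleep(wait_seconds * (2 ** tries))
            tries += 1


# Usage: a fake fetch that fails twice, then succeeds.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("temporary failure")
    return "ok"


waits = []  # records the requested waits instead of actually sleeping
result = retry_call(flaky_fetch, attempts=3, wait_seconds=1.0, sleep=waits.append)
```

Only errors classed as transient are retried here; a permanent failure should propagate immediately rather than consume retry attempts.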
Installation
scrapelib is on PyPI, and can be installed via any standard package management tool:
poetry add scrapelib
or:
pip install scrapelib
Example Usage
import scrapelib
s = scrapelib.Scraper(requests_per_minute=10)
# Grab Google front page
s.get('http://google.com')
# Will be throttled to 10 HTTP requests per minute
while True:
    s.get('http://example.com')
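As a rough illustration of what a limit of 10 requests per minute means in practice, the sketch below spaces calls evenly, one every 60 / 10 = 6 seconds. It uses only the standard library and does not reflect how scrapelib implements throttling internally. The `Throttle` and `FakeClock` classes are hypothetical names introduced for this example, and even spacing is an assumption.

```python
import time


class Throttle:
    """Space calls so at most `requests_per_minute` happen per minute."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute  # seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed, then record its time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


class FakeClock:
    """Hypothetical clock whose sleep() advances time instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# Usage: three back-to-back "requests" at 10 requests per minute.
fake = FakeClock()
throttle = Throttle(10, clock=fake.time, sleep=fake.sleep)
stamps = []
for _ in range(3):
    throttle.wait()
    stamps.append(fake.now)
```

With the fake clock, the three calls are recorded at 0, 6, and 12 seconds. Passing the real `time.monotonic` and `time.sleep` (the defaults) would make `wait()` actually block.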