scrapesession

A requests session meant for scraping with caching, backoffs and historical fallbacks.

Dependencies :globe_with_meridians:

Python 3.11.6:

Raison D'être :thought_balloon:

scrapesession is a requests session that combines heavy caching with other scraping tools so that sites can be scraped efficiently.

Architecture :triangular_ruler:

scrapesession is a requests session that has the following properties:

  1. Handling of non-302 redirects (such as those triggered by JavaScript).
  2. Retries and backoffs.
  3. User-Agent rotations.
  4. Caching.
  5. Wayback Machine integration.
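To make the retry, backoff and User-Agent rotation ideas above concrete, here is a minimal standard-library sketch. It is not scrapesession's implementation: `fetch_with_retries`, `backoff_delays`, the `fetch` and `fallback` callables and the User-Agent strings are all hypothetical names chosen for illustration, and `fallback` merely stands in for a historical lookup such as the Wayback Machine.

```python
import itertools
import time
from typing import Callable, Dict, Optional, Sequence

# Hypothetical User-Agent strings, for illustration only.
USER_AGENTS = (
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
)


def backoff_delays(retries: int, base: float = 0.5, factor: float = 2.0) -> list:
    """Return the wait before each retry: base, base*factor, base*factor**2, ..."""
    return [base * factor**attempt for attempt in range(retries)]


def fetch_with_retries(
    fetch: Callable[[str, Dict[str, str]], str],
    url: str,
    retries: int = 3,
    user_agents: Sequence[str] = USER_AGENTS,
    sleep: Callable[[float], None] = time.sleep,
    fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Call fetch(url, headers), retrying on OSError with exponential backoff.

    Each attempt uses the next User-Agent in the rotation. If every attempt
    fails and a fallback is given, its result is returned instead.
    """
    agents = itertools.cycle(user_agents)
    delays = backoff_delays(retries)
    last_error: Optional[OSError] = None
    for attempt in range(retries + 1):
        headers = {"User-Agent": next(agents)}
        try:
            return fetch(url, headers)
        except OSError as error:  # ConnectionError, TimeoutError, ... are OSError subclasses
            last_error = error
            if attempt < retries:
                sleep(delays[attempt])
    if fallback is not None:
        return fallback(url)
    assert last_error is not None
    raise last_error
```

Passing `sleep` and `fetch` in as parameters keeps the sketch independent of any HTTP library and makes the timing easy to check without waiting.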

Installation :inbox_tray:

This is a Python package hosted on PyPI, so to install it simply run the following command:

pip install scrapesession

or install using this local repository:

python setup.py install --old-and-unmanageable

Usage example :eyes:

scrapesession is a library, so it is used entirely from code. It has exactly the same semantics as a requests session:

from scrapesession.scrapesession import create_scrape_session


session = create_scrape_session()

response = session.get("http://www.helloworld.com")
print(response.text)
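Because the session is described as having requests semantics, code written against a requests-style session should be able to take a scrape session unchanged. The sketch below assumes exactly that; `fetch_text` is a hypothetical helper, not part of scrapesession, and it relies only on `get`, `raise_for_status` and `text` as found on a standard requests session and response.

```python
from typing import Any, Mapping, Optional


def fetch_text(
    session: Any,
    url: str,
    params: Optional[Mapping[str, str]] = None,
    timeout: float = 10.0,
) -> str:
    """Fetch a URL with any requests-style session and return the body text.

    Raises whatever the session's response raises from raise_for_status()
    when the server answers with an error status.
    """
    response = session.get(url, params=params, timeout=timeout)
    response.raise_for_status()
    return response.text
```

Under that assumption, `fetch_text(create_scrape_session(), "http://www.helloworld.com")` would behave like the example above, with an explicit timeout and status check added.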

License :memo:

The project is available under the MIT License.
