scrapesession

A requests session meant for scraping with caching, backoffs and historical fallbacks.

Dependencies :globe_with_meridians:

Python 3.11.6

Raison D'être :thought_balloon:

scrapesession is a requests session that combines heavy caching with other tools, such as retries and historical fallbacks, in order to scrape sites efficiently.

Architecture :triangular_ruler:

scrapesession is a requests session that has the following properties:

  1. Handling of non-302 redirects (such as those performed in JavaScript).
  2. Retries and backoffs.
  3. User-Agent rotations.
  4. Caching.
  5. Wayback Machine integration.
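
Items 2, 3 and 5 above can be illustrated with a small standard-library sketch. This is not scrapesession's actual implementation: the function names, the default delays and the User-Agent strings are all hypothetical, and the only external fact assumed is the Wayback Machine's `https://web.archive.org/web/<timestamp>/<url>` snapshot URL format.

```python
import itertools
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")

# Hypothetical User-Agent pool, for illustration only.
USER_AGENTS = (
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh) ExampleBrowser/2.0",
)


def backoff_delays(retries: int, base: float = 0.5, factor: float = 2.0) -> list[float]:
    """Return the wait in seconds before each retry: base, base*factor, ..."""
    return [base * factor**attempt for attempt in range(retries)]


def call_with_retries(
    func: Callable[[], T],
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, sleeping with exponential backoff after each failure."""
    for delay in backoff_delays(retries):
        try:
            return func()
        except Exception:
            sleep(delay)
    # Final attempt: any exception now propagates to the caller.
    return func()


def rotating_user_agents(pool: Sequence[str] = USER_AGENTS) -> Iterator[str]:
    """Cycle through the pool forever, yielding one User-Agent per request."""
    return itertools.cycle(pool)


def wayback_url(url: str, timestamp: str) -> str:
    """Build a Wayback Machine snapshot URL (timestamp as YYYYMMDDhhmmss)."""
    return f"https://web.archive.org/web/{timestamp}/{url}"
```

In a sketch like this, a failed live fetch would be retried with growing delays, and only once the retries are exhausted would the caller fall back to the archived copy returned by `wayback_url`.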

Installation :inbox_tray:

This is a Python package hosted on PyPI, so to install it simply run the following command:

pip install scrapesession

or install from a local clone of this repository:

python setup.py install --old-and-unmanageable

Usage example :eyes:

scrapesession is a library, so it is used entirely through code. It has exactly the same semantics as a requests session:

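from scrapesession.scrapesession import create_scrape_session

# Create the session once and reuse it for every request.
session = create_scrape_session()

# Use it exactly like a requests session.
response = session.get("http://www.helloworld.com")
print(response.text)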
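
Because the session follows requests semantics, code written against a plain requests session should be able to accept it unchanged. The helper below is a minimal sketch of that idea. `fetch_text` and `SessionLike` are hypothetical names, not part of scrapesession, and the sketch assumes the returned response offers `raise_for_status()` and `text` as a requests response does.

```python
from typing import Any, Protocol


class SessionLike(Protocol):
    """Any object exposing a requests.Session-style get() method."""

    def get(self, url: str, **kwargs: Any) -> Any: ...


def fetch_text(session: SessionLike, url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its body, raising on HTTP error statuses."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text
```

With scrapesession installed, `fetch_text(create_scrape_session(), "http://www.helloworld.com")` would be expected to behave like the example above, while the same helper keeps working with an ordinary `requests.Session()`.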

License :memo:

The project is available under the MIT License.


Source Distribution

scrapesession-0.0.11.tar.gz (10.1 kB)

File details

Details for the file scrapesession-0.0.11.tar.gz.

File metadata

  • Download URL: scrapesession-0.0.11.tar.gz
  • Upload date:
  • Size: 10.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.6

File hashes

Hashes for scrapesession-0.0.11.tar.gz

Algorithm    Hash digest
SHA256       49cd3595698ae77c32fbcc35d69a7ef4b3df821a5fd3f42af072a809a8a95045
MD5          362148bc1bd71ce545a414d7b60c09c7
BLAKE2b-256  cfeac47c61bdbc998eb28373103ae066cd878c4ac80f8269157b1e02e6705fa0
