scrapesession

A requests session meant for scraping with caching, backoffs and historical fallbacks.

Dependencies :globe_with_meridians:

Python 3.11.6

Raison D'être :thought_balloon:

scrapesession is a requests session that combines heavy caching with other tools to scrape sites efficiently.

Architecture :triangular_ruler:

scrapesession is a requests session that has the following properties:

  1. Handles non-302 redirects (such as those triggered by JavaScript).
  2. Retries and backoffs.
  3. User-Agent rotations.
  4. Caching.
  5. Wayback Machine integration.
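
To make a few of these features concrete, here is a minimal standard-library sketch of an exponential backoff schedule, round-robin User-Agent rotation, and a Wayback Machine fallback URL. It is an illustration only and does not reflect how scrapesession implements them: the names `backoff_delays`, `user_agent_cycle`, `wayback_url`, the `USER_AGENTS` values, and the default delays are all assumptions.

```python
import itertools
from typing import Iterator

# Hypothetical User-Agent strings for illustration; not scrapesession defaults.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
]


def backoff_delays(retries: int, base: float = 1.0, factor: float = 2.0) -> list[float]:
    """Return the wait (in seconds) before each retry: base * factor**attempt."""
    return [base * factor**attempt for attempt in range(retries)]


def user_agent_cycle(agents: list[str]) -> Iterator[str]:
    """Yield User-Agent strings in round-robin order, indefinitely."""
    return itertools.cycle(agents)


def wayback_url(url: str, timestamp: str) -> str:
    """Build a Wayback Machine URL for the snapshot of `url` nearest `timestamp`.

    `timestamp` uses the archive's YYYYMMDDhhmmss format.
    """
    return f"https://web.archive.org/web/{timestamp}/{url}"
```

A scraper could sleep for each value in `backoff_delays(...)` between failed attempts, take `next(...)` from the User-Agent cycle for every request, and try `wayback_url(...)` once the live site keeps failing.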

Installation :inbox_tray:

This is a Python package hosted on PyPI, so to install it simply run the following command:

pip install scrapesession

or install using this local repository:

python setup.py install --old-and-unmanageable

Usage example :eyes:

scrapesession is a library, so it is used entirely through code. It has exactly the same semantics as a requests session:

from scrapesession.scrapesession import create_scrape_session


session = create_scrape_session()

response = session.get("http://www.helloworld.com")
print(response.text)
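
Because the session is described as having the same semantics as a requests session, code written against the requests interface should accept it unchanged. The sketch below assumes that is the case; `fetch_text` is a hypothetical helper, not part of scrapesession, and it relies only on the standard requests calls `get(..., timeout=...)`, `raise_for_status()` and `.text`.

```python
from typing import Any


def fetch_text(session: Any, url: str, timeout: float = 10.0) -> str:
    """Fetch `url` with any requests-compatible session and return the body.

    `session` is assumed to expose `get()` like `requests.Session` does.
    """
    response = session.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of returning an error page.
    response.raise_for_status()
    return response.text
```

Under that assumption, `fetch_text(create_scrape_session(), "http://www.helloworld.com")` would behave the same way as it does with a plain `requests.Session()`.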

License :memo:

The project is available under the MIT License.


