A library providing a requests session for web scraping.

scrapesession

A requests session meant for scraping with caching, backoffs and historical fallbacks.

Dependencies :globe_with_meridians:

Python 3.11.6:

Raison D'être :thought_balloon:

scrapesession is a requests session that combines heavy caching with other tools so that sites can be scraped efficiently.

Architecture :triangular_ruler:

scrapesession is a requests session with the following properties:

  1. Handling of non-302 redirects (such as those performed in JavaScript).
  2. Retries and backoffs.
  3. User-Agent rotation.
  4. Caching.
  5. Wayback Machine integration.
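
As an illustration of properties 2 and 3, here is a minimal sketch of retrying with exponential backoff while rotating the User-Agent header. It is not scrapesession's implementation: `USER_AGENTS`, `backoff_delays`, `fetch_with_retries` and the `fetch` callable are hypothetical names, and the delay values are assumptions chosen for the example.

```python
import itertools
import time
from typing import Callable, Dict, List

# Hypothetical User-Agent pool; not taken from scrapesession.
USER_AGENTS: List[str] = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
]


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0) -> List[float]:
    """Return delays base, 2*base, 4*base, ... with each one capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def fetch_with_retries(
    fetch: Callable[[str, Dict[str, str]], str],
    url: str,
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch(url, headers)` up to `retries + 1` times, rotating the User-Agent."""
    agents = itertools.cycle(USER_AGENTS)
    # The first attempt is immediate; each retry waits for the next backoff delay.
    delays = [0.0] + backoff_delays(retries)
    last_error: Exception = RuntimeError("no attempts made")
    for delay in delays:
        if delay:
            sleep(delay)
        headers = {"User-Agent": next(agents)}
        try:
            return fetch(url, headers)
        except OSError as exc:  # network-style failures; adjust for your HTTP client
            last_error = exc
    raise last_error
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; a real session would typically also add jitter and honour server hints such as `Retry-After`.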

Installation :inbox_tray:

This is a Python package hosted on PyPI, so to install it simply run the following command:

pip install scrapesession

Alternatively, install it from a local clone of this repository:

python setup.py install --old-and-unmanageable

Usage example :eyes:

scrapesession is a library, so it is used entirely from code. It has exactly the same semantics as a requests session:

from scrapesession.scrapesession import create_scrape_session


session = create_scrape_session()

response = session.get("http://www.helloworld.com")
print(response.text)
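
Since the session follows requests semantics, helper code can be written against the session interface and handed either a plain requests session or a scrape session. The sketch below assumes only the standard `get`, `raise_for_status` and `text` members; `SessionLike` and `fetch_text` are hypothetical names introduced for illustration, and the `timeout` argument is assumed to be passed through as it would be in requests.

```python
from typing import Any, Protocol


class SessionLike(Protocol):
    """Anything exposing a requests-style ``get`` method."""

    def get(self, url: str, **kwargs: Any) -> Any: ...


def fetch_text(session: SessionLike, url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its body, raising on HTTP error statuses."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # standard requests.Response method
    return response.text


# Usage with the real library (requires network access):
#   from scrapesession.scrapesession import create_scrape_session
#   print(fetch_text(create_scrape_session(), "http://www.helloworld.com"))
```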

License :memo:

The project is available under the MIT License.


Source Distribution

scrapesession-0.0.17.tar.gz (10.3 kB)

File details

Details for the file scrapesession-0.0.17.tar.gz.

File metadata

  • Download URL: scrapesession-0.0.17.tar.gz
  • Size: 10.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.6

File hashes

Hashes for scrapesession-0.0.17.tar.gz
Algorithm Hash digest
SHA256 2b32ba2274a7be33aeb5092c8dc0f138ef078e9b006bad9595e4b183bfdae6a8
MD5 c4c54eb1f66470c76cafb2c12014a08e
BLAKE2b-256 0f257849848bdcb9b9c97bd7fa27ae1c044b27c2d7bdb91eb42a825cb6600161
