
A requests session for web scraping.


scrapesession


A requests session meant for scraping with caching, backoffs and historical fallbacks.

Dependencies :globe_with_meridians:

Python 3.11.6:

Raison D'être :thought_balloon:

scrapesession is a requests session that caches heavily and bundles other tools for scraping sites efficiently.

Architecture :triangular_ruler:

scrapesession is a requests session with the following features:

  1. Handling of non-302 redirects (such as those performed in JavaScript).
  2. Retries and backoffs.
  3. User-Agent rotations.
  4. Caching.
  5. Wayback Machine integration.
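
To make items 2 and 3 of the list above concrete, here is a minimal sketch of exponential backoff and round-robin User-Agent rotation using only the standard library. It is not scrapesession's actual implementation: the names `backoff_delays`, `call_with_retries`, `rotating_user_agents` and the User-Agent strings are hypothetical, chosen for illustration.

```python
import itertools
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")

# Hypothetical User-Agent strings, for illustration only.
USER_AGENTS = (
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
)


def rotating_user_agents(agents: Sequence[str] = USER_AGENTS) -> Iterator[str]:
    """Yield User-Agent strings round-robin, forever."""
    return itertools.cycle(agents)


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Exponential delays: base, 2*base, 4*base, ..., each capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def call_with_retries(
    func: Callable[[], T],
    retries: int = 3,
    base: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, sleeping with exponential backoff after each failure."""
    for delay in backoff_delays(retries, base):
        try:
            return func()
        except Exception:
            sleep(delay)
    # Final attempt after all retries are used up; its exception propagates.
    return func()
```

In a scraper, `func` would typically wrap a single HTTP request, and each attempt could take its `User-Agent` header from `next(rotating_user_agents())`. Passing `sleep` as a parameter keeps the retry logic testable without real waiting.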

Installation :inbox_tray:

This is a Python package hosted on PyPI, so to install it simply run the following command:

pip install scrapesession

or install from a local clone of this repository:

python setup.py install --old-and-unmanageable

Usage example :eyes:

scrapesession is a library, so it is used entirely through code. It has exactly the same semantics as a requests session:

from scrapesession.scrapesession import create_scrape_session


session = create_scrape_session()

response = session.get("http://www.helloworld.com")
print(response.text)
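
Because the session follows requests semantics, helper code can be written against the usual `get` / `Response` interface. The sketch below assumes the scrape session accepts the standard `timeout` keyword and returns a response with the usual `ok` and `text` attributes, which should follow from the requests semantics but has not been verified against scrapesession itself. `fetch_text` is a hypothetical helper, not part of the library.

```python
from typing import Any, Optional


def fetch_text(session: Any, url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the body of `url`, or None if the server answers with an error status."""
    response = session.get(url, timeout=timeout)
    if not response.ok:  # in requests, `ok` is False for 4xx and 5xx statuses
        return None
    return response.text


# Intended use (requires scrapesession to be installed):
#   from scrapesession.scrapesession import create_scrape_session
#   text = fetch_text(create_scrape_session(), "http://www.helloworld.com")
```

Accepting the session as a parameter means the same helper works with a plain `requests.Session` or a stub object in tests.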

License :memo:

The project is available under the MIT License.

