Tools for Web-Scraping

random_scraper

This package provides simple methods for scraping data anonymously so that web servers are less likely to block your IP address. The approach is to rotate both the IP address, by routing requests through proxy servers, and the user agent over time. Both free and paid proxy servers are available online. Unfortunately, free proxies can be slow and unreliable, so some requests may fail and leave gaps in the collected data.

This package automatically collects and refreshes a list of free proxies available online. It also provides a list of user agents and a user-friendly tool for requesting a page anonymously.
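
The rotation idea described above can be sketched with the Python standard library alone. This is a minimal illustration, not the package's own API: the names pick_identity, build_request and fetch_with_rotation are hypothetical, and the proxy addresses and user-agent strings are placeholders rather than values shipped with random_scraper.

```python
import random
import urllib.error
import urllib.request

# Placeholder values for illustration only; they are NOT taken from random_scraper.
# The addresses come from the 203.0.113.0/24 documentation range and will not work.
PROXIES = ["203.0.113.10:8080", "203.0.113.11:3128"]
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
]


def pick_identity(proxies, user_agents, rng=random):
    """Choose one proxy and one user agent at random (hypothetical helper)."""
    return rng.choice(proxies), rng.choice(user_agents)


def build_request(url, proxy, user_agent):
    """Return an opener routed through `proxy` and a request carrying `user_agent`."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return opener, request


def fetch_with_rotation(url, proxies, user_agents, attempts=3, timeout=10, rng=random):
    """Try up to `attempts` random identities; return the body bytes or None."""
    for _ in range(attempts):
        proxy, user_agent = pick_identity(proxies, user_agents, rng)
        opener, request = build_request(url, proxy, user_agent)
        try:
            with opener.open(request, timeout=timeout) as response:
                return response.read()
        except OSError:
            # URLError, timeouts and connection resets are all OSError subclasses.
            # Free proxies fail often, so move on to another random identity.
            continue
    return None
```

Returning None after the last failed attempt makes the "missing data" case explicit, so the caller can decide whether to retry later or record the gap.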

Please send feedback and comments to mab2343@columbia.edu.

Next steps:

  • Write detailed documentation and examples
  • Update the request_page function to scrape AJAX websites

Note: We are not responsible for any misuse of the tools provided. Please scrape content responsibly.
