Tools for Web-Scraping

Project description

random_scraper

Tools for Webscraping

This package provides simple methods for scraping data anonymously and avoiding having your IP address blocked by web servers. The approach is to rotate both proxy servers (and therefore IP addresses) and user agents over time. Both free and paid proxy servers are available online. Unfortunately, free proxies may be slow and unreliable, which may result in missing data.

This package automatically collects and updates a list of free proxies available online. It also provides a list of user agents and a user-friendly tool for requesting a page anonymously.
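The rotation idea described above can be sketched with the Python standard library alone. This is a minimal illustration, not the package's own API: the names `pick_identity`, `build_request`, and `fetch`, the placeholder proxy addresses, and the example user-agent strings are all assumptions made for this sketch.

```python
import random
import urllib.request

# Hypothetical placeholder values, not taken from random_scraper.
# The addresses are from the reserved documentation range 203.0.113.0/24.
PROXIES = ["203.0.113.10:8080", "203.0.113.11:3128"]
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
]


def pick_identity(proxies, user_agents, rng=random):
    """Choose a random (proxy, user agent) pair for one request."""
    return rng.choice(proxies), rng.choice(user_agents)


def build_request(url, proxy, user_agent):
    """Return an opener routed through `proxy` and a Request carrying `user_agent`."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return opener, request


def fetch(url, proxies, user_agents, attempts=3, timeout=10):
    """Try up to `attempts` random identities; return the page bytes or None."""
    for _ in range(attempts):
        proxy, agent = pick_identity(proxies, user_agents)
        opener, request = build_request(url, proxy, agent)
        try:
            with opener.open(request, timeout=timeout) as response:
                return response.read()
        except OSError:
            # Covers URLError, timeouts, and refused connections:
            # the proxy is likely slow or dead, so try another identity.
            continue
    return None
```

Because free proxies fail often, the sketch retries with a fresh identity on network errors and returns `None` when every attempt fails, which is how missing data can arise. A call would look like `fetch("http://example.com/", PROXIES, USER_AGENTS)`, with the placeholder lists replaced by working proxies.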

Please send feedback and comments to mab2343@columbia.edu.

Next steps:

  • Write detailed documentation and examples
  • Update the request_page function to scrape AJAX websites

Note: We are not responsible for the wrongful usage of the tools provided. Please scrape content responsibly.

Download files

Source Distribution

random_scraper-0.0.3.tar.gz (13.2 kB view details)

Uploaded Source

Built Distribution

random_scraper-0.0.3-py3-none-any.whl (25.7 kB view details)

Uploaded Python 3

File details

Details for the file random_scraper-0.0.3.tar.gz.

File metadata

  • Download URL: random_scraper-0.0.3.tar.gz
  • Upload date:
  • Size: 13.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6

File hashes

Hashes for random_scraper-0.0.3.tar.gz
Algorithm Hash digest
SHA256 18bcef1cbe50608c3206c69bd8386a4404163f121f42c48e21a1a9b9f59346b2
MD5 ac79131798fbebb236271db5d5bba2c8
BLAKE2b-256 b4dfa0a315fc978f9e6a52295817bbeb1123c78e96dba7fcce63ceded58e220b

File details

Details for the file random_scraper-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: random_scraper-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 25.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6

File hashes

Hashes for random_scraper-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 8ec5d740c2c5260f86e41184dcf9c519971f292d142f40d5bd1338227ad5449f
MD5 18971f0bbf1b3c2b008bf33831723a94
BLAKE2b-256 1968d0646497fbd5fec5ea941d32a2520fb633596694f9a007d2fdeeeb649427
