
### WebRequest

Like the `requests` library, but shittier.

Provides convenience functions for writing web-scrapers and other web-interactive
things. Built-in support for working through CloudFlare's garbage JS browser
checks without any intervention, as well as some other [garbage web-application
firewall shits](https://sucuri.net/website-firewall/) that seem intent on
breaking the internet.

Built-in user-agent randomization. Support for fetching rendered content via
headless Chrome. Built on top of my [ChromeController](https://github.com/fake-name/ChromeController)
project, so it can avoid some of the [spectacularly stupid design
decisions](https://github.com/seleniumhq/selenium-google-code-issue-archive/issues/141) in Selenium.
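To illustrate what user-agent randomization means in practice, here is a minimal standard-library sketch. It is not WebRequest's own implementation: the `USER_AGENTS` pool and the `build_request` helper are hypothetical names made up for this example, and the header values are only illustrative.

```python
import random
import urllib.request

# Hypothetical pool of browser identities; a real pool would be larger
# and kept up to date with current browser releases.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def build_request(url, rng=random):
    """Return a urllib Request carrying a randomly chosen User-Agent.

    The other headers mimic what a desktop browser typically sends, so the
    request looks less like a bare script.
    """
    headers = {
        "User-Agent": rng.choice(USER_AGENTS),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
    }
    return urllib.request.Request(url, headers=headers)
```

Passing the result to `urllib.request.urlopen()` would send the request. Picking the identity once per session, rather than once per request, is probably closer to how a real browser behaves.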

Default support for compressed transfers.
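As a rough idea of what handling compressed transfers involves, the sketch below uses only the standard library. It is an illustration rather than WebRequest's actual code: `ACCEPT_ENCODING` and `decode_body` are hypothetical names, and only gzip and deflate are covered.

```python
import gzip
import zlib

# Value a client would send in its Accept-Encoding request header.
ACCEPT_ENCODING = "gzip, deflate"


def decode_body(raw, content_encoding):
    """Decompress a response body according to its Content-Encoding header."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(raw)
    if encoding == "deflate":
        try:
            # Standards-compliant servers send a zlib-wrapped stream.
            return zlib.decompress(raw)
        except zlib.error:
            # Some servers send a raw deflate stream with no zlib header.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # No encoding, or one this sketch does not handle: return unchanged.
    return raw
```

The fallback for raw deflate streams is there because servers disagree on what "deflate" means, which is one reason browsers and scrapers tend to prefer gzip.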

Basically, the overall goal is to have a simple library that acts *as much as
possible* like a "real" browser. Ideally, it should be indistinguishable from
an actual browser from the perspective of the remote HTTP(s) server.

Other useful bits:

API wrapper for 2captcha.com which includes automatic local reverse-proxy spinup
for reflecting recaptcha requests through your local host address. This involves
spinning up a transient SOCKS5 proxy, setting up a limited duration port-forward
(10 minutes, via UPnP), and actual synchronous captcha solving calls.
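The synchronous solving step can be sketched roughly as follows. This is not WebRequest's wrapper: `solve_recaptcha` and its arguments are hypothetical, and the endpoint and parameter names (`in.php`, `res.php`, `googlekey`, `proxy`, `proxytype`, `CAPCHA_NOT_READY`) are written from my understanding of 2captcha's public HTTP API, so check them against the current 2captcha documentation before relying on them. The proxy and UPnP setup is left out entirely; the `proxy` argument is assumed to be an already reachable `host:port`.

```python
import time
import urllib.parse
import urllib.request

API_BASE = "http://2captcha.com"


def _http_get(url, params):
    """Default transport: plain GET, returns the response body as text."""
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(url + "?" + query, timeout=30) as resp:
        return resp.read().decode("utf-8")


def solve_recaptcha(api_key, site_key, page_url, proxy=None,
                    http_get=_http_get, sleep=time.sleep,
                    poll_interval=5, max_polls=60):
    """Submit a reCAPTCHA task, then block until a token comes back."""
    params = {
        "key": api_key,
        "method": "userrecaptcha",
        "googlekey": site_key,
        "pageurl": page_url,
    }
    if proxy:
        # Ask the solver to work through our externally reachable proxy.
        params["proxy"] = proxy
        params["proxytype"] = "SOCKS5"

    reply = http_get(API_BASE + "/in.php", params)
    status, _sep, task_id = reply.partition("|")
    if status != "OK":
        raise RuntimeError("submit failed: " + reply)

    poll_params = {"key": api_key, "action": "get", "id": task_id}
    for _attempt in range(max_polls):
        sleep(poll_interval)
        reply = http_get(API_BASE + "/res.php", poll_params)
        if reply == "CAPCHA_NOT_READY":
            continue
        status, _sep, token = reply.partition("|")
        if status != "OK":
            raise RuntimeError("solve failed: " + reply)
        return token
    raise TimeoutError("captcha was not solved in time")
```

The transport and sleep functions are injectable so the polling logic can be exercised without network access or real waiting.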

The proxy setup/UPnP setup functions are generic enough that they should be
useful for any captcha-solver-related tasks. If you'd like support for another
solver site (and are willing to throw a few bucks of credit on that site my
way), I can probably add it too.


Q: Why?
A: Because I started writing horrible web-scraper things in 2008, when the
requests library wasn't really a thing.

Q: Why *still*, then?
A: Anger and spite, mostly.

Q: No, really, *why*?
A: OK, because I want to download the internet, and idiots post stuff, and then
try to "protect" it from scraping with stupid jerberscript bullshit.

## Note: If your non-interactive website requires me to execute javascript to view it, FUCK YOU, you are a horrible person who is actively ruining the internet.

License:
WTFPL




