Python library that makes web scraping very simple.
Project description
Documentation is hosted at http://learnwebscraping.com/docs. Note: Documentation is currently being written.
Simplewebscraper is a library designed to facilitate web scraping. It includes built-in support for standard web requests, proxy usage, browser cookie imports, and file downloads.
Homepage: https://github.com/alexanderward/simplewebscraper
Simple Usage - More details to come once documentation is complete.
from simplewebscraper import Browser, HTTPMethod, Scraper, ProxyPool

if __name__ == "__main__":
    # Toggle which examples to run.
    example_GET = True
    example_GET_parameters = True
    example_POST = False
    example_Proxy = False
    example_cookie_import = False

    if example_GET:
        # Plain GET request.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_GET_parameters:
        # GET request with parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.parameters = {"InData": "75791",
                                 "submit": "Search"}
        my_scraper.url = "http://www.melissadata.com/lookups/GeoCoder.asp"
        print(my_scraper.fetch())

    if example_POST:
        # POST request with form parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.POST
        my_scraper.parameters = {"email": "example@gmail.com",
                                 "pass": "samplepassword"}
        my_scraper.url = "https://www.dnsdynamic.org/auth.php"
        print(my_scraper.fetch())

    if example_Proxy:
        # GET request routed through a proxy pool.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.use_per_proxy_count = 5
        # You can also provide your own group of proxies, for example:
        # {"https": ["https://212.119.246.138:8080"], "http": []}
        my_scraper.proxy_pool = ProxyPool.Hidester
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_cookie_import:
        # GET request using cookies imported from a browser.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.cookies = Browser.Chrome  # Chrome or Firefox
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())
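For readers unfamiliar with how a parameters dictionary ends up in a GET request: HTTP clients conventionally URL-encode it into the query string of the URL. The sketch below illustrates that convention with the Python 3 standard library only. It is not taken from simplewebscraper's source, `build_get_url` is a hypothetical helper name, and whether the library encodes parameters in exactly this way is an assumption.

```python
from urllib.parse import urlencode  # Python 3 standard library


def build_get_url(base_url, parameters):
    """Append URL-encoded parameters to base_url as a query string.

    Hypothetical helper for illustration; not part of simplewebscraper.
    """
    if not parameters:
        return base_url
    # Use "&" if the URL already carries a query string, otherwise "?".
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(parameters)


# Same URL and parameters as the GET-with-parameters example above.
url = build_get_url(
    "http://www.melissadata.com/lookups/GeoCoder.asp",
    {"InData": "75791", "submit": "Search"},
)
print(url)
```

A POST request would typically send the same encoded string in the request body instead of appending it to the URL.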
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
simplewebscraper-1.42.zip (15.3 kB)
Built Distribution
simplewebscraper-1.42.win32.exe (211.6 kB)
File details
Details for the file simplewebscraper-1.42.zip.
File metadata
- Download URL: simplewebscraper-1.42.zip
- Upload date:
- Size: 15.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4f0dfe236ff5e5ca81c796fa23784d21f0e94f58cc481884a9127c0a7acec549
MD5 | 5062a0f433780729836703140458d9ce
BLAKE2b-256 | 8b62cc379cb3da4dfe3a97150090f0fcec9fb28529204e62e6223d9249ffe73a
File details
Details for the file simplewebscraper-1.42.win32.exe.
File metadata
- Download URL: simplewebscraper-1.42.win32.exe
- Upload date:
- Size: 211.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8269a7d3847eb95590a4b2de13b25e61b0053fe240d1a078bb3d549bbbdc07c4
MD5 | bfe227fe4969476fa3b64a72c8597a82
BLAKE2b-256 | eaa38bdc154593bfda03300aae3c1f7c51821659c0e27a0e5c02cf70f99a0e9c
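The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library `hashlib` module; the SHA256 values are copied from the file hash tables above, while the helper names `sha256_of_file` and `matches_published_hash` and the local path in the usage note are hypothetical.

```python
import hashlib

# SHA256 digests copied from the "File hashes" tables above.
EXPECTED_SHA256 = {
    "simplewebscraper-1.42.zip":
        "4f0dfe236ff5e5ca81c796fa23784d21f0e94f58cc481884a9127c0a7acec549",
    "simplewebscraper-1.42.win32.exe":
        "8269a7d3847eb95590a4b2de13b25e61b0053fe240d1a078bb3d549bbbdc07c4",
}


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, filename):
    """Compare a local file against the digest published for `filename`."""
    return sha256_of_file(path) == EXPECTED_SHA256[filename]
```

For example, `matches_published_hash("downloads/simplewebscraper-1.42.zip", "simplewebscraper-1.42.zip")` would return `True` only if the local copy (at an assumed path) is byte-for-byte identical to the published source distribution.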