Python library that makes web scraping very simple.
Project description
Documentation is hosted at http://learnwebscraping.com/docs. Note: Documentation is currently being written.
Simplewebscraper is a library designed to facilitate web scraping. It includes built-in support for standard web requests, proxy usage, browser cookie imports, and file downloads.
Homepage: https://github.com/alexanderward/simplewebscraper
Simple Usage - More details to come once documentation is complete.
from simplewebscraper import Browser, HTTPMethod, Scraper, ProxyPool

if __name__ == "__main__":
    # Toggle which examples to run.
    example_GET = True
    example_GET_parameters = True
    example_POST = False
    example_Proxy = False
    example_cookie_import = False

    if example_GET:
        # Plain GET request.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_GET_parameters:
        # GET request with parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.parameters = {"InData": "75791",
                                 "submit": "Search"}
        my_scraper.url = "http://www.melissadata.com/lookups/GeoCoder.asp"
        print(my_scraper.fetch())

    if example_POST:
        # POST request with form parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.POST
        my_scraper.parameters = {"email": "example@gmail.com",
                                 "pass": "samplepassword"}
        my_scraper.url = "https://www.dnsdynamic.org/auth.php"
        print(my_scraper.fetch())

    if example_Proxy:
        # GET request routed through a proxy pool.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.use_per_proxy_count = 5
        # You can also provide your own group of proxies, like this:
        # {"https": ["https://212.119.246.138:8080"], "http": []}
        my_scraper.proxy_pool = ProxyPool.Hidester
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_cookie_import:
        # GET request using cookies imported from a local browser.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.cookies = Browser.Chrome  # Chrome or Firefox
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())
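The proxy example sets `use_per_proxy_count = 5` without explaining it. One plausible reading is that each proxy serves a fixed number of requests before the scraper moves on to the next one; this reading is an assumption, not something the project description confirms. The sketch below illustrates that rotation idea on its own. `RotatingProxies`, `next_proxy`, and the proxy addresses are hypothetical names invented for illustration and are not part of simplewebscraper.

```python
from itertools import cycle


class RotatingProxies(object):
    """Hypothetical helper: hand out each proxy a fixed number of times, then rotate."""

    def __init__(self, proxies, uses_per_proxy=5):
        if not proxies:
            raise ValueError("at least one proxy is required")
        if uses_per_proxy < 1:
            raise ValueError("uses_per_proxy must be >= 1")
        self._cycle = cycle(proxies)          # endless round-robin over the list
        self._uses_per_proxy = uses_per_proxy
        self._current = next(self._cycle)
        self._used = 0

    def next_proxy(self):
        # Move to the following proxy once the current one has been used up.
        if self._used >= self._uses_per_proxy:
            self._current = next(self._cycle)
            self._used = 0
        self._used += 1
        return self._current


# Made-up addresses; each one is handed out twice before rotating.
pool = RotatingProxies(["http://10.0.0.1:8080", "http://10.0.0.2:8080"],
                       uses_per_proxy=2)
sequence = [pool.next_proxy() for _ in range(5)]
```

Here `sequence` holds the first proxy twice, the second proxy twice, and then the first proxy again, since the pool wraps around when it reaches the end of the list.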
Download files
Source Distribution
simplewebscraper-1.042.zip (11.9 kB)
Built Distribution
simplewebscraper-1.042.win32.exe (210.6 kB)
File details
Details for the file simplewebscraper-1.042.zip.
File metadata
- Download URL: simplewebscraper-1.042.zip
- Upload date:
- Size: 11.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | bc6bd8d86a15708c9f082870ef005bd38e1920bd560d018cbadda4033fba218c
MD5 | 9244f6f9961f107ea14949587a48265b
BLAKE2b-256 | 77616f7b59ee025d94e1e30fd5fd0b85263b52a8553fc264ac57a68fbb44148a
File details
Details for the file simplewebscraper-1.042.win32.exe.
File metadata
- Download URL: simplewebscraper-1.042.win32.exe
- Upload date:
- Size: 210.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | f085745f7ff60e47fd9b9a19969f688afccbe17c77bbb0efb5666b4bab95ecfe
MD5 | 55349a6602fc7bc00b871ac826917972
BLAKE2b-256 | 39418705b4ced2781e701328c76e6138eaa7d44edf813908d6f739d39d05f12d
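The digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using only the standard library `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are made up for illustration, and the file path is assumed to point at wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_digest("simplewebscraper-1.042.zip", "bc6bd8d86a15708c9f082870ef005bd38e1920bd560d018cbadda4033fba218c")` should return `True` only if the local file matches the SHA256 digest listed for the source distribution.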