Fetches an HTML page by first visiting a sequence of previous URLs. This is sometimes needed for pages that require cookies or an HTTP referrer.
Project description
Welcome to HTML-Jumping
- Author: Daniel Perez Rada <@dperezrada>
What?
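HTML-Jumping fetches an HTML page by requesting a sequence of URLs in order, so that the final request arrives the way a browser would after visiting the earlier pages. This is sometimes needed for pages that are only served when the client sends certain cookies or an HTTP referrer.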
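The idea can be illustrated with the standard library alone. The following is a minimal sketch, not HTML-Jumping's actual implementation (which depends on httplib2). The helper names `build_request` and `jump` are hypothetical, and the sketch assumes that each step's 'body' is form-encoded and that the previously visited URL is sent as the Referer.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def build_request(step, referer=None, permanent_headers=None):
    """Turn one step dict into a urllib Request (hypothetical helper)."""
    headers = dict(permanent_headers or {})
    if referer:
        # Assumption: the previous URL is reported as the referrer.
        headers['Referer'] = referer
    body = step.get('body')
    # Assumption: the body dict is sent form-encoded.
    data = urllib.parse.urlencode(body).encode('ascii') if body else None
    return urllib.request.Request(
        step['url'],
        data=data,
        headers=headers,
        method=step.get('method', 'GET'),
    )


def jump(urls, permanent_headers=None, opener=None):
    """Request every step in order and return the last response."""
    if opener is None:
        # One cookie jar is shared by all steps, so cookies set by an
        # earlier page are sent along with the later requests.
        opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(CookieJar())
        )
    referer = None
    response_headers, content = None, None
    for step in urls:
        request = build_request(step, referer, permanent_headers)
        with opener.open(request) as response:
            response_headers = dict(response.headers)
            content = response.read()
        referer = step['url']
    return response_headers, content
```

Calling `jump(urls)` with a list of step dicts like the ones in the examples below would perform real network requests. Passing a custom `opener` makes the sketch easy to exercise offline.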
Pre-requisites
You will need:
- httplib2
- SocksiPy, http://socksipy.sourceforge.net/ (only if you want to use a proxy)
To run the tests you will also need:
- lxml
Example
No proxy
    from html_jumping import HtmlJumping

    handler = HtmlJumping()

    # The URLs are requested in the order given.
    urls = [
        # First request: visit the index page.
        {
            'url': 'http://pypi.python.org/pypi',
            'method': 'GET'
        },
        # Second request: the search, with its parameters given in 'body'.
        {
            'url': 'http://pypi.python.org/pypi',
            'method': 'GET',
            'body': {
                'term': 'html_jumping',
                ':action': 'search',
                'submit': 'search'
            }
        }
    ]

    received_header, received_content = handler.get(urls)
With proxy
To send the requests through an HTTP proxy, install the SocksiPy library and pass the proxy's host and port as proxy_info.
    from html_jumping import HtmlJumping

    handler = HtmlJumping()

    urls = [
        {
            'url': 'http://pypi.python.org/pypi',
            'method': 'GET'
        },
        {
            'url': 'http://pypi.python.org/pypi',
            'method': 'GET',
            'body': {
                'term': 'html_jumping',
                ':action': 'search',
                'submit': 'search'
            }
        }
    ]

    # proxy_info gives the host and port of the HTTP proxy to use.
    received_header, received_content = handler.get(
        urls,
        proxy_info={'host': '127.0.0.1', 'port': '8081'}
    )
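For comparison, the same host and port dictionary can be turned into a proxied opener using only the standard library. This is a sketch under stated assumptions rather than what HTML-Jumping does internally: the helper names `proxy_url_from_info` and `opener_with_proxy` are hypothetical, and the proxy is assumed to be a plain HTTP proxy.

```python
import urllib.request


def proxy_url_from_info(proxy_info):
    """Format a {'host': ..., 'port': ...} dict as a proxy URL (hypothetical helper)."""
    return 'http://{host}:{port}'.format(**proxy_info)


def opener_with_proxy(proxy_info):
    """Build a urllib opener that sends HTTP and HTTPS traffic through the proxy."""
    proxy_url = proxy_url_from_info(proxy_info)
    handler = urllib.request.ProxyHandler({'http': proxy_url, 'https': proxy_url})
    return urllib.request.build_opener(handler)


# Building the opener does not contact the proxy; only opening a URL would.
opener = opener_with_proxy({'host': '127.0.0.1', 'port': '8081'})
```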
With permanent headers
This sends the ‘Accept-Language’ header with every request in the sequence.
    from html_jumping import HtmlJumping

    handler = HtmlJumping()

    urls = [
        {
            'url': 'http://pypi.python.org/pypi',
            'method': 'GET'
        },
        {
            'url': 'http://pypi.python.org/pypi',
            'method': 'GET',
            'body': {
                'term': 'html_jumping',
                ':action': 'search',
                'submit': 'search'
            }
        }
    ]

    # permanent_headers are added to every request, not only the last one.
    received_header, received_content = handler.get(
        urls,
        permanent_headers={'Accept-Language': 'es, en-cl;q=0.5'}
    )
Tests
Run:

    $ nosetests
Download files
Source Distribution
File details
Details for the file html_jumping-0.2.4.tar.gz.
File metadata
- Download URL: html_jumping-0.2.4.tar.gz
- Upload date:
- Size: 3.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8491722577a206530b09b80ea741e64fe3d96df8ac2e9c3683eb00f43a2ba5cf
MD5 | e1b850e7a3e0c175c01ea53305bddf1d
BLAKE2b-256 | 143696a45ddfca6766c29d18860072d0eb1f6b96c12eea6467e07008bda90857