a request obfuscator and web scraping toolkit
mosquito
mosquito gives you an API similar to requests and in fact uses it internally. Every HTTP request exposes several pieces of information, such as the user agent or IP address, that allow a server to identify you or your application. mosquito lets you set up multiple identities and schedules your requests across them. Each identity may consist of any number of the attributes supported by requests's session object, e.g. headers, proxies or cookies. To list all available attributes, call mosquito.available_attributes().
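To make the idea of identities more concrete, here is a minimal standard-library sketch of how attribute lists could be combined into identities and how requests could be distributed across them. It is not mosquito's implementation: the helper `build_identities` is hypothetical, and round-robin assignment is an assumption about the scheduling strategy.

```python
from itertools import cycle

# Hypothetical attribute pools; the keys mirror requests session attributes.
headers = [{'user-agent': 'linux'}, {'user-agent': 'mac'}, {'user-agent': 'windows'}]
params = [{'foo': 42}, {'bar': 13, 'baz': 37}]


def build_identities(**pools):
    """Combine attribute pools into identities, one dict per identity.

    zip() stops at the shortest pool, so the number of identities equals
    the length of the shortest attribute list.
    """
    names = list(pools)
    return [dict(zip(names, values)) for values in zip(*pools.values())]


identities = build_identities(headers=headers, params=params)

# Round-robin scheduling (an assumption): each request gets the next identity.
schedule = cycle(identities)
assigned = [next(schedule) for _ in range(5)]

for i, identity in enumerate(assigned):
    print(i, identity['headers']['user-agent'], identity['params'])
```

In this sketch three header sets and two parameter sets yield two identities, and five requests alternate between them.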
Installation
from PyPI
pip install mosquito
Usage
demo/demo.py
#!/usr/bin/env python3

# Standard library modules.

# Third party modules.

# Local modules
import mosquito
from mosquito.tests import httpbin

# Globals and constants variables.


# Register attribute callback using a decorator ...
@mosquito.attribute('headers')
def headers():
    for name in ('linux', 'mac', 'windows'):
        yield {'user-agent': name}


# ... or register attributes by hand.
mosquito.register_attributes(delay=.0, params=[{'foo': 42}, {'bar': 13, 'baz': 37}])

# Let's list all available attributes.
print('available:', mosquito.available_attributes())

with mosquito.swarm(repeat_on=(503,), max_attempts=3) as scheduler:
    # Note that the swarm uses 2 sessions only, determined by the minimum length of passed
    # attributes which is `params` in our case.
    print(f'swarm uses {len(scheduler.swarm)} sessions')

    for i in range(5):
        # `swarm` wraps requests' api and therefore supports get, post, put etc.
        # parameters passed directly to request method will overwrite such registered before
        result = scheduler.get(httpbin('/user-agent'), params=dict(bar=0))
        print(i, result.url, result.json())

    # Let's provoke an error ...
    try:
        scheduler.get(httpbin('/status/404'))
    except mosquito.MosquitoError as mre:
        print(mre)

    # ... and another one, being more obstinate this time
    try:
        scheduler.get(httpbin('/status/503'))
    except mosquito.MosquitoError as mre:
        print(mre)
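The demo passes `repeat_on=(503,)` and `max_attempts=3` to the swarm. The following sketch shows one plausible reading of those two parameters: retry only on the listed status codes, and give up after the given number of attempts. The names `send_with_retries` and `RequestFailed` are hypothetical, the network call is simulated, and mosquito's actual behaviour may differ.

```python
class RequestFailed(Exception):
    """Hypothetical stand-in for an error such as mosquito.MosquitoError."""


def send_with_retries(send, repeat_on=(503,), max_attempts=3):
    """Call send() until it succeeds or the attempts are used up.

    `send` is any zero-argument callable returning an integer status code.
    Returns a (status, attempts_used) tuple on success.
    """
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status < 400:
            return status, attempt
        if status not in repeat_on:
            # Not a retryable status: give up immediately.
            raise RequestFailed(f'status {status} after {attempt} attempt(s)')
    # Every attempt returned a retryable error status.
    raise RequestFailed(f'status {status} after {max_attempts} attempt(s)')


# Simulated server: two 503 responses, then success.
responses = iter([503, 503, 200])
print(send_with_retries(lambda: next(responses)))
```

Under this reading a 404 fails on the first attempt, while a persistent 503 fails only after all three attempts.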
Testing
Some unit tests require an httpbin instance, which is httpbin.org by default. For the sake of speed and reliability, it is recommended to run your own instance using the docker image. See hub.docker.com/r/kennethreitz/httpbin for more information.
# run httpbin server using podman (works the same with docker)
podman run -p 8080:80 kennethreitz/httpbin

# let mosquito know its location by setting an environment variable
export HTTPBIN_BASE_URL=http://localhost:8080
The tests themselves are run with:
python -m mosquito.tests
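As an illustration of how a test helper might pick up the HTTPBIN_BASE_URL setting, here is a minimal sketch. The function name `httpbin_url` and the exact default URL are assumptions; the helper shipped in mosquito.tests may work differently.

```python
import os

# Assumption: the public instance is reachable at this URL.
DEFAULT_BASE_URL = 'https://httpbin.org'


def httpbin_url(path, environ=os.environ):
    """Join `path` onto HTTPBIN_BASE_URL, falling back to the public instance."""
    base = environ.get('HTTPBIN_BASE_URL', DEFAULT_BASE_URL)
    return base.rstrip('/') + '/' + path.lstrip('/')


# Without the variable set, the public instance is used.
print(httpbin_url('/status/404', environ={}))
# With the variable set, requests go to the local container.
print(httpbin_url('/status/404', environ={'HTTPBIN_BASE_URL': 'http://localhost:8080'}))
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.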
Feedback
For feedback of any kind, open an issue at gitlab.com. Thank you for using mosquito.
mosquito \ /
\ | /
/ \ | / \
\ \|/ /
\, o^o ,/
\,/"\,/
,,,,----,{/X\},----,,,,
,,---'''' _-'{\X/}'-_ ''''---,,
/' ,-'/ \V/ \'-, '\
( ,--''/ | (_) | \''--, )
'--,,-'' | | /_\ | | ''-,,--'
/' | (_-_) | '\
/ /' \_/ '\ \
/ / (_) \ \
/ V \
/ \
/ \