mosquito
a request obfuscator and web scraping toolkit
mosquito gives you an API similar to requests and in fact uses
it internally. Every HTTP request exposes information, such as the user agent or
IP address, that allows a server to identify you or your application. mosquito lets you set
up multiple identities and schedules your requests across them. Each identity may consist of
any of the attributes supported by requests' session object, e.g. headers,
proxies or cookies. To list all available attributes, call
mosquito.available_attributes().
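To make the idea of identities concrete, here is a minimal sketch in plain Python. It is not mosquito's implementation: the attribute values, the proxy host names and the helper build_identities are all hypothetical. It assumes that the n-th value of every attribute forms the n-th identity, which matches the demo's remark that the number of sessions is set by the shortest attribute list, and that requests are handed out in simple round-robin order.

```python
from itertools import cycle, islice

# Hypothetical attribute pools, shaped like values a requests session accepts.
attributes = {
    "headers": [
        {"user-agent": "linux"},
        {"user-agent": "mac"},
        {"user-agent": "windows"},
    ],
    "proxies": [
        {"https": "http://proxy-a.example:3128"},
        {"https": "http://proxy-b.example:3128"},
    ],
}


def build_identities(pools):
    """Combine the n-th value of every attribute into the n-th identity.

    zip() stops at the shortest pool, so three header sets and two
    proxies yield two identities.
    """
    names = list(pools)
    return [dict(zip(names, values)) for values in zip(*pools.values())]


identities = build_identities(attributes)

# Round-robin scheduling (an assumption): each request gets the next identity.
schedule = list(islice(cycle(identities), 5))

for number, identity in enumerate(schedule):
    print(number, identity["headers"]["user-agent"], identity["proxies"]["https"])
```

With these pools, five requests alternate between the linux and mac identities; the windows header set is never used because there is no third proxy to pair it with.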
Installation
from PyPI
pip install mosquito
Usage
demo/demo.py
#!/usr/bin/env python3

# Standard library modules.

# Third party modules.

# Local modules
import mosquito
from mosquito.tests import httpbin

# Globals and constants.


# Register an attribute callback using a decorator ...
@mosquito.attribute('headers')
def headers():
    for name in ('linux', 'mac', 'windows'):
        yield {'user-agent': name}


# ... or register attributes by hand.
mosquito.register_attributes(delay=.0, params=[{'foo': 42}, {'bar': 13, 'baz': 37}])

# Let's list all available attributes.
print('available:', mosquito.available_attributes())

with mosquito.swarm(repeat_on=(503,), max_attempts=3) as scheduler:
    # Note that the swarm uses only 2 sessions, determined by the minimum length of the
    # passed attributes, which is `params` in our case.
    print(f'swarm uses {len(scheduler.swarm)} sessions')

    for i in range(5):
        # `swarm` wraps requests' API and therefore supports get, post, put etc.
        # Parameters passed directly to a request method override those registered before.
        result = scheduler.get(httpbin('/user-agent'), params=dict(bar=0))
        print(i, result.url, result.json())

    # Let's provoke an error ...
    try:
        scheduler.get(httpbin('/status/404'))
    except mosquito.MosquitoError as mre:
        print(mre)

    # ... and another one, being more obstinate this time.
    try:
        scheduler.get(httpbin('/status/503'))
    except mosquito.MosquitoError as mre:
        print(mre)
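The repeat_on and max_attempts arguments in the demo suggest a retry policy. The following is a minimal sketch of one plausible reading, not mosquito's actual behaviour: a status listed in repeat_on is retried with the next identity until max_attempts is reached, while any other error status fails immediately. SketchError, get_with_retries and fake_send are hypothetical names, and the transport is faked so the example runs without a network.

```python
from itertools import cycle


class SketchError(Exception):
    """Hypothetical stand-in for an error type such as mosquito.MosquitoError."""


def get_with_retries(send, identities, url, repeat_on=(503,), max_attempts=3):
    """Try `url` with successive identities until a response is acceptable.

    `send(identity, url)` is a hypothetical callable returning an HTTP
    status code. Returns the status and the number of attempts used.
    """
    pool = cycle(identities)
    status = None
    for attempt in range(1, max_attempts + 1):
        status = send(next(pool), url)
        if status < 400:
            return status, attempt
        if status not in repeat_on:
            # Assumed policy: errors outside `repeat_on` are not retried.
            raise SketchError(f"{url}: status {status} on attempt {attempt}")
    raise SketchError(f"{url}: gave up after {max_attempts} attempts, last status {status}")


def fake_send(identity, url):
    """Fake transport: 404 for one path, 503 for the 'linux' identity, else 200."""
    if url.endswith("/status/404"):
        return 404
    return 503 if identity == "linux" else 200


# The first attempt (linux) gets 503 and is retried; the second (mac) succeeds.
print(get_with_retries(fake_send, ["linux", "mac"], "/user-agent"))  # (200, 2)
```

Under this reading, the 404 request in the demo would fail on its first attempt, whereas the 503 request would be tried three times before the error is raised.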
Testing
Some unit tests require an httpbin instance, which is httpbin.org by default. For the sake of speed and reliability, it's recommended to run your own instance using the Docker image. See hub.docker.com/r/kennethreitz/httpbin for more information.
# run httpbin server using podman (works the same with docker)
podman run -p 8080:80 kennethreitz/httpbin
# let mosquito know its location by setting an environment variable
export HTTPBIN_BASE_URL=http://localhost:8080
The tests themselves are run with:
python -m mosquito.tests
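For illustration, a helper that resolves test URLs against HTTPBIN_BASE_URL could look like the sketch below. The function name httpbin_url and the https scheme of the default are assumptions; the helper actually shipped in mosquito.tests may differ.

```python
import os

# Default instance named in the text above; the https scheme is an assumption.
DEFAULT_BASE_URL = "https://httpbin.org"


def httpbin_url(path, environ=os.environ):
    """Return the URL of `path` on the configured httpbin instance.

    The base URL is read from HTTPBIN_BASE_URL and falls back to the
    public instance when the variable is not set.
    """
    base = environ.get("HTTPBIN_BASE_URL", DEFAULT_BASE_URL)
    return f"{base.rstrip('/')}/{path.lstrip('/')}"


# Pass an explicit mapping to show both cases without touching the real environment.
print(httpbin_url("/user-agent", environ={}))
print(httpbin_url("/status/503", environ={"HTTPBIN_BASE_URL": "http://localhost:8080"}))
```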
Feedback
For feedback of any kind, open an issue at gitlab.com. Thank you for using mosquito.
mosquito \ /
\ | /
/ \ | / \
\ \|/ /
\, o^o ,/
\,/"\,/
,,,,----,{/X\},----,,,,
,,---'''' _-'{\X/}'-_ ''''---,,
/' ,-'/ \V/ \'-, '\
( ,--''/ | (_) | \''--, )
'--,,-'' | | /_\ | | ''-,,--'
/' | (_-_) | '\
/ /' \_/ '\ \
/ / (_) \ \
/ V \
/ \
/ \