Rate-limit function calls based on the URL's host.
Project description
Per-host rate limiting for asynchronous web scraping
Usage
import asyncio

from host_rate_limit import RateLimiter

host_requests_per_sec = {
    # wait between 1/3 and 1/1.25 seconds between dispatching requests to www.amazon.com
    'www.amazon.com': (1.25, 3),
    # wait 1/0.33 seconds between dispatching requests to www.ebay.com
    'www.ebay.com': 0.33,
    # send 1 request per second to twitter.com
    'twitter.com': 1,
}

rate_limiter = RateLimiter(
    # send between 1 and 2 requests per second to every host
    # that is not listed in host_requests_per_sec
    default_rate_limit=(1, 2),
    host_requests_per_sec=host_requests_per_sec)


async def request(url):
    await rate_limiter.maybe_sleep(url)
    # the request would be dispatched now, e.g.:
    # response = await http_client.get(url)


async def main(urls):
    # asynchronously dispatch a batch of requests.
    # urls: List[str]
    await asyncio.gather(
        *[asyncio.create_task(request(url)) for url in urls])


# asyncio.run(main(urls))
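To make the idea behind maybe_sleep concrete, here is a minimal sketch of one way per-host rate limiting could work. It is not the host_rate_limit implementation: the class name SketchRateLimiter, the _next_slot dictionary, and the returned delay are all assumptions made for illustration. Only the constructor arguments mirror the usage above.

```python
import asyncio
import random
import time
from urllib.parse import urlsplit


class SketchRateLimiter:
    """Hypothetical per-host limiter; an illustration, not the package's code."""

    def __init__(self, default_rate_limit, host_requests_per_sec=None):
        self.default_rate_limit = default_rate_limit
        self.host_requests_per_sec = host_requests_per_sec or {}
        # host -> earliest monotonic time at which the next request may start
        self._next_slot = {}

    def _interval(self, host):
        """Seconds to leave between two requests to this host."""
        rate = self.host_requests_per_sec.get(host, self.default_rate_limit)
        if isinstance(rate, tuple):
            # a (low, high) tuple means: pick a random rate in that range
            low, high = rate
            rate = random.uniform(low, high)
        return 1.0 / rate

    async def maybe_sleep(self, url):
        """Sleep until this URL's host may be contacted; return the delay."""
        host = urlsplit(url).netloc
        now = time.monotonic()
        # take the next free slot for this host, then reserve the one after it;
        # there is no await between the read and the write, so concurrent
        # tasks cannot claim the same slot
        slot = max(now, self._next_slot.get(host, now))
        self._next_slot[host] = slot + self._interval(host)
        delay = slot - now
        if delay > 0:
            await asyncio.sleep(delay)
        return delay
```

In this sketch, requests to different hosts never wait on each other, while concurrent requests to the same host are spaced one interval apart. Returning the delay is only a convenience for inspecting the behavior.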
Download files
Source Distribution
host_rate_limit-1.0.0.tar.gz (3.3 kB)