Python asynchronous library for web scraping
Installing
pip install aioscrapy
Usage
Plain text scraping
import asyncio
import json

from aioscrapy import Client, WebTextClient, SingleSessionPool, Dispatcher, SimpleWorker


class CustomClient(Client[str, dict]):
    """Wraps a WebTextClient and parses each response body as JSON."""

    def __init__(self, client: WebTextClient):
        self._client = client

    async def fetch(self, key: str) -> dict:
        data = await self._client.fetch(key)
        return json.loads(data)


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/get'])
    client = CustomClient(WebTextClient(pool))
    worker = SimpleWorker(dispatcher, client)
    result = await worker.run()
    return result


# Run the coroutine to completion and print what the worker returned.
print(asyncio.run(main()))
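The wrapping pattern in the example above can be tried without network access or an aioscrapy install. The following is a minimal sketch that assumes only that the wrapped client exposes an `async fetch(key)` method returning text, as the example shows. `FakeTextClient`, `JsonClient`, and the `example.test` URL are hypothetical names used for illustration and are not part of aioscrapy.

```python
import asyncio
import json


class FakeTextClient:
    """Hypothetical stand-in for a text client: returns canned text per key."""

    def __init__(self, pages: dict):
        self._pages = pages

    async def fetch(self, key: str) -> str:
        return self._pages[key]


class JsonClient:
    """Same shape as CustomClient above, without the aioscrapy base class."""

    def __init__(self, client):
        self._client = client

    async def fetch(self, key: str) -> dict:
        # Delegate the download, then decode the text body as JSON.
        data = await self._client.fetch(key)
        return json.loads(data)


async def demo() -> dict:
    fake = FakeTextClient({"https://example.test/get": '{"ok": true}'})
    return await JsonClient(fake).fetch("https://example.test/get")


result = asyncio.run(demo())
print(result)
```

Because the wrapper depends only on the `fetch` coroutine of the inner client, the parsing step can be unit-tested this way before it is plugged into a real worker.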
Byte content downloading
import asyncio

from aioscrapy import Client, WebByteClient, SingleSessionPool, Dispatcher, SimpleWorker


class CustomClient(Client[str, bytes]):
    """Wraps a WebByteClient and returns each response body unchanged."""

    def __init__(self, client: WebByteClient):
        self._client = client

    async def fetch(self, key: str) -> bytes:
        data = await self._client.fetch(key)
        return data


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/image'])
    client = CustomClient(WebByteClient(pool))
    worker = SimpleWorker(dispatcher, client)
    result = await worker.run()
    return result


# The result maps each URL to the bytes downloaded from it.
data: dict = asyncio.run(main())
for url, byte_content in data.items():
    print(f"{url}: {len(byte_content)} bytes")
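A common next step after downloading byte content is writing it to disk. The sketch below assumes only what the example above shows: a dict mapping each URL to its `bytes`. `save_all` is a hypothetical helper, not an aioscrapy function, and the sample payload is made-up data standing in for a real download.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def save_all(data: dict, out_dir: Path) -> list:
    """Write each URL's bytes into out_dir, naming the file after the URL path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for url, content in data.items():
        # Use the last path segment as the file name; fall back to "index".
        name = Path(urlparse(url).path).name or "index"
        target = out_dir / name
        target.write_bytes(content)
        written.append(target)
    return written


# Made-up payload in place of a real download result.
sample = {"https://httpbin.org/image": b"\x89PNG fake bytes"}
with tempfile.TemporaryDirectory() as tmp:
    paths = save_all(sample, Path(tmp))
    names = [p.name for p in paths]
    sizes = [p.stat().st_size for p in paths]
print(names, sizes)
```

Note that two URLs ending in the same path segment would overwrite each other in this sketch; a real script would likely need a collision-safe naming scheme.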