Python asynchronous library for web scraping

License: MIT

Installing

pip install aioscrapy

Usage

Plain text scraping

import asyncio
import json

from aioscrapy import Client, WebTextClient, SingleSessionPool, Dispatcher, SimpleWorker


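class CustomClient(Client[str, dict]):
    """Wraps a WebTextClient and decodes each response body as JSON."""

    def __init__(self, client: WebTextClient):
        self._client = client

    async def fetch(self, key: str) -> dict:
        # key is the URL to fetch; the wrapped client returns the body as text.
        data = await self._client.fetch(key)
        return json.loads(data)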


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/get'])
    client = CustomClient(WebTextClient(pool))
    worker = SimpleWorker(dispatcher, client)

    result = await worker.run()
    return result

# asyncio.run() creates an event loop, runs main() to completion, and closes the loop.
print(asyncio.run(main()))
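
The wrapper pattern used by CustomClient above can be tried without network access or aioscrapy installed. The following is a minimal sketch under stated assumptions: FakeTextClient, JsonClient, fetch_all and demo are hypothetical names introduced for illustration and are not part of aioscrapy. fetch_all only stands in for the Dispatcher and SimpleWorker pair and does not claim to reproduce how the library schedules work.

```python
import asyncio
import json


class FakeTextClient:
    """Hypothetical stand-in for WebTextClient: returns canned text per key."""

    def __init__(self, pages: dict):
        self._pages = pages

    async def fetch(self, key: str) -> str:
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        return self._pages[key]


class JsonClient:
    """Same shape as CustomClient: decode the wrapped client's text as JSON."""

    def __init__(self, client):
        self._client = client

    async def fetch(self, key: str) -> dict:
        data = await self._client.fetch(key)
        return json.loads(data)


async def fetch_all(client, keys):
    # Hypothetical helper: fetch every key concurrently, then map key -> result.
    results = await asyncio.gather(*(client.fetch(k) for k in keys))
    return dict(zip(keys, results))


def demo():
    pages = {"a": '{"n": 1}', "b": '{"n": 2}'}
    client = JsonClient(FakeTextClient(pages))
    return asyncio.run(fetch_all(client, list(pages)))


if __name__ == "__main__":
    print(demo())
```

Because the decoding step lives in the wrapper, the inner client can be swapped for a stub like this one when testing parsing logic.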

Byte content downloading

import asyncio

from aioscrapy import Client, WebByteClient, SingleSessionPool, Dispatcher, SimpleWorker


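class CustomClient(Client[str, bytes]):
    """Wraps a WebByteClient and returns each response body as raw bytes."""

    def __init__(self, client: WebByteClient):
        self._client = client

    async def fetch(self, key: str) -> bytes:
        # No decoding step: pass the bytes through unchanged.
        return await self._client.fetch(key)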


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/image'])
    client = CustomClient(WebByteClient(pool))
    worker = SimpleWorker(dispatcher, client)

    result = await worker.run()
    return result

# The result maps each URL to its downloaded bytes.
data: dict = asyncio.run(main())
for url, byte_content in data.items():
    print(f"{url}: {len(byte_content)} bytes")
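
A common next step is to write the downloaded bytes to disk. The sketch below assumes the result is a dict of URL to bytes, as the loop above suggests. save_all is a hypothetical helper, not part of aioscrapy, and naming files by a hash of the URL is just one possible choice that avoids unsafe characters in file names.

```python
import hashlib
from pathlib import Path


def save_all(data: dict, directory: str) -> dict:
    """Write each URL's bytes to its own file and return a URL -> path mapping."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for url, content in data.items():
        # Derive a filesystem-safe name from the URL (first 16 hex chars of SHA-256).
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".bin"
        path = out_dir / name
        path.write_bytes(content)
        written[url] = path
    return written
```

Calling save_all(data, "downloads") after the loop above would place one file per URL in a downloads directory.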
