Python asynchronous library for web scraping

License: MIT

Installing

pip install aioscrapy

Usage

Plain text scraping

import asyncio
import json

from aioscrapy import Client, WebTextClient, SingleSessionPool, Dispatcher, SimpleWorker


class CustomClient(Client[str, dict]):
    def __init__(self, client: WebTextClient):
        self._client = client

    async def fetch(self, key: str) -> dict:
        data = await self._client.fetch(key)
        return json.loads(data)


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/get'])
    client = CustomClient(WebTextClient(pool))
    worker = SimpleWorker(dispatcher, client)

    result = await worker.run()
    return result

# asyncio.run() creates the event loop, runs main(), and closes the loop (Python 3.7+).
print(asyncio.run(main()))
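
The wrapper pattern in the example (a client that delegates to an inner client and post-processes what it returns) can be tried out without network access. The sketch below is only an illustration and does not import aioscrapy: `StubTextClient`, `JsonClient`, and `fetch_all` are hypothetical names, and the one assumption carried over from the example is that the inner client exposes `async fetch(key) -> str`.

```python
import asyncio
import json


class StubTextClient:
    """Hypothetical stand-in for a text client: returns canned responses."""

    def __init__(self, pages: dict):
        self._pages = pages

    async def fetch(self, key: str) -> str:
        # Yield to the event loop once, as a real network call would.
        await asyncio.sleep(0)
        return self._pages[key]


class JsonClient:
    """Wraps any object with an async fetch(key) -> str and decodes JSON."""

    def __init__(self, client):
        self._client = client

    async def fetch(self, key: str) -> dict:
        text = await self._client.fetch(key)
        return json.loads(text)


async def fetch_all(client, keys):
    # Fetch every key concurrently and map each key to its decoded result.
    results = await asyncio.gather(*(client.fetch(k) for k in keys))
    return dict(zip(keys, results))


pages = {"a": '{"n": 1}', "b": '{"n": 2}'}
decoded = asyncio.run(fetch_all(JsonClient(StubTextClient(pages)), list(pages)))
print(decoded)
```

Keeping the decoding step in its own wrapper means the same logic can be checked against canned strings first and then pointed at a real text client.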

Byte content downloading

import asyncio

from aioscrapy import Client, WebByteClient, SingleSessionPool, Dispatcher, SimpleWorker


class CustomClient(Client[str, bytes]):
    def __init__(self, client: WebByteClient):
        self._client = client

    async def fetch(self, key: str) -> bytes:
        data = await self._client.fetch(key)
        return data


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/image'])
    client = CustomClient(WebByteClient(pool))
    worker = SimpleWorker(dispatcher, client)

    result = await worker.run()
    return result

# asyncio.run() creates the event loop, runs main(), and closes the loop (Python 3.7+).
data: dict = asyncio.run(main())
for url, byte_content in data.items():
    print(f"{url}: {len(byte_content)} bytes")
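
Since the example iterates over the result as a mapping of URL to byte content, a natural next step is writing each payload to disk. The following is a minimal sketch using only the standard library; `save_all` is a hypothetical helper (not part of aioscrapy), the `.bin` suffix and hashed file names are arbitrary choices, and `sample` is placeholder data standing in for the dict returned by `main()`.

```python
import hashlib
import tempfile
from pathlib import Path


def save_all(results: dict, out_dir: Path) -> dict:
    """Write each URL's bytes into out_dir and return a {url: path} mapping."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = {}
    for url, content in results.items():
        # Hash the URL so the file name is filesystem-safe and stable per URL.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".bin"
        path = out_dir / name
        path.write_bytes(content)
        saved[url] = path
    return saved


# Placeholder data standing in for the {url: bytes} dict produced by main().
sample = {"https://httpbin.org/image": b"\x89PNG fake bytes"}
with tempfile.TemporaryDirectory() as tmp:
    saved = save_all(sample, Path(tmp))
    # Read the sizes back while the temporary directory still exists.
    sizes = {url: path.stat().st_size for url, path in saved.items()}
print(sizes)
```

Hashing the URL avoids problems with characters such as `/` and `?` in file names, at the cost of less readable names; the returned mapping keeps the link back to the original URL.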

Download files

Source Distribution

aioscrapy-0.1.6.tar.gz (5.5 kB view details)

Uploaded Source

Built Distribution

aioscrapy-0.1.6-py3-none-any.whl (7.6 kB view details)

Uploaded Python 3

File details

Details for the file aioscrapy-0.1.6.tar.gz.

File metadata

  • Download URL: aioscrapy-0.1.6.tar.gz
  • Upload date:
  • Size: 5.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.25.0 CPython/3.7.4

File hashes

Hashes for aioscrapy-0.1.6.tar.gz
Algorithm      Hash digest
SHA256         483495696e2351856d71baa0286cfdfed21aa97b9bb1782012d1744b69e3918e
MD5            b6e7757c50c51e7995845dcc2dabfc42
BLAKE2b-256    851d95c0af60f5c3ba4a56ca023c0d22ad501974cb01d4b132d140b91be21ef6

File details

Details for the file aioscrapy-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: aioscrapy-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 7.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.25.0 CPython/3.7.4

File hashes

Hashes for aioscrapy-0.1.6-py3-none-any.whl
Algorithm      Hash digest
SHA256         fca23fc43a73f72464811a6f6f03db3de98606ac715a2a6f79f912e8c35d6d8f
MD5            34824f879295c7cdd5cd34fde5c8114b
BLAKE2b-256    3907648342c67d07065590b3187488a14e7dee56d47981f0f1e2981c4c7080d0
