Python asynchronous library for web scraping

License: MIT

Installing

pip install aioscrapy

Usage

Plain text scraping

import asyncio
import json

from aioscrapy import Client, WebTextClient, SingleSessionPool, Dispatcher, SimpleWorker


class CustomClient(Client[str, dict]):
    def __init__(self, client: WebTextClient):
        self._client = client

    async def fetch(self, key: str) -> dict:
        data = await self._client.fetch(key)
        return json.loads(data)


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/get'])
    client = CustomClient(WebTextClient(pool))
    worker = SimpleWorker(dispatcher, client)

    result = await worker.run()
    return result

# asyncio.run() creates an event loop, runs main() to completion, then closes the loop.
print(asyncio.run(main()))
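
The wrapping pattern used above (a client that delegates to an inner client and post-processes what it returns) can be tried without network access. The sketch below deliberately does not import aioscrapy: `FakeTextClient`, `JsonClient`, `fetch_all` and `demo` are hypothetical names introduced only for illustration, and the inner client is assumed to expose nothing more than an async `fetch(key)` that returns text.

```python
import asyncio
import json


class FakeTextClient:
    """Hypothetical stand-in for a text client: serves canned text, no network."""

    def __init__(self, pages: dict):
        self._pages = pages

    async def fetch(self, key: str) -> str:
        await asyncio.sleep(0)  # yield to the event loop, as a real request would
        return self._pages[key]


class JsonClient:
    """Wraps any object with an async fetch(key) -> str and decodes the text as JSON."""

    def __init__(self, client):
        self._client = client

    async def fetch(self, key: str) -> dict:
        data = await self._client.fetch(key)
        return json.loads(data)


async def fetch_all(client, keys):
    # Fetch every key concurrently and map each key to its decoded result.
    results = await asyncio.gather(*(client.fetch(key) for key in keys))
    return dict(zip(keys, results))


def demo():
    pages = {"page-a": '{"id": 1}', "page-b": '{"id": 2}'}
    client = JsonClient(FakeTextClient(pages))
    return asyncio.run(fetch_all(client, list(pages)))


if __name__ == "__main__":
    print(demo())
```

Because the wrapper depends only on the inner client's `fetch` method, the parsing logic can be unit tested with a fake like this one before a real web client is plugged in.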

Byte content downloading

import asyncio

from aioscrapy import Client, WebByteClient, SingleSessionPool, Dispatcher, SimpleWorker


class CustomClient(Client[str, bytes]):
    def __init__(self, client: WebByteClient):
        self._client = client

    async def fetch(self, key: str) -> bytes:
        data = await self._client.fetch(key)
        return data


async def main():
    pool = SingleSessionPool()
    dispatcher = Dispatcher(['https://httpbin.org/image'])
    client = CustomClient(WebByteClient(pool))
    worker = SimpleWorker(dispatcher, client)

    result = await worker.run()
    return result

# asyncio.run() creates an event loop, runs main() to completion, then closes the loop.
data: dict = asyncio.run(main())
for url, byte_content in data.items():
    print(f"{url}: {len(byte_content)} bytes")
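
A common next step is to write the downloaded bytes to disk. The sketch below assumes only what the loop above suggests, namely that the result is a dict mapping each URL to its byte content. `save_downloads` and `demo` are hypothetical helpers, not part of aioscrapy, and the data in `demo` is fabricated so that the example runs offline.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def save_downloads(data: dict, directory: Path) -> dict:
    """Write each URL's bytes to a file in `directory`; return a url -> path mapping."""
    directory.mkdir(parents=True, exist_ok=True)
    saved = {}
    for index, (url, content) in enumerate(data.items()):
        # Use the last path segment as the file name, falling back to a
        # generic name when the URL path is empty (e.g. "https://host/").
        name = Path(urlparse(url).path).name or "download"
        # Prefix the index so two URLs with the same last segment do not collide.
        path = directory / f"{index}_{name}"
        path.write_bytes(content)
        saved[url] = path
    return saved


def demo():
    # Placeholder data standing in for the dict produced by the example above.
    fake = {"https://example.com/image": b"\x89PNG", "https://example.com/": b"abc"}
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_downloads(fake, Path(tmp))
        return {url: (path.name, path.read_bytes()) for url, path in saved.items()}


if __name__ == "__main__":
    for url, (name, content) in demo().items():
        print(f"{url} -> {name} ({len(content)} bytes)")
```

In real use, the dict returned by `main()` would be passed to `save_downloads` together with a target directory of your choice.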
