aioscraper

Asynchronous framework for building modular and scalable web scrapers.

Beta notice: APIs and behavior may change; expect sharp edges while things settle.

Key Features

  • Async-first core with pluggable HTTP backends (aiohttp/httpx) and aiojobs scheduling
  • Declarative flow: requests → callbacks → pipelines, with middleware hooks
  • Priority queueing plus configurable concurrency
  • Small, explicit API that is easy to test and compose
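To make the "priority queueing plus configurable concurrency" idea concrete, here is a minimal sketch in plain asyncio. It is not aioscraper's scheduler and uses none of its API: `run_prioritized`, the `(priority, name)` job shape, and the `concurrency` parameter are all hypothetical names chosen for illustration, and the HTTP call is replaced by a no-op sleep.

```python
import asyncio
import itertools


async def run_prioritized(jobs, concurrency=2):
    """Process (priority, name) jobs with at most `concurrency` workers.

    Lower priority numbers are picked up first. Returns the job names in
    the order workers took them off the queue.
    """
    queue = asyncio.PriorityQueue()
    counter = itertools.count()  # tie-breaker keeps equal priorities FIFO
    for priority, name in jobs:
        queue.put_nowait((priority, next(counter), name))

    started = []

    async def worker():
        while True:
            try:
                _, _, name = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # nothing left to do
            started.append(name)
            await asyncio.sleep(0)  # stand-in for the real HTTP request

    # The number of workers is the concurrency limit.
    await asyncio.gather(*(worker() for _ in range(concurrency)))
    return started


if __name__ == "__main__":
    jobs = [(5, "low"), (1, "high"), (3, "mid")]
    print(asyncio.run(run_prioritized(jobs, concurrency=2)))
```

The worker count bounds how many requests are in flight, while the queue decides which request goes next; the two knobs are independent.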

Getting started

Install

pip install "aioscraper[aiohttp]"
# or use httpx as the HTTP backend
pip install "aioscraper[httpx]"

Create scraper.py:

from aioscraper import AIOScraper, Request, Response, SendRequest

scraper = AIOScraper()


# Entry point: registered on the scraper by the decorator; sends the first request.
@scraper
async def scrape(send_request: SendRequest):
    await send_request(Request(url="https://example.com", callback=handle_response))


# Callback passed to the Request above; receives the Response for that request.
async def handle_response(response: Response):
    print(f"Fetched {response.url} with status {response.status}")

Run it

aioscraper scraper

Documentation

Why aioscraper?

  • Scrapy is mature but tied to Twisted and a heavier, older stack. aioscraper is plain asyncio with modern typing and explicit control flow.
  • Less magic: declarative Request → callback → pipeline without opaque spider classes; each piece is a normal function or typed class, simple to test and mock.
  • Light footprint: pluggable HTTP backend (aiohttp/httpx), no global settings or hidden state, no vendor lock-in.
  • Built for modern workloads: high-volume API/JSON crawling, fanning out to microservice endpoints, quick data collection jobs where you want async throughput without a large framework.
  • Easy to embed: runs inside existing async apps (FastAPI, workers, cron jobs) without adapting to a separate runtime.
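Because callbacks are ordinary async functions, they can be exercised without a network or a running scraper. The sketch below assumes a callback that only reads `url` and `status` from its argument, as in the getting-started example; `make_fake_response` is a hypothetical helper, not part of aioscraper, and the callback returns its message instead of printing it so the test can assert on it.

```python
import asyncio
from types import SimpleNamespace


async def handle_response(response):
    """Callback under test: summarises a response-like object."""
    return f"Fetched {response.url} with status {response.status}"


def make_fake_response(url, status=200):
    # Hypothetical stand-in exposing only the attributes the callback reads.
    return SimpleNamespace(url=url, status=status)


def test_handle_response():
    fake = make_fake_response("https://example.com", status=404)
    message = asyncio.run(handle_response(fake))
    assert message == "Fetched https://example.com with status 404"


if __name__ == "__main__":
    test_handle_response()
    print("ok")
```

A duck-typed stub like this keeps the test independent of the HTTP backend; a real project might prefer to construct the library's own response type if that is practical.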

Use cases

  • Collecting data from many JSON/REST endpoints concurrently
  • Fan-out calls inside microservices to hydrate/cache data
  • Lightweight scraping jobs that should be easy to test and ship (no big framework overhead)
  • Benchmarks show stable throughput across CPython 3.11–3.14 (see the benchmarks)
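The fan-out use case can be sketched with the standard library alone. This is an illustration of the pattern, not aioscraper code: `fetch_json` is a placeholder that fabricates a payload instead of making an HTTP call, and `fan_out`, `limit`, and the endpoint paths are assumed names.

```python
import asyncio


async def fetch_json(endpoint):
    # Placeholder for a real HTTP call; returns a fake payload.
    await asyncio.sleep(0)
    return {"endpoint": endpoint, "ok": True}


async def fan_out(endpoints, limit=3):
    """Fetch all endpoints concurrently, with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(endpoint):
        async with semaphore:
            return await fetch_json(endpoint)

    # gather returns results in the same order as its inputs.
    return await asyncio.gather(*(bounded(e) for e in endpoints))


if __name__ == "__main__":
    results = asyncio.run(fan_out(["/users", "/orders", "/stock"], limit=2))
    print([r["endpoint"] for r in results])
```

Since `asyncio.gather` preserves input order, the caller can zip results back to the endpoints that produced them even when the calls finish out of order.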

Contributing

Please see the Contributing guide for workflow, tooling, and review expectations.

License

MIT License

Copyright (c) 2025 darkstussy


