Python SDK for the Reader API

reader-py

Python SDK for the Reader API — content extraction for LLMs. Wraps POST /v1/read, parses responses into Pydantic models, raises typed exceptions, and auto-polls async jobs to completion.

Version: 0.2.0 · Python: 3.9+

Install

pip install reader-py

Quick start (sync)

import os
from reader_py import ReaderClient

reader = ReaderClient(api_key=os.environ["READER_KEY"])

result = reader.read(url="https://example.com")
if result.kind == "scrape":
    print(result.data.markdown)

Quick start (async)

import asyncio
import os
from reader_py import AsyncReaderClient

async def main():
    async with AsyncReaderClient(api_key=os.environ["READER_KEY"]) as reader:
        result = await reader.read(url="https://example.com")
        if result.kind == "scrape":
            print(result.data.markdown)

asyncio.run(main())

reader.read(...) returns a Pydantic discriminated union, tagged by the kind field:

  • ScrapeReadResult(kind="scrape", data=ScrapeResult) — single-URL requests, returned immediately
  • JobReadResult(kind="job", data=Job) — batch and crawl requests, auto-polled to completion
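Both variants can be handled by branching on kind. The following is a minimal sketch that uses stdlib dataclasses as stand-ins for the SDK's Pydantic models, so it runs without reader_py installed. Only kind and data.markdown appear in this README; the Job fields (id, pages) and the collect_markdown helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Union


# Stand-ins for the SDK's Pydantic models, so the sketch runs without reader_py.
@dataclass
class ScrapeResult:
    markdown: str


@dataclass
class Job:
    id: str  # hypothetical field name
    pages: List[ScrapeResult] = field(default_factory=list)  # hypothetical field name


@dataclass
class ScrapeReadResult:
    data: ScrapeResult
    kind: Literal["scrape"] = "scrape"


@dataclass
class JobReadResult:
    data: Job
    kind: Literal["job"] = "job"


ReadResult = Union[ScrapeReadResult, JobReadResult]


def collect_markdown(result: ReadResult) -> List[str]:
    """Return every markdown document in a result, whichever variant it is."""
    if result.kind == "scrape":
        return [result.data.markdown]
    if result.kind == "job":
        return [page.markdown for page in result.data.pages]
    raise ValueError(f"unexpected result kind: {result.kind!r}")
```

Checking kind first lets a type checker narrow result.data to the matching model before any field is accessed.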

Features

  • Sync and async clients. ReaderClient (blocking, backed by httpx.Client) and AsyncReaderClient (backed by httpx.AsyncClient) expose the same method surface.
  • Typed errors for all 11 Reader error codes. InsufficientCreditsError, RateLimitedError, UrlBlockedError, ScrapeTimeoutError, and more. Each subclass exposes the relevant fields (e.g. err.required, err.retry_after_seconds).
  • Automatic retries with exponential backoff for transient codes. Honors the Retry-After header on 429.
  • Pagination-aware job collection. wait_for_job() returns the full job with every page result.
  • SSE streaming. for event in reader.stream(job_id) (sync) or async for (async) yields ProgressEvent / PageEvent / ErrorEvent / DoneEvent.
  • Pydantic models everywhere — all responses are parsed into typed models with IDE autocomplete.
  • Request ID tracing. Every error carries the x-request-id header value on err.request_id for support tickets.
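The retry behavior listed above can be pictured as a delay schedule. This is a minimal sketch and not the SDK's actual implementation: the function name, base delay, and cap are assumptions chosen for illustration, and the Retry-After value is assumed to arrive as a number of seconds.

```python
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,  # assumed base delay in seconds
    cap: float = 30.0,  # assumed upper bound in seconds
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # A server-provided Retry-After takes precedence over the schedule.
        return max(0.0, float(retry_after))
    # Exponential growth: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * (2 ** attempt))
```

Retry loops of this kind often add random jitter to the computed delay so that many clients do not retry at the same instant; it is left out here to keep the schedule deterministic.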

Browser Sessions

Launch a stealthed Chrome and connect Playwright:

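from playwright.sync_api import sync_playwright

# Start a remote browser session; ws_endpoint is its CDP WebSocket URL.
session = reader.sessions.create()

with sync_playwright() as p:
    # Attach Playwright to the already-running remote Chrome over CDP.
    browser = p.chromium.connect_over_cdp(session.ws_endpoint)
    page = browser.contexts[0].new_page()
    page.goto("https://example.com")
    print(page.title())
    browser.close()

# Stop the session once the work is done.
reader.sessions.stop(session.session_id)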

Async:

session = await reader.sessions.create()
# ... use async playwright ...
await reader.sessions.stop(session.session_id)

Methods: reader.sessions.create(), .get(id), .stop(id), .list()
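If the Playwright code raises, the session is never stopped. One way to guard against that is a small context manager around sessions.create() and sessions.stop(). This is a sketch under the assumption that leaving a session running is undesirable; managed_session is a hypothetical helper, not part of the SDK.

```python
from contextlib import contextmanager


@contextmanager
def managed_session(reader):
    """Create a browser session and stop it on exit, even if the body raises."""
    session = reader.sessions.create()
    try:
        yield session
    finally:
        # Runs on normal exit and on exceptions alike.
        reader.sessions.stop(session.session_id)
```

Used as `with managed_session(reader) as session:`, the body can pass session.ws_endpoint to connect_over_cdp exactly as in the example above.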

Errors

from reader_py import (
    ReaderApiError,
    InsufficientCreditsError,
    RateLimitedError,
    UrlBlockedError,
)

try:
    reader.read(url=url)
except InsufficientCreditsError as err:
    print(f"Need {err.required}, have {err.available}")
except RateLimitedError as err:
    print(f"Retry after {err.retry_after_seconds}s")
except UrlBlockedError as err:
    print(f"Blocked: {err.reason}")
except ReaderApiError as err:
    print(f"[{err.code}] {err} — see {err.docs_url}")

ReaderError is re-exported as an alias for ReaderApiError so code written against the 0.1 SDK continues to work. New code should use ReaderApiError.

Full catalog of error codes: https://reader.dev/docs/home/concepts/errors
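For logging or support tickets, the code, request_id, and docs_url attributes mentioned in this README can be folded into one line. A minimal sketch: describe_error is a hypothetical helper, and it reads the attributes with getattr so that it also runs without the SDK installed and tolerates exceptions that lack them.

```python
def describe_error(err: Exception) -> str:
    """Build a one-line log message from a Reader error."""
    code = getattr(err, "code", "unknown")
    request_id = getattr(err, "request_id", None)
    docs_url = getattr(err, "docs_url", None)
    parts = [f"[{code}] {err}"]
    if request_id:
        # The request ID is what support needs to trace the call.
        parts.append(f"request_id={request_id}")
    if docs_url:
        parts.append(f"docs={docs_url}")
    return " ".join(parts)
```

Calling it from the final `except ReaderApiError as err:` branch keeps the request ID next to the error code in the logs.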

Development

python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
