Python SDK for the Reader API
reader-py
Python SDK for the Reader API — content extraction for LLMs. Wraps POST /v1/read, parses responses into Pydantic models, raises typed exceptions, and auto-polls async jobs to completion.
Version: 0.2.0 · Python: 3.9+
Install
pip install reader-py
Quick start (sync)
import os
from reader_py import ReaderClient
reader = ReaderClient(api_key=os.environ["READER_KEY"])
result = reader.read(url="https://example.com")
if result.kind == "scrape":
    print(result.data.markdown)
Quick start (async)
import asyncio
import os
from reader_py import AsyncReaderClient
async def main():
    async with AsyncReaderClient(api_key=os.environ["READER_KEY"]) as reader:
        result = await reader.read(url="https://example.com")
        if result.kind == "scrape":
            print(result.data.markdown)

asyncio.run(main())
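To read several URLs concurrently with the async client, one option is to fan out read() calls and cap the number in flight with a semaphore. The sketch below is not part of the SDK: read_many and its limit parameter are hypothetical, and it assumes only that the client exposes an awaitable read(url=...) as in the quick start. A stand-in client keeps the example runnable without an API key.

```python
import asyncio


async def read_many(reader, urls, limit=5):
    """Read several URLs concurrently, at most `limit` at a time.

    `reader` is any object with an awaitable read(url=...) method,
    such as AsyncReaderClient. Hypothetical helper, not an SDK function.
    """
    semaphore = asyncio.Semaphore(limit)

    async def read_one(url):
        async with semaphore:
            return await reader.read(url=url)

    # gather preserves input order; return_exceptions=True returns
    # failures in place instead of raising the first one.
    return await asyncio.gather(
        *(read_one(url) for url in urls), return_exceptions=True
    )


class FakeReader:
    """Stand-in client used only to make this sketch runnable offline."""

    async def read(self, url):
        await asyncio.sleep(0)
        return f"read:{url}"


demo_results = asyncio.run(
    read_many(
        FakeReader(),
        ["https://example.com/a", "https://example.com/b"],
        limit=2,
    )
)
print(demo_results)
```

With the real client, each entry in the returned list would be either a read result or the exception raised for that URL, so callers should check the type before using it.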
reader.read(...) returns a discriminated union (Pydantic):
- ScrapeReadResult(kind="scrape", data=ScrapeResult): single-URL requests, returned immediately
- JobReadResult(kind="job", data=Job): batch and crawl requests, auto-polled to completion
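Because the result is a union tagged by kind, calling code usually branches on that field once and then works with the matching data shape. The following is a minimal sketch using dataclass stand-ins rather than the SDK's Pydantic models. Only kind, data, and data.markdown come from the text above; the Job fields id and pages are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Union


# Stand-in types that mirror only the `kind` / `data` shape described
# above. The real SDK models are Pydantic classes; the Job fields here
# (`id`, `pages`) are assumptions made for illustration.
@dataclass
class ScrapeResult:
    markdown: str


@dataclass
class Job:
    id: str
    pages: List[ScrapeResult] = field(default_factory=list)


@dataclass
class ScrapeReadResult:
    data: ScrapeResult
    kind: Literal["scrape"] = "scrape"


@dataclass
class JobReadResult:
    data: Job
    kind: Literal["job"] = "job"


ReadResult = Union[ScrapeReadResult, JobReadResult]


def collect_markdown(result: ReadResult) -> List[str]:
    """Return the markdown of every page, whichever variant came back."""
    if result.kind == "scrape":
        return [result.data.markdown]
    if result.kind == "job":
        return [page.markdown for page in result.data.pages]
    raise ValueError(f"unexpected result kind: {result.kind!r}")


single = ScrapeReadResult(data=ScrapeResult(markdown="# Home"))
batch = JobReadResult(
    data=Job(id="job_1", pages=[ScrapeResult("# A"), ScrapeResult("# B")])
)
print(collect_markdown(single))
print(collect_markdown(batch))
```

Raising on an unknown kind makes a future third variant fail loudly instead of being silently ignored.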
Features
- Sync and async clients: ReaderClient (blocking, backed by httpx.Client) and AsyncReaderClient (backed by httpx.AsyncClient). Same method surface.
- Typed errors for all 11 Reader error codes. InsufficientCreditsError, RateLimitedError, UrlBlockedError, ScrapeTimeoutError, and more. Each subclass exposes the relevant fields (e.g. err.required, err.retry_after_seconds).
- Automatic retries with exponential backoff for transient codes. Honors the Retry-After header on 429.
- Pagination-aware job collection. wait_for_job() returns the full job with every page result.
- SSE streaming. for event in reader.stream(job_id) (sync) or async for (async) yields ProgressEvent / PageEvent / ErrorEvent / DoneEvent.
- Pydantic models everywhere: all responses are parsed into typed models with IDE autocomplete.
- Request ID tracing. Every error carries the x-request-id header value on err.request_id for support tickets.
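The retry behavior in the list above (exponential backoff, with Retry-After taking precedence on 429) can be illustrated with a small delay calculation. This is a sketch of the general technique only. The SDK's real schedule is not documented here, so backoff_delay and its base, cap, and jitter parameters are hypothetical.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,
    cap: float = 30.0,
    jitter: float = 0.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    If the server sent a Retry-After value, it takes precedence;
    otherwise the delay doubles on each attempt, up to `cap`.
    The default values are illustrative, not the SDK's settings.
    """
    if retry_after is not None:
        return max(0.0, retry_after)
    delay = min(cap, base * (2 ** attempt))
    # Optional jitter spreads out clients that fail at the same moment.
    return delay + random.uniform(0.0, jitter)


print([backoff_delay(n) for n in range(4)])
print(backoff_delay(2, retry_after=7))
```

Capping the delay keeps a long run of failures from producing waits of several minutes, and a nonzero jitter avoids many clients retrying in lockstep.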
Browser Sessions
Launch a stealth Chrome session and connect Playwright to it over CDP:
session = reader.sessions.create()
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(session.ws_endpoint)
    page = browser.contexts[0].new_page()
    page.goto("https://example.com")
    print(page.title())
    browser.close()

reader.sessions.stop(session.session_id)
Async:
session = await reader.sessions.create()
# ... use async playwright ...
await reader.sessions.stop(session.session_id)
Methods: reader.sessions.create(), .get(id), .stop(id), .list()
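A session left running after a failed script may keep consuming resources, so it can help to pair create() and stop(id) in a context manager. The helper below is a hypothetical convenience, not something the SDK is known to ship. It relies only on the create() and stop(id) methods and the session_id attribute shown above, and it is demonstrated against a stand-in object.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def managed_session(sessions):
    """Create a session and stop it on exit, even if the body raises.

    `sessions` is any object with the create() and stop(id) methods
    listed above, e.g. reader.sessions. Hypothetical helper, not part
    of the SDK.
    """
    session = sessions.create()
    try:
        yield session
    finally:
        sessions.stop(session.session_id)


class FakeSessions:
    """Stand-in for reader.sessions so the sketch runs offline."""

    def __init__(self):
        self.stopped = []

    def create(self):
        return SimpleNamespace(
            session_id="sess_demo", ws_endpoint="ws://localhost/demo"
        )

    def stop(self, session_id):
        self.stopped.append(session_id)


fake = FakeSessions()
try:
    with managed_session(fake) as session:
        raise RuntimeError("simulated Playwright failure")
except RuntimeError:
    pass
print(fake.stopped)
```

With the real client the call would read `with managed_session(reader.sessions) as session:`, with the Playwright code inside the block.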
Errors
from reader_py import (
    ReaderApiError,
    InsufficientCreditsError,
    RateLimitedError,
    UrlBlockedError,
)

try:
    reader.read(url=url)
except InsufficientCreditsError as err:
    print(f"Need {err.required}, have {err.available}")
except RateLimitedError as err:
    print(f"Retry after {err.retry_after_seconds}s")
except UrlBlockedError as err:
    print(f"Blocked: {err.reason}")
except ReaderApiError as err:
    print(f"[{err.code}] {err} (see {err.docs_url})")
ReaderError is re-exported as an alias for ReaderApiError so code written against the 0.1 SDK continues to work. New code should use ReaderApiError.
Full catalog of error codes: https://reader.dev/docs/home/concepts/errors
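When an error needs to go into a log line or a support ticket, the code, request_id, and docs_url attributes mentioned in this README can be combined into one summary string. The sketch below uses a stand-in exception class instead of importing the SDK. The describe_error helper and the sample code value "url_blocked" are hypothetical.

```python
class FakeApiError(Exception):
    """Stand-in carrying the attributes this README describes on ReaderApiError."""

    def __init__(self, message, code, request_id, docs_url):
        super().__init__(message)
        self.code = code
        self.request_id = request_id
        self.docs_url = docs_url


def describe_error(err: Exception) -> str:
    """Build a one-line summary suitable for logs or a support ticket."""
    # getattr with a default keeps this safe for unrelated exceptions.
    code = getattr(err, "code", "unknown")
    request_id = getattr(err, "request_id", None) or "n/a"
    docs_url = getattr(err, "docs_url", None)
    line = f"[{code}] {err} (request_id={request_id})"
    if docs_url:
        line += f" see {docs_url}"
    return line


demo = FakeApiError(
    "URL is blocked",
    code="url_blocked",  # hypothetical code value, for illustration only
    request_id="req_123",
    docs_url="https://reader.dev/docs/home/concepts/errors",
)
print(describe_error(demo))
print(describe_error(ValueError("not an API error")))
```

Reading the attributes defensively lets the same helper sit in a generic except block that may also see non-API exceptions.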
Links
- Docs: https://reader.dev/docs
- SDK reference: https://reader.dev/docs/sdk/python
- API reference: https://reader.dev/docs/api-reference/read
- Discord: https://discord.gg/6tjkq7J5WV
Development
python -m venv .venv && source .venv/bin/activate
pip install -e .[dev]
pytest
Download files
Source Distribution
Built Distribution
File details
Details for the file reader_py-0.2.0.tar.gz.
File metadata
- Download URL: reader_py-0.2.0.tar.gz
- Upload date:
- Size: 14.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f91e4faf13968e0ffc15b39f6755789da09fe1f129484a581f14f74378a49ec |
| MD5 | e1e12d7da5d59d31394b5133e4298119 |
| BLAKE2b-256 | a56970f40df8df6b67f9fc25c02448a396514fb8df65add2f309ffb18d6ee377 |
File details
Details for the file reader_py-0.2.0-py3-none-any.whl.
File metadata
- Download URL: reader_py-0.2.0-py3-none-any.whl
- Upload date:
- Size: 14.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 167b13795d11b40964900f902146bb54d16ef005fde01611be436419fca03663 |
| MD5 | 061312a5b94e67595185a794380a082b |
| BLAKE2b-256 | 53cf3d022864b07702f8d60bcb3cc29faf9407aeeba2194e5c7ff90dd89d6f4a |