Typed async client for the SCP Foundation Wiki's Crom GraphQL API.

thaumiel

thaumiel mascot (SCP-3000, Anantashesha) generated by Google's Nano Banana 2

A typed, ergonomic, read-only async Python client for the SCP Foundation Wiki's Crom GraphQL API.

thaumiel wraps Crom's GraphQL endpoint in a small, fully-typed surface: fetch pages, filter and sort them with a Python DSL, page through results without touching cursors, and budget your rate-limit quota — all async, all type-checked.

Features

  • Fully typed: Frozen Pydantic v2 models (Page, Author, Attribution, ...), checked under pyright strict.
  • Ergonomic filter DSL: Build server-side filters with Python operators: (F.rating >= 100) & (F.tag == "scp"). Illegal filters raise at build time, not at the server.
  • Automatic pagination: pages() is an async iterator that follows Crom's cursors for you; fetch_page_batch() exposes them when you want manual control.
  • Costly-field provenance: Opt into expensive fields per call, and tell "not requested" apart from "server returned null" via page.requested(...).
  • Quota estimation: estimate_* predicts a call's point cost before you spend it.
  • Typed errors and optional retry: A ThaumielError hierarchy plus a configurable RetryPolicy with exponential backoff.

Installation

Requires Python 3.14+.

pip install thaumiel

Quickstart

import asyncio

from thaumiel import AsyncClient


async def main() -> None:
    async with AsyncClient() as client:
        # Crom stores SCP wiki URLs with the http:// scheme.
        page = await client.page("http://scp-wiki.wikidot.com/scp-173")
        if page is None:
            return

        print(page.title, page.rating)
        print(page.tags[:3])


asyncio.run(main())

Output:

SCP-173 10752.0
('autonomous', 'ectoentropic', 'euclid')

page() returns None (not an exception) when nothing matches, and takes either a url or a wikidot_id.

Filtering, sorting, and listing

pages() streams every match, following pagination automatically. Combine F accessors into a filter and pass a Sort:

import asyncio

from thaumiel import AsyncClient, F, Sort, SortKey


async def main() -> None:
    # Highest-rated SCP articles on the English wiki.
    query = F.url.starts_with("http://scp-wiki.wikidot.com") & (F.tag == "scp")
    async with AsyncClient() as client:
        shown = 0
        async for page in client.pages(
            filter=query, sort=Sort.by(SortKey.RATING), page_size=5
        ):
            print(f"{page.rating:>6.0f}  {page.title}")
            shown += 1
            if shown == 5:
                break


asyncio.run(main())

Output:

 10752  SCP-173
  7145  ●●|●●●●●|●●|●
  5544  SCP-049
  5240  SCP-____-J
  4790  SCP-096

Count matches without fetching them:

await client.count_pages(F.tag == "scp")   # -> 69916

Need the cursor yourself (checkpointing, UI paging)? fetch_page_batch() returns one PageBatch with .pages, .end_cursor, and .has_next_page.

The filter DSL

Each F accessor exposes only the operators its field supports; an unsupported operator or a wrong-typed value raises InvalidPredicateError immediately.

| Accessor | Field type | Operators |
| --- | --- | --- |
| F.url | prefix string | == != .starts_with() |
| F.title | string (case-insensitive) | == != .eq_lower() .neq_lower() .starts_with() .starts_with_lower() |
| F.author | string (case-insensitive) | same as F.title; matches an attribution's display name |
| F.category | string | == != |
| F.rating | int | == != < <= > >= |
| F.created_at | datetime | == != < <= > >= |
| F.is_hidden, F.is_user_page | bool | == != |
| F.tag | tag set | == (has) != (lacks) .all_of() .any_of() .none_of() |

Combine predicates with & (and), | (or), and ~ (not).
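To make the "raises at build time" idea concrete, here is a toy version of such a DSL built on operator overloading. It is a sketch only: Pred, IntField, and InvalidPredicate are hypothetical names, and thaumiel's real classes and checks may look quite different.

```python
from dataclasses import dataclass


class InvalidPredicate(TypeError):
    """Hypothetical error raised while the filter is being built."""


@dataclass(frozen=True)
class Pred:
    """One node of a predicate tree: an operator name plus its operands."""

    op: str
    args: tuple[object, ...]

    def __and__(self, other: "Pred") -> "Pred":
        return Pred("and", (self, other))

    def __or__(self, other: "Pred") -> "Pred":
        return Pred("or", (self, other))

    def __invert__(self) -> "Pred":
        return Pred("not", (self,))


class IntField:
    """Accessor for an integer field; it only defines the operators it supports."""

    def __init__(self, name: str) -> None:
        self.name = name

    def _compare(self, op: str, value: object) -> Pred:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise InvalidPredicate(f"{self.name} expects an int, got {value!r}")
        return Pred(op, (self.name, value))

    def __ge__(self, value: object) -> Pred:
        return self._compare("gte", value)

    def __lt__(self, value: object) -> Pred:
        return self._compare("lt", value)


rating = IntField("rating")
combined = (rating >= 100) & ~(rating < 0)
print(combined.op)
```

Because the type check runs inside the overloaded operator, a mistake such as rating >= "high" fails on the line that builds the filter, before any request is sent.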

Warning: Python gives & | ~ higher precedence than comparisons such as == and >=, so the combinators bind tighter than the comparisons around them. Parenthesize every comparison:

(F.rating >= 100) & (F.tag == "scp")   # correct
F.rating >= 100 & F.tag == "scp"       # WRONG: parsed as F.rating >= (100 & F.tag) == "scp"
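The parse can be checked with the standard-library ast module, which needs no thaumiel import. The helper name top_level_node and the bare names rating and tag are illustrative placeholders, not part of the library.

```python
import ast


def top_level_node(expr: str) -> str:
    """Return the name of the outermost AST node Python builds for expr."""
    return type(ast.parse(expr, mode="eval").body).__name__


unparenthesized = 'rating >= 100 & tag == "scp"'
parenthesized = '(rating >= 100) & (tag == "scp")'

print(top_level_node(unparenthesized))  # Compare
print(top_level_node(parenthesized))    # BinOp
```

The unparenthesized form is a single chained comparison whose middle operand is 100 & tag, while the parenthesized form is an & applied to two separate comparisons.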

A predicate lowers to Crom's GraphQL input only when a request is issued, but you can inspect it:

(F.rating >= 100).compile().model_dump(by_alias=True, exclude_unset=True)
# {'onWikidotPage': {'rating': {'gte': 100}}}

Costly fields and quota

Some fields cost extra rate-limit points and are opt-in per call. A field you don't request stays None; some can be None even when requested, so page.requested(...) disambiguates.

from thaumiel import CostlyField

page = await client.page(
    "http://scp-wiki.wikidot.com/scp-173",
    source=True,
    attributions=True,
)

print(len(page.source))                       # 1680
print(page.summary)                           # None
print(page.requested(CostlyField.SUMMARY))    # False  — we never asked for it
credit = page.attributions[0]
print(credit.type.value, credit.user_display_name)   # AUTHOR Moto42

Crom meters usage in points (reported via the x-ratelimit-remaining header; the ceiling is 300000). Estimate before you spend — costly fields in pages() are billed per page:

from thaumiel import estimate_count, estimate_page, estimate_pages

estimate_page(source=True, attributions=True)   # 4
estimate_count()                                # 2
estimate_pages(page_size=100, source=True)      # 200
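A rough budgeting calculation follows from these numbers. The sketch assumes that the 200-point estimate covers one 100-page batch, which the text above suggests but does not state outright; affordable_batches is a hypothetical helper, not a thaumiel function.

```python
CEILING = 300_000  # point ceiling quoted above


def affordable_batches(remaining_points: int, cost_per_batch: int) -> int:
    """Whole batches that fit in the remaining point budget."""
    if cost_per_batch <= 0:
        raise ValueError("cost_per_batch must be positive")
    return remaining_points // cost_per_batch


batches = affordable_batches(CEILING, 200)
print(batches, batches * 100)  # 1500 150000
```

Under that assumption, a full quota covers 1500 batches, or 150000 pages with source included. In practice the starting figure would be the current x-ratelimit-remaining value rather than the ceiling.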

Errors and retries

Every error subclasses ThaumielError:

from thaumiel import GraphQLError, RateLimitError, TransportError

try:
    page = await client.page(url)
except RateLimitError as exc:      # HTTP 429 (a subclass of TransportError)
    ...
except TransportError as exc:      # other HTTP/network failure; .status_code, .cause
    ...
except GraphQLError as exc:        # query-level errors; .errors
    ...
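Because RateLimitError subclasses TransportError, the order of the except clauses above matters: Python runs the first clause that matches. The effect can be seen with stand-in classes (BaseErr, Transport, and RateLimit are hypothetical and not imported from thaumiel):

```python
class BaseErr(Exception):
    """Stand-in for the root of the hierarchy."""


class Transport(BaseErr):
    """Stand-in for an HTTP or network failure."""


class RateLimit(Transport):
    """Stand-in for HTTP 429, a more specific transport failure."""


def classify(exc: Exception) -> str:
    """Report which except clause handles exc when the subclass is listed first."""
    try:
        raise exc
    except RateLimit:
        return "rate-limit"
    except Transport:
        return "transport"
    except BaseErr:
        return "other"


print(classify(RateLimit()))  # rate-limit
print(classify(Transport()))  # transport
```

If the Transport clause came first, it would also catch RateLimit and the more specific handler would never run.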

Every call is read-only and idempotent, so retrying is safe. RetryPolicy backs off exponentially on rate limits (and optionally on 5xx):

from thaumiel import AsyncClient, RetryPolicy

policy = RetryPolicy(max_attempts=4, backoff=0.5)
async with AsyncClient() as client:
    # Pass a factory, not a coroutine: a retry needs a fresh awaitable.
    page = await policy.run(lambda: client.page("http://scp-wiki.wikidot.com/scp-173"))
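The factory requirement is easier to see in a stripped-down retry loop. The following is a sketch, not RetryPolicy's implementation: retry_with_backoff, TransientError, and delays are hypothetical, and the doubling schedule (backoff * 2**attempt, with no jitter or cap) is an assumption about what "exponential" means here.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as HTTP 429."""


async def retry_with_backoff(
    factory: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 4,
    backoff: float = 0.5,
) -> T:
    """Call factory() until it succeeds, sleeping backoff * 2**attempt between tries."""
    for attempt in range(max_attempts):
        try:
            # A coroutine object can be awaited only once, so each attempt
            # builds a fresh one by calling the factory again.
            return await factory()
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            await asyncio.sleep(backoff * 2**attempt)
    raise ValueError("max_attempts must be at least 1")


def delays(max_attempts: int, backoff: float) -> list[float]:
    """Sleep durations between attempts under the assumed doubling schedule."""
    return [backoff * 2**n for n in range(max_attempts - 1)]


print(delays(4, 0.5))  # [0.5, 1.0, 2.0]
```

With max_attempts=4 and backoff=0.5, this schedule waits at most 3.5 seconds in total before the final failure is re-raised.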

Configuration

from thaumiel import AsyncClient

client = AsyncClient(
    user_agent="my-app/1.0 (me@example.com)",   # good Crom etiquette
    timeout=30.0,
)

For full control — connection limits, event hooks, observing quota headers — inject your own httpx.AsyncClient. thaumiel will not close a client it did not create:

import httpx
from thaumiel import AsyncClient

http = httpx.AsyncClient(headers={"User-Agent": "my-app/1.0"})
client = AsyncClient(http_client=http)
# ... use client ...
await http.aclose()   # you own it; you close it

More end-to-end scripts live in examples/.

Limitations

  • Read-only: thaumiel offers no writes.
  • Async only: There is no synchronous client.
  • Wikidot pages only: pages() skips non-Wikidot nodes (e.g. RuFoundation), so it can yield fewer rows than count_pages reports for the same filter.
  • Curated filter surface: Only the fields in the table above are filterable, and some support equality only.
  • Quota-bound: Requests cost points against Crom's quota; budget with estimate_*.
  • Alpha: While on 0.x, the public API may change before 1.0.

Development

See .github/CONTRIBUTING.md.

License

MIT — see LICENSE.

Download files

Source Distribution

thaumiel-0.1.0.tar.gz (42.6 kB view details)

Uploaded Source

Built Distribution

thaumiel-0.1.0-py3-none-any.whl (31.1 kB view details)

Uploaded Python 3

File details

Details for the file thaumiel-0.1.0.tar.gz.

File metadata

  • Download URL: thaumiel-0.1.0.tar.gz
  • Upload date:
  • Size: 42.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for thaumiel-0.1.0.tar.gz
Algorithm Hash digest
SHA256 10dfb9de96b67fda66181d74d2829118419ac4d2e072af70e6eea0a4a0d4f8e8
MD5 2ff2c97e0229adc011c4e34a8574661f
BLAKE2b-256 847c3ca7cb71b31cd0f73b33f4e9e45b989ce497a9da72054c38381071039a5f

Provenance

The following attestation bundles were made for thaumiel-0.1.0.tar.gz:

Publisher: release.yml on ozefe/thaumiel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file thaumiel-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: thaumiel-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 31.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for thaumiel-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dc25173acffd3ee319793232759c16c42816a4dad603760424141803936bcc17
MD5 a531a55cba0ecbd187102f94a63009a0
BLAKE2b-256 25f1baea31c6fbe651196a85f226c660ab0559d1855ff7b8ba965929a69a3920

Provenance

The following attestation bundles were made for thaumiel-0.1.0-py3-none-any.whl:

Publisher: release.yml on ozefe/thaumiel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
