
Official Python SDK for the Crawlora web-scraping API: typed grouped and dynamic operation calls for every public endpoint, with retries, pagination, hooks, and an async client.


Crawlora Python SDK

Python client for the public Crawlora API. Use it to call Crawlora scraping, search, social, marketplace, media, maps, finance, brand, and usage endpoints with generated type stubs for editor and type-checker support.

  • Runtime: Python 3.10+
  • Auth: x-api-key
  • Default API base URL: https://api.crawlora.net/api/v1
  • Reference: operations and recipes

Install

Published on PyPI. The current release is a prerelease (1.6.0.dev2), so install it with --pre:

pip install --pre crawlora

(Git installs from beta tags or the moving latest tag also work, e.g. pip install "git+https://github.com/Crawlora-org/crawlora-python-sdk.git@latest".)

API Key

Create or sign in to your Crawlora account at crawlora.net, then create an API key in the dashboard.

read -r CRAWLORA_API_KEY
export CRAWLORA_API_KEY

First Request

import os
from crawlora import CrawloraClient

crawlora = CrawloraClient(api_key=os.environ["CRAWLORA_API_KEY"])

response = crawlora.bing.search(
    q="coffee shops",
    count=10,
)

print(response["data"]["results"][0])

Endpoint groups are generated from the public API contract, so common calls are available as methods such as crawlora.bing.search(...), crawlora.youtube.transcript(...), and crawlora.google.map_search(...).

Typed Dynamic Calls

You can also call by operation id. Literal operation ids are covered by the generated .pyi stubs, so type checkers can infer the matching parameter and response aliases:

response = crawlora.request("bing-search", {
    "q": "coffee shops",
    "count": 10,
})

Generated stubs include operation ids, endpoint groups, keyword parameters, enum values, response aliases, and reserved request options.

Configuration

crawlora = CrawloraClient(
    api_key=os.environ["CRAWLORA_API_KEY"],
    base_url="https://api.crawlora.net/api/v1",
    timeout=30,
    retries=2,
    retry_delay=0.25,
    headers={"x-client": "my-app"},
)

Per-request options are available through reserved keyword arguments. Header names are matched case-insensitively, so request headers can override default auth, user-agent, and content headers without duplicating variants such as x-api-key and X-API-KEY:

response = crawlora.bing.search(
    q="coffee shops",
    _timeout=10,
    _headers={"x-request-id": "search-001"},
)
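
As a rough illustration of the case-insensitive matching described above, the sketch below merges default and per-request headers by lowercased name. The merge_headers helper is hypothetical and is not part of the SDK; it only shows one way such an override could behave.

```python
def merge_headers(defaults: dict[str, str], overrides: dict[str, str]) -> dict[str, str]:
    """Merge two header mappings, treating header names case-insensitively.

    Hypothetical helper for illustration; not the SDK's implementation.
    """
    merged: dict[str, tuple[str, str]] = {}
    # Key each header by its lowercased name; later entries replace earlier ones.
    for source in (defaults, overrides):
        for name, value in source.items():
            merged[name.lower()] = (name, value)
    # Emit the most recently seen spelling of each header name.
    return {name: value for name, value in merged.values()}


defaults = {"x-api-key": "default-key", "User-Agent": "sdk"}
overrides = {"X-API-KEY": "request-key", "x-request-id": "search-001"}
result = merge_headers(defaults, overrides)
print(result)
```

Only one API key header survives the merge, which is the property the paragraph above describes.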

Text Responses

Most endpoints return JSON. The reserved _response_type option must be auto, json, or text. Endpoints that support alternate text output, such as YouTube transcripts, can opt into text mode:

transcript = crawlora.youtube.transcript(
    id="VIDEO_ID",
    format="text",
    _response_type="text",
)

print(transcript)

Errors

Failed API calls raise CrawloraError:

from crawlora import CrawloraError

try:
    crawlora.bing.search(q="coffee shops")
except CrawloraError as error:
    print(error.status, error.code, error.body)
    raise

The error includes status, optional API code, parsed body, raw_body, response headers, and the underlying parser or transport exception as __cause__ when available. Retryable responses honor positive Retry-After headers, capped at 30 seconds. Timeout-like transport failures use the SDK message "Crawlora request timed out".

CrawloraError has three subclasses for branching on the failure kind: CrawloraClientError (4xx, request rejected), CrawloraServerError (5xx), and CrawloraNetworkError (transport failure or timeout before a response).

Async

AsyncCrawloraClient mirrors the synchronous client for asyncio applications:

import asyncio
from crawlora import AsyncCrawloraClient

async def main():
    crawlora = AsyncCrawloraClient(api_key="YOUR_API_KEY")
    return await crawlora.bing.search(q="coffee shops")

result = asyncio.run(main())

It reuses the same validation, retries, and Retry-After handling, running each request in a worker thread so the package stays dependency-free.
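
The worker-thread approach can be illustrated with the standard library alone. This is a sketch, not the SDK's code: blocking_fetch is a stand-in for a synchronous request, and the use of asyncio.to_thread is an assumption about how such a wrapper might be built.

```python
import asyncio
import time


def blocking_fetch(query: str) -> dict:
    """Stand-in for a synchronous HTTP call."""
    time.sleep(0.05)
    return {"query": query, "ok": True}


async def async_fetch(query: str) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_fetch, query)


async def main() -> list[dict]:
    # The two calls overlap because each one runs in its own thread.
    return await asyncio.gather(async_fetch("coffee"), async_fetch("tea"))


results = asyncio.run(main())
print(results)
```

Because the blocking work happens off the event loop, no third-party async HTTP library is needed, which matches the dependency-free goal stated above.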

Pagination

CrawloraClient.paginate yields successive pages, advancing the page or offset query parameter and stopping when a page returns no data:

for page in crawlora.paginate("ebay-seller-feedback", {"seller": "acme"}):
    for review in page["data"]:
        print(review)

AsyncCrawloraClient.paginate is the async for equivalent. Override detection with page_param, start, step, and max_pages.

Examples

Runnable examples live under examples/ and skip cleanly when required environment variables are missing:

python3 examples/bing_search.py
python3 examples/youtube_transcript.py

Set CRAWLORA_BASE_URL to point examples at a staging or local API.

Package Notes

The import name is crawlora:

from crawlora import CrawloraClient

The package is published on PyPI as crawlora. Because the current release is a prerelease, install it with pip install --pre crawlora. Git beta tags and the moving latest tag also work, as shown above.

