Crawl4AI Cloud SDK for Python

Lightweight Python SDK for Crawl4AI Cloud. It mirrors the OSS API, so existing Crawl4AI code carries over with only an import change and an API key.

Note: This SDK is for Crawl4AI Cloud (api.crawl4ai.com), the managed cloud service. For the self-hosted open-source version, see github.com/unclecode/crawl4ai.

Installation

pip install crawl4ai-cloud-sdk

Get Your API Key

  1. Go to api.crawl4ai.com
  2. Sign up and get your API key

Quick Start

import asyncio
from crawl4ai_cloud import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(api_key="sk_live_...") as crawler:
        result = await crawler.run("https://example.com")
        print(result.markdown.raw_markdown)

asyncio.run(main())

Features

Single URL Crawl

result = await crawler.run("https://example.com")
print(result.success)
print(result.markdown.raw_markdown)
print(result.html)

Batch Crawl

urls = ["https://example.com", "https://httpbin.org/html"]

# Wait for results
results = await crawler.run_many(urls, wait=True)
for r in results:
    print(f"{r.url}: {r.success}")

# Fire and forget (returns job)
job = await crawler.run_many(urls, wait=False)
print(f"Job ID: {job.id}")
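
When a batch returns, it is often useful to separate the URLs that succeeded from those that need another attempt. The sketch below relies only on the `url` and `success` attributes shown above; `split_by_success` and the `SimpleNamespace` stand-in objects are hypothetical and exist only so the example runs without an API key.

```python
from types import SimpleNamespace


def split_by_success(results):
    """Partition crawl results into (succeeded, failed) lists."""
    succeeded = [r for r in results if r.success]
    failed = [r for r in results if not r.success]
    return succeeded, failed


# Stand-in objects exposing the same two attributes used above (url, success).
demo_results = [
    SimpleNamespace(url="https://example.com", success=True),
    SimpleNamespace(url="https://httpbin.org/html", success=False),
]

ok, bad = split_by_success(demo_results)
print("succeeded:", [r.url for r in ok])
print("failed:", [r.url for r in bad])
```

With the real SDK, the list returned by `crawler.run_many(urls, wait=True)` would be passed in place of `demo_results`.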

Configuration

from crawl4ai_cloud import CrawlerRunConfig, BrowserConfig

config = CrawlerRunConfig(
    word_count_threshold=10,
    exclude_external_links=True,
    screenshot=True,
)

browser_config = BrowserConfig(
    viewport_width=1920,
    viewport_height=1080,
)

result = await crawler.run(
    "https://example.com",
    config=config,
    browser_config=browser_config,
)

Proxy Support

# Shorthand
result = await crawler.run(url, proxy="datacenter")
result = await crawler.run(url, proxy="residential")

# Full config
result = await crawler.run(url, proxy={
    "mode": "residential",
    "country": "US"
})

Deep Crawl

result = await crawler.deep_crawl(
    "https://docs.example.com",
    strategy="bfs",
    max_depth=2,
    max_urls=50,
    wait=True,
)
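
To make the `strategy="bfs"`, `max_depth`, and `max_urls` parameters concrete, here is an illustrative breadth-first traversal over an in-memory link graph. It is not the service's implementation: `bfs_crawl_order` and `demo_links` are hypothetical, and whether the service counts the start URL toward `max_urls` is an assumption made here for simplicity.

```python
from collections import deque


def bfs_crawl_order(start, links, max_depth, max_urls):
    """Return URLs in breadth-first order, bounded by depth and total count."""
    visited = [start]          # discovery order; the start URL counts toward max_urls
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < max_urls:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue           # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            visited.append(nxt)
            queue.append((nxt, depth + 1))
            if len(visited) >= max_urls:
                break
    return visited


# Hypothetical site structure: page -> pages it links to.
demo_links = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": ["/a/1/x"],
}

print(bfs_crawl_order("/", demo_links, max_depth=2, max_urls=50))
print(bfs_crawl_order("/", demo_links, max_depth=2, max_urls=4))
```

Pages are visited level by level, so a small `max_urls` favors pages close to the start URL over deeply nested ones.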

Job Management

# List jobs
jobs = await crawler.list_jobs(status="completed", limit=10)

# Get job status
job = await crawler.get_job(job_id)

# Wait for job
job = await crawler.wait_job(job_id, poll_interval=2.0)

# Cancel job
await crawler.cancel_job(job_id)
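
If you need a timeout or custom logic between polls, a loop built on `get_job` can stand in for `wait_job`. This is a minimal sketch under stated assumptions: the `job.status` attribute and the status names other than `"completed"` are guesses, not confirmed by this README, and `FakeCrawler` is a hypothetical stub used so the example runs offline.

```python
import asyncio
from types import SimpleNamespace

# Assumed terminal status names; only "completed" appears in this README.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


async def poll_until_done(crawler, job_id, poll_interval=2.0, timeout=60.0):
    """Poll crawler.get_job() until the job reaches a terminal status or times out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        job = await crawler.get_job(job_id)
        if job.status in TERMINAL_STATUSES:   # job.status is an assumed attribute
            return job
        if loop.time() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        await asyncio.sleep(poll_interval)


class FakeCrawler:
    """Hypothetical stub: reports 'running' twice, then 'completed'."""

    def __init__(self):
        self.calls = 0

    async def get_job(self, job_id):
        self.calls += 1
        status = "completed" if self.calls >= 3 else "running"
        return SimpleNamespace(id=job_id, status=status)


fake = FakeCrawler()
final_job = asyncio.run(poll_until_done(fake, "job_123", poll_interval=0.01))
print(final_job.status, "after", fake.calls, "polls")
```

For the common case, `crawler.wait_job(job_id, poll_interval=2.0)` as shown above is the simpler choice.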

Migration from OSS

Existing OSS code carries over; only the import and the API key change:

# Before (OSS)
from crawl4ai import AsyncWebCrawler
async with AsyncWebCrawler() as crawler:
    result = await crawler.arun(url)

# After (Cloud)
from crawl4ai_cloud import AsyncWebCrawler
async with AsyncWebCrawler(api_key="sk_...") as crawler:
    result = await crawler.run(url)  # arun() also works!

Environment Variables

# Shell: export the key once
export CRAWL4AI_API_KEY=sk_live_...

# Python: the API key is loaded from the environment automatically
crawler = AsyncWebCrawler()
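
If you prefer a clear failure at startup when the variable is missing, you can check for it before constructing the crawler. The helper below is a hypothetical sketch that uses only the standard library; `require_api_key` is not part of the SDK, and the demo uses a separate variable name so it does not touch a real key.

```python
import os


def require_api_key(env_var="CRAWL4AI_API_KEY"):
    """Return the API key from the environment, or raise a descriptive error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before creating the crawler")
    return key


# Demo with a throwaway variable name and a placeholder value.
os.environ["DEMO_CRAWL4AI_API_KEY"] = "sk_live_placeholder"
demo_key = require_api_key("DEMO_CRAWL4AI_API_KEY")
print("key loaded, length:", len(demo_key))
```

The returned value could then be passed explicitly as `AsyncWebCrawler(api_key=...)`, as in the Quick Start.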

Error Handling

from crawl4ai_cloud import (
    CloudError,
    AuthenticationError,
    RateLimitError,
    QuotaExceededError,
    NotFoundError,
)

try:
    result = await crawler.run(url)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except QuotaExceededError:
    print("Quota exceeded")
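
A common follow-up to catching `RateLimitError` is to wait for `retry_after` seconds and try again. The sketch below shows one way to do that. To stay runnable without the SDK it defines a local stand-in `RateLimitError` carrying the same `retry_after` attribute; `run_with_retry` and `flaky_fetch` are hypothetical names.

```python
import asyncio


class RateLimitError(Exception):
    """Local stand-in for crawl4ai_cloud.RateLimitError (demo only)."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


async def run_with_retry(fetch, url, max_attempts=3):
    """Call fetch(url), sleeping for e.retry_after between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await fetch(url)
        except RateLimitError as e:
            if attempt == max_attempts:
                raise                      # out of attempts: surface the error
            await asyncio.sleep(e.retry_after)


attempts = []


async def flaky_fetch(url):
    """Hypothetical fetcher: rate limited twice, then succeeds."""
    attempts.append(url)
    if len(attempts) < 3:
        raise RateLimitError(retry_after=0.01)
    return f"fetched {url}"


outcome = asyncio.run(run_with_retry(flaky_fetch, "https://example.com"))
print(outcome, "after", len(attempts), "attempts")
```

With the real SDK, you would import `RateLimitError` from `crawl4ai_cloud` instead of defining it, and pass `crawler.run` as `fetch`.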

License

Apache 2.0
