Skip to main content

Lightweight cloud SDK for Crawl4AI - mirrors the OSS API

Project description

Crawl4AI Cloud SDK for Python

Lightweight Python SDK for Crawl4AI Cloud. Mirrors the OSS API exactly.

Note: This SDK is for Crawl4AI Cloud (api.crawl4ai.com), the managed cloud service. For the self-hosted open-source version, see github.com/unclecode/crawl4ai.

PyPI version Python Version

Claude Code Plugin

Use Crawl4AI directly inside Claude Code with 9 built-in tools — no Python needed. See the plugin README for details.

/plugin marketplace add unclecode/crawl4ai-cloud-sdk
/plugin install crawl4ai@crawl4ai-claude-plugins

Installation

pip install crawl4ai-cloud-sdk

Get Your API Key

  1. Go to api.crawl4ai.com
  2. Sign up and get your API key

Quick Start

import asyncio
from crawl4ai_cloud import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(api_key="sk_live_...") as crawler:
        result = await crawler.run("https://example.com")
        print(result.markdown.raw_markdown)

asyncio.run(main())

Features

Single URL Crawl

result = await crawler.run("https://example.com")
print(result.success)
print(result.markdown.raw_markdown)
print(result.html)

Batch Crawl

urls = ["https://example.com", "https://httpbin.org/html"]

# Wait for results
results = await crawler.run_many(urls, wait=True)
for r in results:
    print(f"{r.url}: {r.success}")

# Fire and forget (returns job)
job = await crawler.run_many(urls, wait=False)
print(f"Job ID: {job.id}")

Configuration

from crawl4ai_cloud import CrawlerRunConfig, BrowserConfig

config = CrawlerRunConfig(
    word_count_threshold=10,
    exclude_external_links=True,
    screenshot=True,
)

browser_config = BrowserConfig(
    viewport_width=1920,
    viewport_height=1080,
)

result = await crawler.run(
    "https://example.com",
    config=config,
    browser_config=browser_config,
)

Proxy Support

# Shorthand
result = await crawler.run(url, proxy="datacenter")
result = await crawler.run(url, proxy="residential")

# Full config
result = await crawler.run(url, proxy={
    "mode": "residential",
    "country": "US"
})

Deep Crawl

result = await crawler.deep_crawl(
    "https://docs.example.com",
    strategy="bfs",
    max_depth=2,
    max_urls=50,
    wait=True,
)

Job Management

# List jobs
jobs = await crawler.list_jobs(status="completed", limit=10)

# Get job status
job = await crawler.get_job(job_id)

# Wait for job
job = await crawler.wait_job(job_id, poll_interval=2.0)

# Cancel job
await crawler.cancel_job(job_id)

Migration from OSS

Zero learning curve — your existing code works:

# Before (OSS)
from crawl4ai import AsyncWebCrawler
async with AsyncWebCrawler() as crawler:
    result = await crawler.arun(url)

# After (Cloud)
from crawl4ai_cloud import AsyncWebCrawler
async with AsyncWebCrawler(api_key="sk_...") as crawler:
    result = await crawler.run(url)  # arun() also works!

Environment Variables

export CRAWL4AI_API_KEY=sk_live_...
# API key auto-loaded from environment
crawler = AsyncWebCrawler()

Error Handling

from crawl4ai_cloud import (
    CloudError,
    AuthenticationError,
    RateLimitError,
    QuotaExceededError,
    NotFoundError,
)

try:
    result = await crawler.run(url)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except QuotaExceededError:
    print("Quota exceeded")

Links

License

Apache 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

crawl4ai_cloud_sdk-0.2.7.tar.gz (40.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

crawl4ai_cloud_sdk-0.2.7-py3-none-any.whl (37.0 kB view details)

Uploaded Python 3

File details

Details for the file crawl4ai_cloud_sdk-0.2.7.tar.gz.

File metadata

  • Download URL: crawl4ai_cloud_sdk-0.2.7.tar.gz
  • Upload date:
  • Size: 40.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for crawl4ai_cloud_sdk-0.2.7.tar.gz
Algorithm Hash digest
SHA256 afc2410ce340617790acb0dc7425e90cde19cb86bf3ff7c6541b0248ad9033fa
MD5 93b8b549f8d94cd26807cf8c49b95e3c
BLAKE2b-256 99e8683a31cca72033b7339db2e824d189953563b8f658c864c6066b41d5f935

See more details on using hashes here.

File details

Details for the file crawl4ai_cloud_sdk-0.2.7-py3-none-any.whl.

File metadata

File hashes

Hashes for crawl4ai_cloud_sdk-0.2.7-py3-none-any.whl
Algorithm Hash digest
SHA256 fa8ffa81278fbbcef44219d2e6b5b68327c724376bcb29185cd56abbca89a6c1
MD5 095a9ae21f957d2cf4a9634b8c475152
BLAKE2b-256 a747604b1bbc854db038f882e0ddb3da6ed8bb82694b32857f1e33fa508634b5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page