
Crawl4AI Cloud SDK for Python

Lightweight Python SDK for Crawl4AI Cloud. Mirrors the OSS API exactly.

Note: This SDK is for Crawl4AI Cloud (api.crawl4ai.com), the managed cloud service. For the self-hosted open-source version, see github.com/unclecode/crawl4ai.


Claude Code Plugin

Use Crawl4AI directly inside Claude Code with 9 built-in tools — no Python needed. See the plugin README for details.

/plugin marketplace add unclecode/crawl4ai-cloud-sdk
/plugin install crawl4ai@crawl4ai-claude-plugins

Installation

pip install crawl4ai-cloud-sdk

Get Your API Key

  1. Go to api.crawl4ai.com
  2. Sign up and get your API key

Quick Start

import asyncio
from crawl4ai_cloud import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(api_key="sk_live_...") as crawler:
        result = await crawler.run("https://example.com")
        print(result.markdown.raw_markdown)

asyncio.run(main())

Features

Single URL Crawl

result = await crawler.run("https://example.com")
print(result.success)
print(result.markdown.raw_markdown)
print(result.html)

Batch Crawl

urls = ["https://example.com", "https://httpbin.org/html"]

# Wait for results
results = await crawler.run_many(urls, wait=True)
for r in results:
    print(f"{r.url}: {r.success}")

# Fire and forget (returns job)
job = await crawler.run_many(urls, wait=False)
print(f"Job ID: {job.id}")
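
run_many takes the whole URL list in one call. If you are submitting thousands of URLs, sending them in fixed-size batches keeps each job small; a minimal stdlib helper (the batch size is your choice, not an SDK limit):

```python
def chunked(urls, size):
    """Yield successive fixed-size batches from a list of URLs."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]
```

Then call await crawler.run_many(batch, wait=True) once per batch.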

Configuration

from crawl4ai_cloud import CrawlerRunConfig, BrowserConfig

config = CrawlerRunConfig(
    word_count_threshold=10,
    exclude_external_links=True,
    screenshot=True,
)

browser_config = BrowserConfig(
    viewport_width=1920,
    viewport_height=1080,
)

result = await crawler.run(
    "https://example.com",
    config=config,
    browser_config=browser_config,
)

Proxy Support

# Shorthand
result = await crawler.run(url, proxy="datacenter")
result = await crawler.run(url, proxy="residential")

# Full config
result = await crawler.run(url, proxy={
    "mode": "residential",
    "country": "US"
})

Deep Crawl

result = await crawler.deep_crawl(
    "https://docs.example.com",
    strategy="bfs",
    max_depth=2,
    max_urls=50,
    wait=True,
)
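
To build intuition for what strategy="bfs" does with max_depth and max_urls, here is a toy breadth-first traversal over an in-memory link graph. This is an illustration only, not SDK code; the dict stands in for the link extraction the crawler performs on a live site:

```python
from collections import deque

def bfs_urls(graph, start, max_depth, max_urls):
    """Visit pages breadth-first from start, honoring depth and URL caps.

    graph maps each URL to the list of URLs it links to.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue and len(order) < max_urls:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand links beyond max_depth
        for nxt in graph.get(url, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            order.append(nxt)
            if len(order) >= max_urls:
                break
            queue.append((nxt, depth + 1))
    return order
```

All pages at depth 1 are visited before any page at depth 2, and the crawl stops as soon as max_urls pages have been collected.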

Job Management

# List jobs
jobs = await crawler.list_jobs(status="completed", limit=10)

# Get job status
job = await crawler.get_job(job_id)

# Wait for job
job = await crawler.wait_job(job_id, poll_interval=2.0)

# Cancel job
await crawler.cancel_job(job_id)
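
wait_job handles polling for you. If you need custom logic such as a hard timeout, the pattern it implements looks roughly like this sketch; note that "failed" and "cancelled" as terminal statuses are assumptions here, since only "completed" appears in this README:

```python
import asyncio

async def poll_until_done(get_job, job_id, poll_interval=2.0, timeout=300.0):
    """Poll get_job(job_id) until the job reaches a terminal status.

    get_job is any async callable returning an object with a .status
    attribute, e.g. crawler.get_job.
    """
    deadline = asyncio.get_running_loop().time() + timeout
    while True:
        job = await get_job(job_id)
        if job.status in ("completed", "failed", "cancelled"):
            return job
        if asyncio.get_running_loop().time() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.status!r} after {timeout}s")
        await asyncio.sleep(poll_interval)
```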

Migration from OSS

Zero learning curve — your existing code works:

# Before (OSS)
from crawl4ai import AsyncWebCrawler
async with AsyncWebCrawler() as crawler:
    result = await crawler.arun(url)

# After (Cloud)
from crawl4ai_cloud import AsyncWebCrawler
async with AsyncWebCrawler(api_key="sk_...") as crawler:
    result = await crawler.run(url)  # arun() also works!

Environment Variables

export CRAWL4AI_API_KEY=sk_live_...
# API key auto-loaded from environment
crawler = AsyncWebCrawler()
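
The SDK reads CRAWL4AI_API_KEY for you. If you want the same explicit-argument-wins behavior in your own wrapper code, the lookup order can be sketched as follows (hypothetical helper, not part of the SDK):

```python
import os

def resolve_api_key(explicit=None):
    """Return an API key: an explicit argument wins over the environment."""
    key = explicit or os.environ.get("CRAWL4AI_API_KEY")
    if not key:
        raise RuntimeError("Set CRAWL4AI_API_KEY or pass api_key=...")
    return key
```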

Error Handling

from crawl4ai_cloud import (
    CloudError,
    AuthenticationError,
    RateLimitError,
    QuotaExceededError,
    NotFoundError,
)

try:
    result = await crawler.run(url)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except QuotaExceededError:
    print("Quota exceeded")
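
retry_after makes rate-limit errors easy to back off from. A generic retry sketch follows; in real code you would catch RateLimitError directly, but here any exception carrying a retry_after attribute is treated as retryable so the example stays self-contained:

```python
import asyncio

async def run_with_rate_limit_retry(fetch, max_attempts=3):
    """Retry a zero-arg async callable, sleeping retry_after seconds between tries.

    In real use, fetch would be something like lambda: crawler.run(url).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await fetch()
        except Exception as exc:
            retry_after = getattr(exc, "retry_after", None)
            if retry_after is None or attempt == max_attempts:
                raise  # not a rate limit, or out of attempts
            await asyncio.sleep(retry_after)
```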

License

Apache 2.0

