# Crawl4AI Cloud SDK for Python

Lightweight Python SDK for Crawl4AI Cloud. Mirrors the OSS API exactly.

> **Note:** This SDK is for Crawl4AI Cloud (api.crawl4ai.com), the managed cloud service. For the self-hosted open-source version, see github.com/unclecode/crawl4ai.
## Installation

```shell
pip install crawl4ai-cloud-sdk
```
## Get Your API Key

1. Go to api.crawl4ai.com
2. Sign up and get your API key
## Quick Start

```python
import asyncio
from crawl4ai_cloud import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(api_key="sk_live_...") as crawler:
        result = await crawler.run("https://example.com")
        print(result.markdown.raw_markdown)

asyncio.run(main())
```
## Features

### Single URL Crawl

```python
result = await crawler.run("https://example.com")
print(result.success)
print(result.markdown.raw_markdown)
print(result.html)
```
### Batch Crawl

```python
urls = ["https://example.com", "https://httpbin.org/html"]

# Wait for results
results = await crawler.run_many(urls, wait=True)
for r in results:
    print(f"{r.url}: {r.success}")

# Fire and forget (returns a job)
job = await crawler.run_many(urls, wait=False)
print(f"Job ID: {job.id}")
```
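For very large URL lists it can help to submit work in fixed-size batches rather than in one giant call. A minimal stdlib helper for that (illustrative; `chunked` is not part of the SDK):

```python
def chunked(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for i in range(0, len(items), size):
        yield items[i:i + size]
```

Usage would look like `for batch in chunked(urls, 100): results += await crawler.run_many(batch, wait=True)`.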
### Configuration

```python
from crawl4ai_cloud import CrawlerRunConfig, BrowserConfig

config = CrawlerRunConfig(
    word_count_threshold=10,
    exclude_external_links=True,
    screenshot=True,
)

browser_config = BrowserConfig(
    viewport_width=1920,
    viewport_height=1080,
)

result = await crawler.run(
    "https://example.com",
    config=config,
    browser_config=browser_config,
)
```
### Proxy Support

```python
# Shorthand
result = await crawler.run(url, proxy="datacenter")
result = await crawler.run(url, proxy="residential")

# Full config
result = await crawler.run(url, proxy={
    "mode": "residential",
    "country": "US",
})
```
### Deep Crawl

```python
result = await crawler.deep_crawl(
    "https://docs.example.com",
    strategy="bfs",
    max_depth=2,
    max_urls=50,
    wait=True,
)
```
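The `bfs` strategy explores pages level by level: every link on the start page is visited before any link discovered on those pages, stopping after `max_depth` hops or `max_urls` pages. Conceptually, this is a toy breadth-first traversal over an in-memory link graph (not the SDK's actual crawler):

```python
from collections import deque

def bfs_urls(start, links, max_depth=2, max_urls=50):
    """Breadth-first traversal of a link graph given as {url: [urls]}."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue and len(order) < max_urls:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # don't follow links beyond max_depth hops
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                if len(order) >= max_urls:
                    break
                queue.append((nxt, depth + 1))
    return order
```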
### Job Management

```python
# List jobs
jobs = await crawler.list_jobs(status="completed", limit=10)

# Get job status
job = await crawler.get_job(job_id)

# Wait for a job to finish
job = await crawler.wait_job(job_id, poll_interval=2.0)

# Cancel a job
await crawler.cancel_job(job_id)
```
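If you need a custom stopping rule or an overall timeout (which the options above don't document), the polling loop is easy to write by hand around `get_job`. A sketch under those assumptions (`poll_until` is illustrative, not SDK API):

```python
import asyncio
import time

async def poll_until(fetch, done, poll_interval=2.0, timeout=300.0):
    """Await fetch() every poll_interval seconds until done(result) is
    true; raise TimeoutError if timeout seconds elapse first."""
    deadline = time.monotonic() + timeout
    while True:
        result = await fetch()
        if done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before timeout")
        await asyncio.sleep(poll_interval)
```

With the SDK this might be called as `job = await poll_until(lambda: crawler.get_job(job_id), lambda j: j.status in ("completed", "failed"), timeout=600)`, where the `status` field and its values are assumptions, not documented API.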
## Migration from OSS

Zero learning curve: your existing code works.

```python
# Before (OSS)
from crawl4ai import AsyncWebCrawler

async with AsyncWebCrawler() as crawler:
    result = await crawler.arun(url)

# After (Cloud)
from crawl4ai_cloud import AsyncWebCrawler

async with AsyncWebCrawler(api_key="sk_...") as crawler:
    result = await crawler.run(url)  # arun() also works!
```
## Environment Variables

```shell
export CRAWL4AI_API_KEY=sk_live_...
```

```python
# API key is auto-loaded from the environment
crawler = AsyncWebCrawler()
```
## Error Handling

```python
from crawl4ai_cloud import (
    CloudError,
    AuthenticationError,
    RateLimitError,
    QuotaExceededError,
    NotFoundError,
)

try:
    result = await crawler.run(url)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except QuotaExceededError:
    print("Quota exceeded")
```
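The `retry_after` hint on `RateLimitError` can drive a simple retry loop. A minimal sketch, written against the stdlib only so it stays generic: the helper names are illustrative, and the SDK's exception type and `crawler.run` would be passed in as `rate_limit_exc` and `run`.

```python
import asyncio

def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Prefer the server-supplied retry_after hint; otherwise use
    capped exponential backoff (base * 2**attempt, at most cap)."""
    if retry_after is not None:
        return min(float(retry_after), cap)
    return min(base * (2 ** attempt), cap)

async def run_with_retry(run, url, rate_limit_exc, max_attempts=4):
    """Call `run(url)`, sleeping and retrying on rate-limit errors;
    re-raise once max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return await run(url)
        except rate_limit_exc as e:
            if attempt == max_attempts - 1:
                raise
            await asyncio.sleep(
                backoff_delay(attempt, getattr(e, "retry_after", None))
            )
```

With the SDK this might be invoked as `await run_with_retry(crawler.run, url, RateLimitError)`.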
## Links
- Cloud Dashboard - Sign up & get your API key
- Cloud API Docs - Full API reference
- OSS Repository - Self-hosted option
- Discord - Community & support
## License
Apache 2.0