# crawl4ai-client

Lightweight async Python client for the Crawl4AI Docker server.

No browser dependencies required. Just `httpx` + `pydantic` (~2 MB vs ~500 MB for the full `crawl4ai` package).
## Install

```bash
pip install crawl4ai-client
```
## Quick Start

```python
import asyncio

from crawl4ai_client import Crawl4aiDockerClient


async def main():
    async with Crawl4aiDockerClient(
        base_url="http://localhost:11235",
        api_token="your-token",  # optional
    ) as client:
        result = await client.crawl(["https://example.com"])
        print(result.raw_markdown)


asyncio.run(main())
```
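If a crawl fails, `raw_markdown` may be empty. The upstream `crawl4ai` library's `CrawlResult` exposes `success` and `error_message` fields; a minimal defensive sketch, assuming this client's result model mirrors them (an assumption, not confirmed by this README):

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    result = await client.crawl(["https://example.com"])
    # `success` / `error_message` mirror upstream crawl4ai's CrawlResult;
    # whether this client's model exposes them is an assumption, so we
    # read them defensively with getattr().
    if getattr(result, "success", True):
        print(result.raw_markdown)
    else:
        print(f"Crawl failed: {getattr(result, 'error_message', 'unknown error')}")
```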
## Features

- Crawl single or multiple URLs (`/crawl`)
- Stream results as they complete (`/crawl/stream`)
- Markdown extraction with filters (`/md`)
- Screenshots as base64 PNG (`/screenshot`)
- PDF generation (`/pdf`)
- HTML preprocessing for schema extraction (`/html`)
- JavaScript execution on pages (`/execute_js`)
- LLM Q&A: ask questions about page content (`/llm`)
- Per-URL configs for batch crawling (`crawler_configs` list)
- Schema retrieval (`/schema`; see the sketch at the end of the Usage section)
- Async context manager with automatic cleanup
## Usage

### Basic crawl

```python
from crawl4ai_client import Crawl4aiDockerClient, CrawlerRunConfig, CacheMode

async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    result = await client.crawl(
        ["https://example.com"],
        crawler_config=CrawlerRunConfig(cache_mode=CacheMode.BYPASS),
    )
    print(result.raw_markdown)
```
### Multiple URLs with per-URL configs

```python
from crawl4ai_client import Crawl4aiDockerClient, CrawlerRunConfig

async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    results = await client.crawl(
        ["https://example.com", "https://httpbin.org/html"],
        crawler_configs=[
            CrawlerRunConfig(word_count_threshold=5),
            CrawlerRunConfig(word_count_threshold=50),
        ],
    )
    for r in results:
        print(f"{r.url}: {len(r.raw_markdown)} chars")
```
### Streaming

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    async for result in client.crawl_stream(["https://example.com", "https://httpbin.org/html"]):
        print(f"Got: {result.url}")
```
### Markdown endpoint

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    md = await client.get_markdown("https://example.com", content_filter="fit")
    print(md)
```
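`fit` is only one filter; the upstream Crawl4AI server's `/md` endpoint also documents `raw`, `bm25`, and `llm` filter types, where the query-based filters rank page content against a question. A sketch of the BM25 variant, assuming this client forwards a `query` parameter (the parameter name is an assumption, not confirmed by this package):

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    # `bm25` ranks page content against a query before markdown conversion.
    # `query=` is a hypothetical parameter name for this client.
    md = await client.get_markdown(
        "https://example.com",
        content_filter="bm25",
        query="pricing information",
    )
    print(md)
```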
### Screenshot

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    screenshot_b64 = await client.screenshot("https://example.com")
```
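The return value is a base64-encoded PNG, so saving it needs only the standard library; the same decoding applies to the PDF example below. A minimal sketch:

```python
import base64
from pathlib import Path

async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    screenshot_b64 = await client.screenshot("https://example.com")
    # Decode the base64 string back into raw PNG bytes and write to disk.
    Path("example.png").write_bytes(base64.b64decode(screenshot_b64))
```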
### PDF generation

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    pdf_b64 = await client.get_pdf("https://example.com")
```
### HTML preprocessing

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    html = await client.get_html("https://example.com")
```
### JavaScript execution

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    result = await client.execute_js(
        "https://example.com",
        scripts=["document.title", "document.querySelectorAll('a').length"],
    )
    print(result.js_execution_result)
```
### LLM Q&A

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    answer = await client.llm_query(
        "https://example.com",
        query="What is this page about?",
    )
    print(answer)
```
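### Schema retrieval (sketch)

The Features list mentions schema retrieval via `/schema`, but this README does not show the corresponding method. A minimal sketch, assuming the client exposes it as `get_schema()` in line with the `get_markdown`/`get_pdf`/`get_html` naming (the method name is an assumption; check the package's API):

```python
async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    # /schema returns the JSON schemas the server accepts for its config
    # objects. `get_schema` is a hypothetical method name here.
    schema = await client.get_schema()
    print(schema)
```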
## Why this package?

The full `crawl4ai` package installs 34+ dependencies (~500 MB), including Playwright, browsers, numpy, and litellm. If you're running Crawl4AI as a Docker service and only need the client, this package gives you the same `Crawl4aiDockerClient` with just two dependencies.
## Compatibility

This client is compatible with Crawl4AI Docker server v0.8.x+. The config classes (`BrowserConfig`, `CrawlerRunConfig`) produce the same serialized format as the full library.
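Because the serialized format matches, browser settings written for the full library should carry over. A minimal sketch, assuming `BrowserConfig` re-exports the upstream options (e.g. `headless`) and that `crawl` accepts a `browser_config` argument as the full client does (both assumptions, not confirmed by this README):

```python
from crawl4ai_client import BrowserConfig, Crawl4aiDockerClient

async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
    result = await client.crawl(
        ["https://example.com"],
        # `headless=True` mirrors the upstream BrowserConfig option; that this
        # package accepts the same keyword is an assumption.
        browser_config=BrowserConfig(headless=True),
    )
```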
## License
## Download files

- Source Distribution: `crawl4ai_client-0.1.2.tar.gz`
- Built Distribution: `crawl4ai_client-0.1.2-py3-none-any.whl`
## File details

Details for the file `crawl4ai_client-0.1.2.tar.gz`.

### File metadata
- Download URL: crawl4ai_client-0.1.2.tar.gz
- Upload date:
- Size: 9.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.8
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `54ae23d4dbd2c353a6fade2bbb90c8ffb4756117cdb5a11e9a498ca4625e3433` |
| MD5 | `09e7677a9afe2f1d320f647831f4f5ae` |
| BLAKE2b-256 | `e72be39aa5e64c8a02ce1b7e2df510dfc3fd47a89a3831e6f0c9f4465a82f1ff` |
## File details

Details for the file `crawl4ai_client-0.1.2-py3-none-any.whl`.

### File metadata
- Download URL: crawl4ai_client-0.1.2-py3-none-any.whl
- Upload date:
- Size: 9.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.8
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `a8c28e6a1c54d4108991334067907ab665a2bbeab8dc2354e7104fadcdb5a6c9` |
| MD5 | `7ffb68df8755aa96523bc5b1f07d5511` |
| BLAKE2b-256 | `7c8cc11344a3839f39eb587f5ad8a07dd9c8c573630add04374664f61da69633` |