CrawlKit Python SDK
Official Python SDK for CrawlKit — Web + Video Intelligence API for AI.
Installation
pip install crawlkit
Quick Start
from crawlkit import CrawlKit
# Initialize client
ck = CrawlKit(api_key="ck_free_xxx")
# Scrape a webpage
result = ck.scrape("https://vnexpress.net/thoi-su")
print(result.title)
print(result.content[:200])
# Scrape YouTube video transcript
video = ck.scrape("https://youtu.be/-Td-D-vKJDg")
print(video.content) # Full transcript
# Batch scrape multiple URLs
results = ck.batch([
    "https://vnexpress.net",
    "https://cafef.vn",
])
for result in results:
    print(f"{result.title}: {len(result.content)} chars")
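The Quick Start hard-codes the API key for brevity. One common alternative is to read it from the environment, sketched below. The variable name `CRAWLKIT_API_KEY` and the helper `load_api_key` are assumptions made for this example; nothing above says the SDK reads an environment variable on its own.

```python
import os

# CRAWLKIT_API_KEY is a hypothetical variable name chosen for this sketch;
# the SDK itself is not documented to read it.
ENV_VAR = "CRAWLKIT_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {ENV_VAR} before creating the client")
    return key


# Usage (requires the crawlkit package):
# from crawlkit import CrawlKit
# ck = CrawlKit(api_key=load_api_key())
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.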
Async Support
import asyncio
from crawlkit import AsyncCrawlKit
async def main():
    async with AsyncCrawlKit(api_key="ck_free_xxx") as ck:
        result = await ck.scrape("https://example.com")
        print(result.title)
asyncio.run(main())
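The async client is most useful when several pages are fetched at once. The sketch below shows one way to do that with a concurrency cap. The helper `scrape_many` and its `max_concurrency` parameter are hypothetical and not part of the SDK; the only thing it relies on is an awaitable `scrape(url)` method like the one shown above.

```python
import asyncio


async def scrape_many(client, urls, max_concurrency=5):
    """Scrape URLs concurrently, returning results in input order.

    `client` is any object with an async `scrape(url)` method, such as an
    open AsyncCrawlKit instance. Failures are returned as exception objects
    rather than raised, so one bad URL does not cancel the rest.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scrape_one(url):
        async with semaphore:  # cap the number of in-flight requests
            return await client.scrape(url)

    return await asyncio.gather(
        *(scrape_one(url) for url in urls), return_exceptions=True
    )
```

Inside `main()` this could be called as `results = await scrape_many(ck, urls)`, followed by an `isinstance(item, Exception)` check on each item before using it.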
Features
- 🚀 Simple & Fast — Clean API with sync and async support
- 🎥 Video Intelligence — Extract transcripts from YouTube videos
- 📄 Smart Parsing — Automatic content extraction from any webpage
- 🔄 Batch Processing — Scrape multiple URLs efficiently
- 🔗 Link Discovery — Find related links on any page
- 💪 Type Safe — Full type hints support
- 🛡️ Error Handling — Automatic retries and custom exceptions
API Reference
CrawlKit(api_key, base_url=...)
Main client class.
Methods:
scrape(url, chunk=False, chunk_size=1000, parser=None)
Scrape a single URL.
Parameters:
- url (str): URL to scrape
- chunk (bool): Split content into chunks
- chunk_size (int): Size of each chunk
- parser (str): Specific parser to use
Returns: ScrapeResult
batch(urls, chunk=False, chunk_size=1000)
Scrape multiple URLs in one request.
Parameters:
- urls (list[str]): List of URLs to scrape
- chunk (bool): Split content into chunks
- chunk_size (int): Size of each chunk
Returns: list[ScrapeResult]
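For long URL lists it may be worth sending several smaller batch requests instead of one large one. The sketch below shows that pattern. The helpers `split_into_batches` and `batch_all` are hypothetical, and the default of 10 URLs per request is an arbitrary choice for illustration; no per-request limit is stated here.

```python
def split_into_batches(urls, batch_size=10):
    """Yield consecutive slices of `urls`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


def batch_all(client, urls, batch_size=10):
    """Call client.batch() once per slice and flatten the results."""
    results = []
    for group in split_into_batches(list(urls), batch_size):
        results.extend(client.batch(group))
    return results
```

With a real client this would be called as `batch_all(ck, urls)`. Each slice counts as a separate request, which matters on a plan with a daily request quota.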
discover(url, limit=20)
Discover links from a page.
Parameters:
- url (str): URL to discover links from
- limit (int): Maximum number of links
Returns: list[str]
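Since discover() returns plain strings, the list can be post-processed with the standard library. The sketch below keeps only links on the same host as the starting page and drops duplicates. The helper `same_site_links` is hypothetical, and the sketch assumes the returned links are absolute URLs, which is not confirmed here.

```python
from urllib.parse import urldefrag, urlparse


def same_site_links(links, base_url):
    """Keep links on the same host as base_url, without fragments or duplicates."""
    base_host = urlparse(base_url).netloc.lower()
    seen = set()
    kept = []
    for link in links:
        clean, _fragment = urldefrag(link)  # drop any "#section" suffix
        if urlparse(clean).netloc.lower() != base_host:
            continue
        if clean not in seen:
            seen.add(clean)
            kept.append(clean)
    return kept
```

A typical call would be `same_site_links(ck.discover(url), url)`, and the filtered list can then be passed to batch().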
health()
Check API health status.
Returns: dict
parsers()
List available parsers.
Returns: list[ParserInfo]
usage()
Get your API usage statistics.
Returns: UsageStats
Examples
Chunked Content
result = ck.scrape(
    "https://en.wikipedia.org/wiki/Python",
    chunk=True,
    chunk_size=500,
)
for i, chunk in enumerate(result.chunks):
    print(f"Chunk {i+1}: {chunk[:50]}...")
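The unit of chunk_size is not specified above, so it could mean characters, words, or tokens. To illustrate the shape of result.chunks (a list of strings, as the slicing in the example suggests), here is a minimal client-side sketch that assumes chunk_size counts characters. The function `chunk_text` is hypothetical and may not match how the service actually splits content.

```python
def chunk_text(text, chunk_size=1000):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

A fixed-width split like this can cut words in half; a server-side chunker may well respect sentence or paragraph boundaries instead.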
Error Handling
from crawlkit import CrawlKit, RateLimitError, AuthenticationError
ck = CrawlKit(api_key="ck_free_xxx")
try:
    result = ck.scrape("https://example.com")
except RateLimitError as e:
    print(f"Rate limited! Retry after {e.retry_after}s")
except AuthenticationError:
    print("Invalid API key")
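The SDK advertises automatic retries, but a caller may still want explicit control over what happens after a rate limit. The sketch below waits for the advertised retry_after interval and tries again. `call_with_retry` is a hypothetical helper, and the `RateLimitError` class defined here is only a stand-in so the example runs without the package; in real code it would be replaced by the import from crawlkit. It also assumes retry_after is always a number of seconds.

```python
import time


class RateLimitError(Exception):
    """Stand-in for crawlkit.RateLimitError so the sketch runs on its own."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(fn, max_attempts=3, sleep=time.sleep):
    """Call fn(); on RateLimitError wait retry_after seconds and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller handle it
            sleep(exc.retry_after)
```

With a real client the call might look like `call_with_retry(lambda: ck.scrape("https://example.com"))`. The `sleep` parameter exists so the wait can be replaced in tests.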
Context Manager
with CrawlKit(api_key="ck_free_xxx") as ck:
    result = ck.scrape("https://example.com")
# Client automatically closes
Get an API Key
- Visit crawlkit.vercel.app
- Sign up for a free account
- Get your API key from the dashboard
Free tier includes:
- 100 requests/day
- Web scraping
- Video transcripts
- All parsers
License
MIT License - see LICENSE file for details.
Links
- 🌐 Website
- 📚 Documentation
- 🐛 Issues
- 💬 Support