Official Python SDK for Krawly — AI-powered web scraping platform
Krawly — AI-Powered Web Scraping SDK
Turn any website into structured data with AI. No complex selectors, no external API keys — just describe what you want in plain English.
Installation
pip install krawly
Quick Start
from krawly import Krawly
# Initialize with your API key (get one at https://krawly.io)
client = Krawly(api_key="sai_your_key_here")
# One-line scraping — the simplest way
result = client.scrape(
    "https://books.toscrape.com",
    "Get all book titles and prices"
)
for item in result.data:
    print(f"{item['title']}: {item['price']}")
print(f"Total: {result.row_count} items")
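Hardcoding the key as in the snippet above is fine for a quick test, but for code that ends up in version control it is usually safer to read the key from the environment. A minimal sketch, assuming a variable named `KRAWLY_API_KEY` (a name chosen here for illustration; the SDK is not documented to read it on its own) and a hypothetical helper `load_api_key`:

```python
import os


def load_api_key(env=None, var_name="KRAWLY_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is missing.

    `KRAWLY_API_KEY` is an assumed variable name, not an SDK convention.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

The client could then be created with `Krawly(api_key=load_api_key())`.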
Features
- 🤖 AI-Powered — Describe what you want in plain English, the AI handles the rest
- 🔧 No External Keys — Only your Krawly API key needed, no Claude/OpenAI keys
- 📦 Config Management — Save, list, download, and reuse scraping configs
- 🚀 Server Execution — Run scrapers on Krawly's cloud infrastructure
- 📁 Local YAML — Read, write, and upload YAML configs from local files
- 📊 Progress Tracking — Real-time progress callbacks during scraping
Usage
One-Line Scraping
result = client.scrape("https://example.com/products", "Get product names, prices, and ratings")
print(result.data) # [{"name": "...", "price": "...", "rating": "..."}]
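Since `result.data` is shown as a list of dictionaries, it can be written out with the standard library alone. The sketch below uses made-up sample rows in place of a real response, and `rows_to_csv` is a hypothetical helper, not part of the SDK:

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text, using the union of keys as columns."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


# Sample rows shaped like the result.data comment above (values are made up).
sample = [
    {"name": "Widget", "price": "9.99", "rating": "4.5"},
    {"name": "Gadget", "price": "19.99"},
]
csv_text = rows_to_csv(sample)
```

With a real response, `rows_to_csv(result.data)` should work the same way as long as every row is a flat dictionary.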
Step-by-Step Control
# Step 1: Generate a config
job = client.generate("https://example.com/products", "Get all product details")
# Step 2: Wait with progress updates
def on_progress(status):
    print(f"[{status.progress}%] {status.status_message}")
final = client.wait_for_completion(job.job_id, on_progress=on_progress)
print(f"Config generated: {final.config_name}")
print(final.yaml_content)
# Step 3: Run the scraper
run = client.run(final.config_id)
result = client.wait_and_get_results(run.job_id)
print(f"Scraped {result.row_count} items")
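To make the wait-with-callback pattern above concrete, here is a generic polling loop written against a simulated job. It is not the SDK's implementation: `poll_until_done`, the `fetch_status` callable, and the `"state"`, `"progress"` and `"message"` keys are all assumptions made for this sketch.

```python
import time


def poll_until_done(fetch_status, on_progress=None, interval=1.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call `fetch_status()` until it reports a terminal state.

    `fetch_status` is a hypothetical callable returning a dict with
    "state", "progress" and "message" keys; the real SDK's status
    objects may look different.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if on_progress is not None:
            on_progress(status)
        if status["state"] in ("completed", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish in time")
        sleep(interval)


# Simulated job that finishes on the third poll.
_states = iter([
    {"state": "running", "progress": 10, "message": "Analyzing page"},
    {"state": "running", "progress": 60, "message": "Building config"},
    {"state": "completed", "progress": 100, "message": "Done"},
])
seen = []
final_status = poll_until_done(
    lambda: next(_states),
    on_progress=lambda s: seen.append(s["progress"]),
    sleep=lambda _: None,  # skip real waiting in the demo
)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.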
Config Management
# List all your configs
configs = client.list_configs()
for c in configs:
    print(f"{c.name}: {c.target_url}")
# Get a specific config
config = client.get_config("config-uuid-here")
print(config.yaml_content)
# Create a new config
config = client.create_config(
    name="My Scraper",
    target_url="https://example.com",
    prompt="Get all items",
    yaml_content="url: https://example.com\n..."
)
# Delete a config
client.delete_config("config-uuid-here")
Local YAML Files
# Read a local YAML file and run it on the server
result = client.scrape_with_file("my_config.yaml")
print(result.data)
# Download a config from server to local file
client.download_config("config-uuid-here", "downloaded_config.yaml")
# Upload a local YAML file to the server
config = client.upload_config("my_config.yaml", name="My Config")
print(f"Uploaded as: {config.id}")
# Load and parse YAML locally
content = Krawly.load_yaml("config.yaml")
parsed = Krawly.parse_yaml(content)
Run YAML Content Directly
yaml_content = """
url: https://books.toscrape.com
selectors:
  items: article.product_pod
  fields:
    title: h3 a::attr(title)
    price: .price_color::text
"""
result = client.scrape_with_yaml(yaml_content)
for book in result.data:
    print(book)
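Text selectors like the ones above return prices as strings, so a small post-processing step is often useful before doing arithmetic. A sketch, assuming the price text looks like `£51.77` (a currency symbol followed by a decimal number); `parse_price` is a hypothetical helper and the rows are made up:

```python
import re
from decimal import Decimal


def parse_price(text):
    """Extract the first decimal number from a price string such as "£51.77"."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return Decimal(match.group())


# Made-up rows with the title and price fields named in the config above.
books = [
    {"title": "Book A", "price": "£51.77"},
    {"title": "Book B", "price": "£13.99"},
]
total = sum(parse_price(b["price"]) for b in books)
```

`Decimal` is used instead of `float` so that sums of currency values stay exact.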
Account Info
info = client.me()
print(f"Plan: {info.plan}")
print(f"Credits remaining: {info.generations_remaining}/{info.generations_limit}")
Error Handling
from krawly import Krawly
from krawly.client import AuthenticationError, QuotaExceededError, RateLimitError, KrawlyError
try:
    result = client.scrape("https://example.com", "Get data")
except AuthenticationError:
    print("Invalid API key")
except QuotaExceededError:
    print("No credits remaining: upgrade your plan")
except RateLimitError:
    print("Too many requests, try again later")
except KrawlyError as e:
    print(f"API error: {e}")
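Rather than only printing a message on a rate limit, a caller may prefer to retry after a pause. The sketch below shows exponential backoff with a local stand-in exception so it runs without the SDK installed; `with_backoff` is a hypothetical helper, and whether the SDK already retries internally is not stated here.

```python
import time


class RateLimitError(Exception):
    """Local stand-in so the sketch runs without the SDK installed."""


def with_backoff(call, retries=3, base_delay=1.0, sleep=time.sleep,
                 retry_on=(RateLimitError,)):
    """Run `call()`, retrying on `retry_on` with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts, surface the original error
            sleep(base_delay * 2 ** attempt)


# Simulated call that is rate limited twice before succeeding.
attempts = {"count": 0}
delays = []


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("slow down")
    return "ok"


outcome = with_backoff(flaky, sleep=delays.append)
```

With the real client, the same helper could wrap `lambda: client.scrape(url, prompt)` and receive the SDK's own `RateLimitError` through `retry_on`.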
Plans & Pricing
| Plan | Credits | Server Execution | Price |
|---|---|---|---|
| Free | 3/month | ✗ | $0 |
| Starter | 20/month | ✓ | $15/mo |
| Pro | 100/month | ✓ | $29/mo |
All plans include API, SDK, and Chrome Extension access.
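For comparing the paid tiers, the effective price per credit follows directly from the table. A small calculation using only the figures listed above:

```python
from decimal import Decimal

# Monthly price in USD and credits per month, copied from the table above.
plans = {
    "Starter": (Decimal("15"), 20),
    "Pro": (Decimal("29"), 100),
}

cost_per_credit = {
    name: (price / credits).quantize(Decimal("0.01"))
    for name, (price, credits) in plans.items()
}
```

This works out to $0.75 per credit on Starter and $0.29 per credit on Pro.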
Get your API key at krawly.io
Documentation
Full documentation: docs.krawly.io
License
MIT