A powerful, local-first Python SDK for web scraping, crawling, and data extraction

RapidCrawl

A powerful, local-first Python SDK for web scraping, crawling, and data extraction. RapidCrawl provides a comprehensive toolkit for extracting data from websites, handling dynamic content, and converting web pages into clean, structured formats suitable for AI and LLM applications.

Local-first: RapidCrawl runs entirely on your machine. There is no required API key and no hosted service — pip install and start scraping. (api_key/base_url exist only as optional hooks for a possible future hosted backend.)

🚀 Features

🔍 Scrape: Convert any URL into clean markdown, HTML, text, or structured data
🕷️ Crawl: Recursively crawl websites with depth control, filtering, and concurrency
🗺️ Map: Quickly discover all URLs on a website (with sitemap-index support)
🔎 Search: Web search (Google/Bing/DuckDuckGo) with automatic result scraping
📸 Screenshot: Capture full-page screenshots
🎭 Dynamic Content: Handle JavaScript-rendered pages with Playwright
📄 Multiple Formats: Markdown, HTML, text, links, screenshots, PDF, and images
🚄 Async: Genuine concurrency for scrape, crawl, map, search, and extract
🌐 Proxies: Per-client or per-request proxies for both HTTP and the browser
🗃️ Caching: Optional TTL cache (in-memory + on-disk) to skip repeat fetches
💾 Export: Save any result to JSON, CSV, or Markdown
🤖 LLM extraction (optional): Natural-language extraction via OpenAI/Anthropic
🛡️ Error Handling: Typed exceptions and automatic retries
📦 CLI Tool: Feature-rich command-line interface

📋 Table of Contents

Installation
Quick Start
Features
- Scraping
- Crawling
- Mapping
- Searching
CLI Usage
Configuration
API Reference
Examples
Advanced Usage
Performance
Troubleshooting
Development
Contributing
Security
License
Support

📦 Installation

Using pip

pip install rapid-crawl

With optional LLM extraction (OpenAI / Anthropic)

pip install "rapid-crawl[llm]"

With development dependencies

pip install "rapid-crawl[dev]"

From source

git clone https://github.com/aoneahsan/rapid-crawl.git
cd rapid-crawl
pip install -e .

Install Playwright browsers (required for dynamic content)

playwright install chromium

🚀 Quick Start

Python SDK

from rapidcrawl import RapidCrawlApp

# Initialize the client
app = RapidCrawlApp()

# Scrape a single page
result = app.scrape_url("https://example.com")
print(result.content["markdown"])

# Crawl a website
crawl_result = app.crawl_url(
    "https://example.com",
    max_pages=10,
    max_depth=2
)

# Map all URLs
map_result = app.map_url("https://example.com")
print(f"Found {map_result.total_urls} URLs")

# Search and scrape
search_result = app.search(
    "python web scraping",
    num_results=5,
    scrape_results=True
)

Command Line

# Scrape a URL
rapidcrawl scrape https://example.com

# Crawl a website
rapidcrawl crawl https://example.com --max-pages 10

# Map URLs
rapidcrawl map https://example.com --limit 100

# Search
rapidcrawl search "python tutorials" --scrape

🎯 Features

Scraping

Convert any web page into clean, structured data:

from rapidcrawl import RapidCrawlApp, OutputFormat

app = RapidCrawlApp()

# Basic scraping
result = app.scrape_url("https://example.com")

# Multiple formats
result = app.scrape_url(
    "https://example.com",
    formats=["markdown", "html", "screenshot"],
    wait_for=".content",  # Wait for element
    timeout=60000,        # 60 seconds timeout
)

# Extract structured data
result = app.scrape_url(
    "https://example.com/product",
    extract_schema=[
        {"name": "title", "selector": "h1"},
        {"name": "price", "selector": ".price", "type": "number"},
        {"name": "description", "selector": ".description"}
    ]
)

print(result.structured_data)
# {'title': 'Product Name', 'price': 29.99, 'description': '...'}

# Mobile viewport
result = app.scrape_url(
    "https://example.com",
    mobile=True
)

# With actions (click, type, scroll)
result = app.scrape_url(
    "https://example.com",
    actions=[
        {"type": "click", "selector": ".load-more"},
        {"type": "wait", "value": 2000},
        {"type": "scroll", "value": 1000}
    ]
)

Crawling

Recursively crawl websites with advanced filtering:

# Basic crawling
result = app.crawl_url(
    "https://example.com",
    max_pages=50,
    max_depth=3
)

# With URL filtering
result = app.crawl_url(
    "https://example.com",
    include_patterns=[r"/blog/.*", r"/docs/.*"],
    exclude_patterns=[r".*\.pdf$", r".*/tag/.*"]
)

# Async crawling for better performance
import asyncio

async def crawl_async():
    result = await app.crawl_url_async(
        "https://example.com",
        max_pages=100,
        max_depth=5,
        allow_subdomains=True
    )
    return result

result = asyncio.run(crawl_async())

# With webhook notifications
result = app.crawl_url(
    "https://example.com",
    webhook_url="https://your-webhook.com/progress"
)
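
The include_patterns and exclude_patterns values in the filtering example are regular expressions. As a rough illustration of how this kind of filtering usually behaves, here is a minimal standard-library sketch. It is not RapidCrawl's actual implementation: should_crawl is a hypothetical helper, and the rules that an exclude match always wins and that an empty include list accepts everything are assumptions.

```python
import re

def should_crawl(url, include_patterns=None, exclude_patterns=None):
    """Hypothetical helper: decide whether a URL passes regex filters."""
    # Assumption: any exclude match rejects the URL outright
    if any(re.search(p, url) for p in exclude_patterns or []):
        return False
    # Assumption: with no include patterns, every remaining URL is accepted
    if not include_patterns:
        return True
    return any(re.search(p, url) for p in include_patterns)

include = [r"/blog/.*", r"/docs/.*"]
exclude = [r".*\.pdf$", r".*/tag/.*"]

for candidate in [
    "https://example.com/blog/post-1",
    "https://example.com/docs/guide.pdf",
    "https://example.com/blog/tag/python",
    "https://example.com/about",
]:
    print(candidate, should_crawl(candidate, include, exclude))
```

Testing patterns this way against a handful of known URLs before starting a large crawl can catch an over-broad exclude early.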

Mapping

Quickly discover all URLs on a website:

# Basic mapping
result = app.map_url("https://example.com")
print(f"Found {result.total_urls} URLs")

# Filter URLs by search term
result = app.map_url(
    "https://example.com",
    search="product",
    limit=1000
)

# Include subdomains
result = app.map_url(
    "https://example.com",
    include_subdomains=True,
    ignore_sitemap=False  # Use sitemap.xml if available
)

# Access the URLs
for url in result.urls[:10]:
    print(url)
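
When ignore_sitemap is False, mapping can draw on the site's sitemap.xml. The sketch below shows, under stated assumptions, what reading a sitemap involves. It uses only the standard library and the sitemaps.org namespace; extract_sitemap_urls is a hypothetical helper and does not reflect how RapidCrawl parses sitemaps internally.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def extract_sitemap_urls(xml_text):
    """Hypothetical helper: return (page_urls, child_sitemap_urls)."""
    root = ET.fromstring(xml_text)
    # <urlset> documents list pages under <url><loc>
    pages = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]
    # <sitemapindex> documents list further sitemaps under <sitemap><loc>
    children = [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)]
    return pages, children

sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products/1</loc></url>
</urlset>"""

pages, children = extract_sitemap_urls(sample)
print(pages, children)
```

A sitemap index returns an empty page list and a non-empty list of child sitemaps, each of which would be fetched and parsed the same way.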

Searching

Search the web and optionally scrape results:

# Basic search
result = app.search("python web scraping tutorial")

# Search with scraping
result = app.search(
    "latest AI news",
    num_results=10,
    scrape_results=True,
    formats=["markdown", "text"]
)

# Access results
for item in result.results:
    print(f"{item.position}. {item.title}")
    print(f"   URL: {item.url}")
    if item.scraped_content:
        print(f"   Content: {item.scraped_content.content['markdown'][:200]}...")

# Different search engines
result = app.search(
    "machine learning",
    engine="duckduckgo",  # or "google", "bing"
    num_results=20
)

# With date filtering
from datetime import datetime, timedelta

result = app.search(
    "tech news",
    start_date=datetime.now() - timedelta(days=7),
    end_date=datetime.now()
)

✨ New in 0.2.0

Proxies

# Global proxy for all requests (HTTP + browser)
app = RapidCrawlApp(proxy="http://user:pass@host:8080")

# Or per request
result = app.scrape_url("https://example.com", proxy="http://host:3128")
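
Proxy URLs can carry credentials, so it helps to check how a given string will be split before passing it in. This is a small standard-library sketch; describe_proxy is a hypothetical helper and is not part of RapidCrawl.

```python
from urllib.parse import urlsplit

def describe_proxy(proxy_url):
    """Hypothetical helper: break a proxy URL into its components."""
    parts = urlsplit(proxy_url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "username": parts.username,
        # Report only whether a password is present, never its value
        "has_password": parts.password is not None,
    }

print(describe_proxy("http://user:pass@host:8080"))
print(describe_proxy("http://host:3128"))
```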

Caching

# Enable an in-memory + on-disk TTL cache
app = RapidCrawlApp(cache=True, cache_ttl=3600, cache_dir=".rapidcrawl_cache")

result = app.scrape_url("https://example.com")   # fetched
again  = app.scrape_url("https://example.com")   # served from cache
print(again.from_cache)  # True

# Bypass the cache for a single request
fresh = app.scrape_url("https://example.com", use_cache=False)

Export results

result = app.crawl_url("https://example.com", max_pages=20)

app.export(result, "crawl.json")   # format inferred from extension
app.export(result, "crawl.csv")    # one row per page
app.export(result, "crawl.md")     # human-readable Markdown
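
To make "one row per page" concrete, the sketch below writes a list of page records to CSV with the standard library. It illustrates the shape of such an export and is not the code behind app.export; pages_to_csv and the url, title, and status field names are assumptions chosen for the example.

```python
import csv
import io

def pages_to_csv(pages, fieldnames=("url", "title", "status")):
    """Hypothetical helper: serialize page dicts to CSV, one row per page."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for page in pages:
        # Missing fields become empty cells; extra fields are dropped
        writer.writerow({name: page.get(name, "") for name in fieldnames})
    return buffer.getvalue()

pages = [
    {"url": "https://example.com/a", "title": "Hello, world", "status": 200},
    {"url": "https://example.com/b", "title": "Second page"},
]
print(pages_to_csv(pages))
```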

Async

import asyncio
from rapidcrawl import RapidCrawlApp

async def main():
    async with RapidCrawlApp() as app:
        result = await app.scrape_url_async("https://example.com")
        crawl  = await app.crawl_url_async("https://example.com", max_pages=50, max_concurrency=10)
        results = await app.extract_async(
            ["https://a.com", "https://b.com"],
            schema=[{"name": "title", "selector": "h1"}],
        )

asyncio.run(main())

LLM-based extraction (optional)

Schema-based (CSS) extraction is keyless and always available. For natural-language extraction, install the extra and provide a key:

pip install "rapid-crawl[llm]"

app = RapidCrawlApp(llm_provider="openai", llm_api_key="sk-...")  # or set OPENAI_API_KEY

result = app.scrape_url(
    "https://example.com/product",
    extract_prompt="Extract the product name, price, and whether it is in stock.",
)
print(result.structured_data)

💻 CLI Usage

RapidCrawl provides a comprehensive command-line interface:

Global options

Options that apply to every command (placed before the subcommand):

rapidcrawl --proxy http://host:8080 --cache --user-agent "MyBot/1.0" scrape https://example.com
rapidcrawl --no-verify-ssl --debug map https://example.com

Setup Wizard

# Interactive setup (optional — writes a .env file)
rapidcrawl setup

Scraping

# Basic scrape
rapidcrawl scrape https://example.com

# Save to file
rapidcrawl scrape https://example.com -o output.md

# Multiple formats
rapidcrawl scrape https://example.com -f markdown -f html -f screenshot

# Wait for element
rapidcrawl scrape https://example.com --wait-for ".content"

# Extract structured data
rapidcrawl scrape https://example.com \
  --extract-schema '[{"name": "title", "selector": "h1"}]'

Crawling

# Basic crawl
rapidcrawl crawl https://example.com

# Advanced crawl
rapidcrawl crawl https://example.com \
  --max-pages 100 \
  --max-depth 3 \
  --include "*/blog/*" \
  --exclude "*.pdf" \
  --output ./crawl-results/

Mapping

# Map all URLs
rapidcrawl map https://example.com

# Filter and save
rapidcrawl map https://example.com \
  --search "product" \
  --limit 1000 \
  --output urls.txt

Searching

# Basic search
rapidcrawl search "python tutorials"

# Search and scrape
rapidcrawl search "machine learning" \
  --scrape \
  --num-results 20 \
  --engine google \
  --output results/

⚙️ Configuration

RapidCrawl works with zero configuration. Everything below is optional.

Environment Variables

Create a .env file in your project root (all optional):

# General
RAPIDCRAWL_TIMEOUT=30
RAPIDCRAWL_MAX_RETRIES=3
RAPIDCRAWL_USER_AGENT=MyBot/1.0

# Proxy (URL string, with optional credentials)
RAPIDCRAWL_PROXY=http://user:pass@host:8080

# Caching
RAPIDCRAWL_CACHE=true
RAPIDCRAWL_CACHE_TTL=3600
RAPIDCRAWL_CACHE_DIR=.rapidcrawl_cache

# Optional LLM extraction (extract_prompt)
RAPIDCRAWL_LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
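
Environment variables are always strings, so numeric and boolean settings have to be converted somewhere. The sketch below shows one way an application could read the variables listed above into typed values. load_settings is a hypothetical helper, the fallback defaults simply mirror the example values, and the set of strings treated as true is an assumption; RapidCrawl's own parsing may differ.

```python
import os

TRUE_VALUES = {"1", "true", "yes", "on"}  # assumption: accepted truthy spellings

def load_settings(env=None):
    """Hypothetical helper: read RAPIDCRAWL_* variables into typed settings."""
    env = os.environ if env is None else env
    return {
        "timeout": float(env.get("RAPIDCRAWL_TIMEOUT", "30")),
        "max_retries": int(env.get("RAPIDCRAWL_MAX_RETRIES", "3")),
        "user_agent": env.get("RAPIDCRAWL_USER_AGENT"),
        "proxy": env.get("RAPIDCRAWL_PROXY"),
        "cache": env.get("RAPIDCRAWL_CACHE", "false").strip().lower() in TRUE_VALUES,
        "cache_ttl": float(env.get("RAPIDCRAWL_CACHE_TTL", "3600")),
        "cache_dir": env.get("RAPIDCRAWL_CACHE_DIR"),
    }

settings = load_settings({"RAPIDCRAWL_TIMEOUT": "45", "RAPIDCRAWL_CACHE": "True"})
print(settings)
```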

Python Configuration

from rapidcrawl import RapidCrawlApp

app = RapidCrawlApp(
    timeout=60.0,
    max_retries=5,
    user_agent="MyBot/1.0",
    proxy="http://user:pass@host:8080",   # also accepts a ProxyConfig or dict
    cache=True,                            # enable result caching
    cache_ttl=3600,
    cache_dir=".rapidcrawl_cache",         # persist cache to disk
    verify_ssl=True,
    debug=True,
    # Optional LLM extraction:
    llm_provider="openai",                 # or "anthropic"
    llm_api_key="sk-...",                  # or set OPENAI_API_KEY / ANTHROPIC_API_KEY
)

Configuration Options

Option	Purpose
`timeout`	Default request timeout in seconds
`max_retries`	Retries for transient network errors
`user_agent`	Override the default browser-like User-Agent
`proxy`	Proxy for HTTP and browser requests
`cache` / `cache_ttl` / `cache_dir`	Result caching
`verify_ssl`	Verify TLS certificates (default True); set to False only for self-signed certificates in development
`llm_*`	Optional OpenAI/Anthropic config for `extract_prompt`

📚 API Reference

RapidCrawlApp

The main client class for interacting with RapidCrawl.

Constructor

RapidCrawlApp(
    api_key: Optional[str] = None,       # optional; reserved for a future backend
    base_url: Optional[str] = None,      # optional; reserved for a future backend
    timeout: Optional[float] = None,
    max_retries: Optional[int] = None,
    verify_ssl: bool = True,
    debug: bool = False,
    user_agent: Optional[str] = None,
    proxy: Optional[Union[str, ProxyConfig, dict]] = None,
    cache: bool = False,
    cache_ttl: Optional[float] = None,
    cache_dir: Optional[str] = None,
    llm_provider: Optional[str] = None,  # "openai" | "anthropic"
    llm_api_key: Optional[str] = None,
    llm_model: Optional[str] = None,
)

Methods

Sync	Async	Description
`scrape_url(url, **opts)`	`scrape_url_async(...)`	Scrape a single URL
`crawl_url(url, **opts)`	`crawl_url_async(...)`	Crawl a website (concurrent)
`map_url(url, **opts)`	`map_url_async(...)`	Discover URLs
`search(query, **opts)`	`search_async(...)`	Search the web
`extract(urls, schema, prompt)`	`extract_async(...)`	Extract structured data
`export(result, path, fmt=None)`	`export_async(...)`	Save a result to JSON/CSV/Markdown
`clear_cache()`	—	Clear the result cache

The client is also a context manager (with RapidCrawlApp() as app: / async with ...).

Models

ScrapeOptions

from rapidcrawl.models import ScrapeOptions, OutputFormat

options = ScrapeOptions(
    url="https://example.com",
    formats=[OutputFormat.MARKDOWN, OutputFormat.HTML],
    wait_for=".content",
    timeout=30000,
    mobile=False,
    actions=[...],
    extract_schema=[...],
    headers={"User-Agent": "Custom UA"}
)

CrawlOptions

from rapidcrawl.models import CrawlOptions

options = CrawlOptions(
    url="https://example.com",
    max_pages=100,
    max_depth=3,
    include_patterns=["*/blog/*"],
    exclude_patterns=["*.pdf"],
    allow_subdomains=False,
    webhook_url="https://webhook.example.com"
)

🔧 Examples

For comprehensive examples, see the examples directory:

Basic Scraping - Getting started with web scraping
Web Crawling - Crawling websites recursively
Search and Map - Search and URL mapping
Data Extraction - Structured data extraction
Advanced Usage - Production patterns

E-commerce Price Monitoring

from rapidcrawl import RapidCrawlApp
import json

app = RapidCrawlApp()

# Define extraction schema
schema = [
    {"name": "title", "selector": "h1.product-title"},
    {"name": "price", "selector": ".price-now", "type": "number"},
    {"name": "stock", "selector": ".availability"},
    {"name": "image", "selector": "img.product-image", "attribute": "src"}
]

# Monitor multiple products
products = [
    "https://shop.example.com/product1",
    "https://shop.example.com/product2",
]

results = []
for url in products:
    result = app.scrape_url(url, extract_schema=schema)
    if result.success:
        results.append({
            "url": url,
            "data": result.structured_data,
            "timestamp": result.scraped_at
        })

# Save results
with open("prices.json", "w") as f:
    json.dump(results, f, indent=2, default=str)

Content Aggregation

import asyncio
from rapidcrawl import RapidCrawlApp

async def aggregate_news():
    # Search multiple queries
    queries = [
        "artificial intelligence breakthroughs",
        "quantum computing news",
        "robotics innovation"
    ]

    # Run the searches concurrently through the async API
    # (assumes search_async accepts the same options as search)
    async with RapidCrawlApp() as app:
        search_results = await asyncio.gather(*(
            app.search_async(
                query,
                num_results=5,
                scrape_results=True,
                formats=["markdown"]
            )
            for query in queries
        ))

    all_articles = []
    for query, result in zip(queries, search_results):
        for item in result.results:
            if item.scraped_content and item.scraped_content.success:
                all_articles.append({
                    "title": item.title,
                    "url": item.url,
                    "content": item.scraped_content.content["markdown"],
                    "query": query
                })

    return all_articles

# Run aggregation
articles = asyncio.run(aggregate_news())

Website Change Detection

import hashlib
import time
from rapidcrawl import RapidCrawlApp

app = RapidCrawlApp()

def monitor_changes(url, interval=3600):
    """Monitor a webpage for changes."""
    previous_hash = None
    
    while True:
        result = app.scrape_url(url, formats=["text"])
        
        if result.success:
            content = result.content["text"]
            current_hash = hashlib.md5(content.encode()).hexdigest()
            
            if previous_hash and current_hash != previous_hash:
                print(f"Change detected at {url}!")
                # Send notification, save diff, etc.
            
            previous_hash = current_hash
        
        time.sleep(interval)

# Monitor a page
monitor_changes("https://example.com/status", interval=300)  # Check every 5 minutes
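
The "save diff" step mentioned in the comment can be handled with the standard library's difflib. This is a minimal sketch that assumes you keep the previous page text alongside its hash; text_diff is a hypothetical helper.

```python
import difflib

def text_diff(old_text, new_text, label="page"):
    """Hypothetical helper: unified diff between two versions of a page's text."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{label} (previous)",
        tofile=f"{label} (current)",
        lineterm="",
    )
    # Identical inputs produce no diff lines, so the result is an empty string
    return "\n".join(diff)

previous = "Status: OK\nUptime: 99.9%"
current = "Status: Degraded\nUptime: 99.9%"
print(text_diff(previous, current, label="https://example.com/status"))
```

Storing the diff rather than only the hash shows what changed, not just that something changed.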

🚀 Advanced Usage

Rate Limiting

import time
from rapidcrawl import RapidCrawlApp

class RateLimitedScraper:
    def __init__(self, requests_per_second=2):
        self.app = RapidCrawlApp()
        self.min_interval = 1.0 / requests_per_second
        self.last_request = 0
    
    def scrape_url(self, url):
        current = time.time()
        elapsed = current - self.last_request
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        
        self.last_request = time.time()
        return self.app.scrape_url(url)

Caching Results

import hashlib
import time

from rapidcrawl import RapidCrawlApp

class CachedScraper:
    def __init__(self):
        self.app = RapidCrawlApp()
        self.cache = {}
    
    def scrape_with_cache(self, url, max_age_hours=24):
        cache_key = hashlib.md5(url.encode()).hexdigest()
        
        if cache_key in self.cache:
            cached_time, cached_result = self.cache[cache_key]
            age_hours = (time.time() - cached_time) / 3600
            if age_hours < max_age_hours:
                return cached_result
        
        result = self.app.scrape_url(url)
        self.cache[cache_key] = (time.time(), result)
        return result

Error Handling

import time

from rapidcrawl import RapidCrawlApp
from rapidcrawl.exceptions import (
    RateLimitError,
    TimeoutError,  # shadows the built-in TimeoutError within this module
    NetworkError
)

def robust_scrape(url, max_retries=3):
    app = RapidCrawlApp()
    
    for attempt in range(max_retries):
        try:
            return app.scrape_url(url)
        except RateLimitError as e:
            wait_time = e.retry_after or 60
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
        except TimeoutError:
            print(f"Timeout on attempt {attempt + 1}")
            if attempt == max_retries - 1:
                raise
        except NetworkError as e:
            print(f"Network error: {e}")
            time.sleep(2 ** attempt)  # Exponential backoff

    # Every attempt failed without returning or re-raising
    raise RuntimeError(f"Failed to scrape {url} after {max_retries} attempts")
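
A plain 2 ** attempt delay grows without bound and makes every client retry at the same instants. A common refinement is to cap the delay and add random jitter. The sketch below is one way to do that; backoff_delay and its parameters are hypothetical and are not part of RapidCrawl's built-in retry logic.

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0, jitter=0.5, rng=random):
    """Hypothetical helper: seconds to wait before retry `attempt` (0-based)."""
    # Exponential growth, limited to `cap` seconds
    delay = min(cap, base * (2 ** attempt))
    # Add up to `jitter` * delay of random extra time so clients spread out
    return delay + rng.uniform(0, jitter * delay)

for attempt in range(5):
    print(f"attempt {attempt}: wait about {backoff_delay(attempt):.2f}s")
```

In robust_scrape, time.sleep(backoff_delay(attempt)) could stand in for time.sleep(2 ** attempt).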

Concurrent Scraping

from concurrent.futures import ThreadPoolExecutor, as_completed

from rapidcrawl import RapidCrawlApp

def concurrent_scrape(urls, max_workers=5):
    app = RapidCrawlApp()
    results = {}
    
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {
            executor.submit(app.scrape_url, url): url 
            for url in urls
        }
        
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as e:
                results[url] = {"error": str(e)}
    
    return results

For more advanced patterns, see the Advanced Usage Guide.

⚡ Performance

Benchmarks

Operation	URLs	Time	Speed
Sequential Scraping	10	12.3s	0.8 pages/sec
Concurrent Scraping	10	3.1s	3.2 pages/sec
Async Crawling	100	28.5s	3.5 pages/sec
URL Mapping	1000	5.2s	192 URLs/sec
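
The Speed column is the URL count divided by the elapsed time. The short calculation below reproduces it from the other two columns and also derives the speedup of concurrent over sequential scraping, which the table implies but does not state.

```python
# (operation, URL count, elapsed seconds) taken from the benchmark table
benchmarks = [
    ("Sequential Scraping", 10, 12.3),
    ("Concurrent Scraping", 10, 3.1),
    ("Async Crawling", 100, 28.5),
    ("URL Mapping", 1000, 5.2),
]

def throughput(count, seconds):
    """Items processed per second."""
    return count / seconds

for name, count, seconds in benchmarks:
    print(f"{name}: {throughput(count, seconds):.1f} per second")

# Same 10 URLs, so the speedup is the ratio of elapsed times
speedup = 12.3 / 3.1
print(f"Concurrent vs sequential speedup: {speedup:.1f}x")
```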

Optimization Tips

Use Async Operations: For crawling large sites, use crawl_url_async()
Enable Connection Pooling: Reuse HTTP connections
Limit Concurrent Requests: Prevent overwhelming servers
Cache Results: Avoid re-scraping unchanged content
Use Specific Formats: Only request needed output formats
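
The "Limit Concurrent Requests" tip is commonly implemented with a semaphore. The sketch below shows the pattern with asyncio only. bounded_gather and fake_fetch are hypothetical names, and fake_fetch stands in for a real call such as scrape_url_async; for crawls, the max_concurrency option shown earlier may already cover this.

```python
import asyncio

async def bounded_gather(coro_factories, max_concurrency=5):
    """Hypothetical helper: run coroutines with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(factory):
        # Each task waits here until one of the slots is free
        async with semaphore:
            return await factory()

    # gather preserves the input order in its result list
    return await asyncio.gather(*(run(f) for f in coro_factories))

async def fake_fetch(url):
    # Stand-in for a real network request
    await asyncio.sleep(0.01)
    return f"fetched {url}"

urls = [f"https://example.com/page/{i}" for i in range(10)]
results = asyncio.run(
    bounded_gather([lambda u=u: fake_fetch(u) for u in urls], max_concurrency=3)
)
print(len(results), results[0])
```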

🔍 Troubleshooting

Common Issues

Installation Problems

# Update pip
python -m pip install --upgrade pip

# Install in virtual environment
python -m venv venv
source venv/bin/activate
pip install rapid-crawl

Playwright Issues

# Install browser dependencies
playwright install-deps chromium

# Or use Firefox
playwright install firefox

SSL Certificate Errors

# For self-signed certificates (development only!)
app = RapidCrawlApp(verify_ssl=False)

Rate Limiting

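# Handle rate limits gracefully (assumes `app` and `url` are already defined)
import time

from rapidcrawl.exceptions import RateLimitError

try: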
    result = app.scrape_url(url)
except RateLimitError as e:
    time.sleep(e.retry_after or 60)
    result = app.scrape_url(url)

For detailed troubleshooting, see the Troubleshooting Guide.

🛠️ Development

Setting up development environment

# Clone the repository
git clone https://github.com/aoneahsan/rapid-crawl.git
cd rapid-crawl

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Running tests

# Run all tests
pytest

# Run with coverage
pytest --cov=rapidcrawl

# Run specific test file
pytest tests/test_scrape.py

Code formatting

# Format code
black src/rapidcrawl

# Run linter
ruff check src/rapidcrawl

# Type checking
mypy src/rapidcrawl

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Quick Start

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

Development Guidelines

Write tests for new features
Follow PEP 8 style guide
Update documentation
Add type hints
Run tests before submitting

🔒 Security

Security is important to us. Please see our Security Policy for details on:

Reporting vulnerabilities
Security best practices
API key management
Data privacy

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

💬 Support

Documentation

Community

Resources

👨‍💻 Developer

Ahsan Mahmood

🌐 Website: https://aoneahsan.com
📧 Email: aoneahsan@gmail.com
💼 LinkedIn: https://linkedin.com/in/aoneahsan
🐦 Twitter: @aoneahsan

🙏 Acknowledgments

Inspired by Firecrawl
Built with Playwright for dynamic content handling
Uses BeautifulSoup for HTML parsing
Click for the CLI interface
Rich for beautiful terminal output

RapidCrawl - Fast, reliable web scraping for Python
Made with ❤️ by Ahsan Mahmood

⭐ Star us on GitHub • 📦 Install from PyPI • 🐛 Report a Bug

Release history

0.2.0 (this version): May 25, 2026
0.1.0: Jul 11, 2025

File details

Details for the file rapid_crawl-0.2.0.tar.gz.

File metadata

Download URL: rapid_crawl-0.2.0.tar.gz
Upload date: May 25, 2026
Size: 108.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for rapid_crawl-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`afd0b2f3521c6bfbc231cfa8059a33cfa2ceb8593b79cf01cd3efdd96bd76ba7`
MD5	`b6db8b9cd96871aa3a2be67878da4dca`
BLAKE2b-256	`54e58894c11651aafcfa845fbef2c9f0ab54da72877b242c7c37f561e232f25a`

File details

Details for the file rapid_crawl-0.2.0-py3-none-any.whl.

File metadata

Download URL: rapid_crawl-0.2.0-py3-none-any.whl
Upload date: May 25, 2026
Size: 51.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for rapid_crawl-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8b40abe5db4d1aead854028b1791b9e76cc8ca5e09c5cf09a3533dc6fbc24aef`
MD5	`f53bb5d25e38fa9d95c4dcdb3dd9cc55`
BLAKE2b-256	`6b5c9003b3baaf8ee0c9be9df4b8c34c4d35be77827ec5c62d1b747f70aa8782`
