Official Python SDK for the AlterLab Web Scraping API - Extract data from any website with intelligent anti-bot bypass

AlterLab Python SDK

Official Python SDK for the AlterLab Web Scraping API. Extract data from any website with intelligent anti-bot bypass, JavaScript rendering, and structured extraction.

Features

Simple API: 3 lines of code to scrape any website
Intelligent Anti-Bot Bypass: Automatic tier escalation (curl → HTTP → stealth → browser)
JavaScript Rendering: Full Playwright browser for JS-heavy sites
Structured Extraction: JSON Schema, prompts, and pre-built profiles
BYOP Support: Bring Your Own Proxy for a 20% discount
Async Support: Native asyncio for concurrent scraping
Type Hints: Full typing support for IDE autocomplete
Cost Controls: Set budgets, prefer cost/speed, fail-fast options

Installation

pip install alterlab

Quick Start

from alterlab import AlterLab

# Initialize client
client = AlterLab(api_key="sk_live_...")  # or set ALTERLAB_API_KEY env var

# Scrape a website
result = client.scrape("https://example.com")
print(result.text)          # Extracted text
print(result.json)          # Structured JSON (Schema.org, metadata)
print(result.billing.cost_dollars)  # Cost breakdown

Pricing

Pay-as-you-go pricing with no subscriptions. $1 buys 5,000 Tier 1 scrapes.

Tier	Name	Price per scrape	Scrapes per $1	Use Case
1	Curl	$0.0002	5,000	Static HTML sites
2	HTTP	$0.0003	3,333	Sites with TLS fingerprinting
3	Stealth	$0.0005	2,000	Sites with browser checks
4	Browser	$0.001	1,000	JS-heavy SPAs
5	Captcha	$0.02	50	Sites with CAPTCHAs

The API automatically escalates through tiers until successful, charging only for the tier used.
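The arithmetic behind the table, and the "charge only for the tier used" rule, can be sketched in a few lines. This is a local model rather than SDK code: `TIER_PRICES`, `scrapes_per_dollar`, and `escalate` are hypothetical names, and `succeeds_at` stands in for the real outcome of each attempt.

```python
from decimal import Decimal

# Per-scrape prices copied from the pricing table above.
TIER_PRICES = {
    1: Decimal("0.0002"),  # Curl
    2: Decimal("0.0003"),  # HTTP
    3: Decimal("0.0005"),  # Stealth
    4: Decimal("0.001"),   # Browser
    5: Decimal("0.02"),    # Captcha
}


def scrapes_per_dollar(tier: int) -> int:
    """Whole number of scrapes that $1 buys at the given tier."""
    return int(Decimal(1) / TIER_PRICES[tier])


def escalate(succeeds_at: int) -> tuple:
    """Walk tiers in ascending order and bill only the first one that succeeds.

    `succeeds_at` is a hypothetical stand-in for the result of each attempt.
    """
    for tier in sorted(TIER_PRICES):
        if tier >= succeeds_at:
            return tier, TIER_PRICES[tier]
    raise RuntimeError("no tier succeeded")


print({tier: scrapes_per_dollar(tier) for tier in TIER_PRICES})
print(escalate(succeeds_at=3))  # tiers 1 and 2 fail, tier 3 is billed
```

`Decimal` is used instead of `float` so that the per-dollar counts round down exactly as in the table.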

Usage Examples

Basic Scraping

from alterlab import AlterLab

client = AlterLab(api_key="sk_live_...")

# Auto mode - intelligent tier escalation
result = client.scrape("https://example.com")

# Force HTML-only (fastest, cheapest)
result = client.scrape_html("https://example.com")

# JavaScript rendering
result = client.scrape_js("https://spa-app.com", screenshot=True)
print(result.screenshot_url)

Structured Extraction

# Extract specific fields with JSON Schema
result = client.scrape(
    "https://store.com/product/123",
    extraction_schema={
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "price": {"type": "number"},
            "in_stock": {"type": "boolean"}
        }
    }
)
print(result.json)  # e.g. {"name": "...", "price": 29.99, "in_stock": True}

# Or use a pre-built profile
result = client.scrape(
    "https://store.com/product/123",
    extraction_profile="product"
)
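Extracted data is only as reliable as the page it came from, so it can be worth checking the result against the schema before using it. The helper below is a minimal sketch, not part of the SDK: `type_errors` and `JSON_TYPES` are hypothetical names, only the three types from the example schema are handled, and a missing field is treated as a problem (an assumption, since JSON Schema properties are optional unless listed under `required`).

```python
# Map JSON Schema type names to Python types (only the few used above).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def type_errors(data: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means the data fits."""
    problems = []
    for field, spec in schema.get("properties", {}).items():
        if field not in data:
            problems.append(f"{field}: missing")
            continue
        value = data[field]
        expected = JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for "number".
        bool_as_number = spec["type"] == "number" and isinstance(value, bool)
        if bool_as_number or not isinstance(value, expected):
            problems.append(f"{field}: expected {spec['type']}")
    return problems


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
}

print(type_errors({"name": "Widget", "price": 29.99, "in_stock": True}, schema))
print(type_errors({"name": "Widget", "price": "29.99"}, schema))
```

For anything beyond flat objects, a full JSON Schema validator is a better fit than this hand-rolled check.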

Cost Controls

from alterlab import AlterLab, CostControls

client = AlterLab(api_key="sk_live_...")

# Limit to cheap tiers only
result = client.scrape(
    "https://example.com",
    cost_controls=CostControls(
        max_tier="2",       # Don't go above HTTP tier
        prefer_cost=True,   # Optimize for lowest cost
        fail_fast=True      # Error instead of escalating
    )
)

# Estimate cost before scraping
estimate = client.estimate_cost("https://linkedin.com")
print(f"Estimated: ${estimate.estimated_cost_dollars:.4f}")
print(f"Confidence: {estimate.confidence}")
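To reason about what a tier cap means for a budget, the limits can be modelled locally. The sketch below is an interpretation of the comments in the example above, not the service's actual logic: `tiers_to_try` and `worst_case_cost` are hypothetical helpers, and `fail_fast` is assumed to mean "try one tier, then stop".

```python
from decimal import Decimal
from typing import List

# Per-scrape prices from the pricing table.
TIER_PRICES = {
    1: Decimal("0.0002"),
    2: Decimal("0.0003"),
    3: Decimal("0.0005"),
    4: Decimal("0.001"),
    5: Decimal("0.02"),
}


def tiers_to_try(max_tier: str = "5", fail_fast: bool = False) -> List[int]:
    """List the tiers an escalation could visit under the given limits."""
    cap = int(max_tier)  # the example above passes max_tier as a string
    allowed = [tier for tier in sorted(TIER_PRICES) if tier <= cap]
    # Assumed reading of fail_fast: attempt a single tier, never escalate.
    return allowed[:1] if fail_fast else allowed


def worst_case_cost(max_tier: str = "5", fail_fast: bool = False) -> Decimal:
    """Upper bound for one scrape, since only the successful tier is billed."""
    return TIER_PRICES[tiers_to_try(max_tier, fail_fast)[-1]]


print(tiers_to_try("2"), worst_case_cost("2"))
print(tiers_to_try("2", fail_fast=True), worst_case_cost("2", fail_fast=True))
```

Under these assumptions, capping at tier 2 bounds a single scrape at the HTTP tier price, which makes it easy to size a batch against a fixed budget.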

Advanced Options

from alterlab import AlterLab, AdvancedOptions

client = AlterLab(api_key="sk_live_...")

# Full browser with screenshot and PDF
result = client.scrape(
    "https://example.com",
    mode="js",
    advanced=AdvancedOptions(
        render_js=True,
        screenshot=True,
        generate_pdf=True,
        markdown=True,
        wait_condition="networkidle"
    )
)

print(result.screenshot_url)
print(result.pdf_url)
print(result.markdown_content)

BYOP (Bring Your Own Proxy)

Get a 20% discount when using your own proxy:

from alterlab import AlterLab, AdvancedOptions

client = AlterLab(api_key="sk_live_...")

# Use your configured proxy integration
result = client.scrape(
    "https://example.com",
    advanced=AdvancedOptions(
        use_own_proxy=True,
        proxy_country="US"  # Optional: request specific geo
    )
)

# Check if BYOP was applied
if result.billing.byop_applied:
    print(f"Saved {result.billing.byop_discount_percent}%!")
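As a rough guide to what the discount is worth, the following sketch applies 20% off a tier's list price. It assumes the discount is a straight multiplier on the per-scrape price, which this README does not spell out; `byop_price` and `byop_savings` are hypothetical helpers.

```python
from decimal import Decimal

BYOP_DISCOUNT = Decimal("0.20")  # the 20% figure quoted above


def byop_price(list_price: Decimal) -> Decimal:
    """Per-scrape price after the BYOP discount (assumed to be multiplicative)."""
    return list_price * (1 - BYOP_DISCOUNT)


def byop_savings(list_price: Decimal, scrapes: int) -> Decimal:
    """Total amount saved over a batch of scrapes at one tier."""
    return (list_price - byop_price(list_price)) * scrapes


browser_tier = Decimal("0.001")  # Tier 4 price from the pricing table
print(byop_price(browser_tier))
print(byop_savings(browser_tier, scrapes=10_000))
```

The authoritative figure for any given request is `result.billing`, as shown in the example above.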

Async Support

import asyncio
from alterlab import AsyncAlterLab

async def main():
    async with AsyncAlterLab(api_key="sk_live_...") as client:
        # Single request
        result = await client.scrape("https://example.com")

        # Concurrent requests
        urls = [
            "https://example.com/page1",
            "https://example.com/page2",
            "https://example.com/page3",
        ]
        results = await asyncio.gather(*[client.scrape(url) for url in urls])

        for r in results:
            print(r.title, r.billing.cost_dollars)

asyncio.run(main())
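`asyncio.gather` starts every request at once, which can trip rate limits on large URL lists. One common pattern is to cap concurrency with a semaphore. The sketch below uses a hypothetical `fake_scrape` coroutine in place of `client.scrape` so it runs without network access; `gather_limited` is an illustrative helper, not an SDK function.

```python
import asyncio


async def gather_limited(coro_fn, items, limit=5):
    """Run coro_fn(item) for every item, at most `limit` at a time, preserving order."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(item):
        async with semaphore:
            return await coro_fn(item)

    return await asyncio.gather(*(run_one(item) for item in items))


async def fake_scrape(url):
    """Hypothetical stand-in for `client.scrape(url)`; no network involved."""
    await asyncio.sleep(0.01)
    return f"scraped {url}"


async def main():
    urls = [f"https://example.com/page{i}" for i in range(1, 7)]
    return await gather_limited(fake_scrape, urls, limit=2)


results = asyncio.run(main())
print(results)
```

With the real client, `coro_fn` would be `client.scrape` inside the `async with AsyncAlterLab(...)` block.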

Caching

# Enable caching (opt-in)
result = client.scrape(
    "https://example.com",
    cache=True,          # Enable caching
    cache_ttl=3600,      # Cache for 1 hour
)

if result.cached:
    print("Cache hit - no credits charged!")

# Force refresh
result = client.scrape(
    "https://example.com",
    cache=True,
    force_refresh=True   # Bypass cache
)
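The interplay of `cache_ttl` and `force_refresh` is easier to see in a small local model. The class below is an illustration of TTL cache semantics in general, not a description of how the AlterLab service caches; `TTLCache` and its `fetch` method are hypothetical, and the clock is injectable so that expiry can be shown without waiting.

```python
import time


class TTLCache:
    """Tiny in-memory cache keyed by URL, for illustration only."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # url -> (expires_at, value)

    def fetch(self, url, loader, ttl=3600, force_refresh=False):
        """Return (value, cached). `loader(url)` runs only on a miss or a refresh."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and not force_refresh and entry[0] > now:
            return entry[1], True
        value = loader(url)
        self._entries[url] = (now + ttl, value)
        return value, False


fake_now = [0.0]  # mutable cell so the lambda below sees updates
cache = TTLCache(clock=lambda: fake_now[0])


def loader(url):
    return f"body of {url}"


print(cache.fetch("https://example.com", loader, ttl=60))  # first call: miss
print(cache.fetch("https://example.com", loader, ttl=60))  # within TTL: hit
fake_now[0] = 61.0
print(cache.fetch("https://example.com", loader, ttl=60))  # TTL elapsed: miss
```

In this model `force_refresh=True` skips the lookup but still stores the fresh value, so later calls can hit the cache again.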

PDF and Image Extraction

# Extract text from PDF
result = client.scrape_pdf(
    "https://example.com/document.pdf",
    format="markdown"
)
print(result.text)

# OCR for images
result = client.scrape_ocr(
    "https://example.com/image.png",
    language="eng"
)
print(result.text)

Error Handling

from alterlab import (
    AlterLab,
    AuthenticationError,
    InsufficientCreditsError,
    RateLimitError,
    ScrapeError,
    TimeoutError
)

client = AlterLab(api_key="sk_live_...")

try:
    result = client.scrape("https://example.com")
except AuthenticationError:
    print("Invalid API key")
except InsufficientCreditsError:
    print("Please top up your balance")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except ScrapeError as e:
    print(f"Scraping failed: {e.message}")
except TimeoutError:
    print("Request timed out")
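When a rate limit is hit, the usual response is to wait and try again. The sketch below shows one way an application-level retry loop could honor `retry_after` and otherwise back off exponentially. It defines a local `RateLimitError` stand-in so it runs without the SDK; `call_with_retry` is a hypothetical helper, and none of this describes the client's internal retry behavior.

```python
import time


class RateLimitError(Exception):
    """Local stand-in mirroring the SDK exception's `retry_after` attribute."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(fn, max_retries=3, retry_delay=1.0, sleep=time.sleep):
    """Call fn(); on RateLimitError wait and retry, up to `max_retries` retries."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError as exc:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Prefer the server's hint; otherwise double the delay each attempt.
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = retry_delay * 2 ** attempt
            sleep(delay)


attempts = {"count": 0}


def flaky():
    """Fails twice with a rate limit, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError()
    return "ok"


waits = []
print(call_with_retry(flaky, sleep=waits.append), waits)
```

Passing `sleep=waits.append` records the delays instead of sleeping, which keeps the demonstration instant; in real use the default `time.sleep` applies.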

Check Usage & Balance

usage = client.get_usage()
print(f"Balance: ${usage.balance_dollars:.2f}")
print(f"Used this month: {usage.credits_used_month} credits")

API Reference

AlterLab Client

AlterLab(
    api_key: str = None,           # API key (or ALTERLAB_API_KEY env var)
    base_url: str = None,          # Custom API URL
    timeout: int = 120,            # Request timeout in seconds
    max_retries: int = 3,          # Retry count for transient failures
    retry_delay: float = 1.0       # Initial retry delay (exponential backoff)
)

scrape() Method

client.scrape(
    url: str,                             # URL to scrape
    mode: str = "auto",                   # "auto", "html", "js", "pdf", "ocr"
    sync: bool = True,                    # Wait for result vs return job ID
    advanced: AdvancedOptions = None,     # Advanced scraping options (optional)
    cost_controls: CostControls = None,   # Budget and optimization settings (optional)
    cache: bool = False,                  # Enable response caching
    cache_ttl: int = None,                # Cache TTL in seconds (60-86400)
    formats: list = None,                 # Output formats: ["text", "json", "html", "markdown"]
    extraction_schema: dict = None,       # JSON Schema for structured extraction (optional)
    extraction_prompt: str = None,        # Natural language extraction instructions (optional)
    extraction_profile: str = None,       # Pre-built profile: "product", "article", etc. (optional)
    wait_for: str = None,                 # CSS selector to wait for (JS mode)
    screenshot: bool = False,             # Capture screenshot (JS mode)
) -> ScrapeResult

ScrapeResult

result.url                # Scraped URL
result.status_code        # HTTP status
result.text               # Extracted text content
result.html               # HTML content
result.json               # Structured JSON content
result.title              # Page title
result.author             # Author (if detected)
result.billing            # BillingDetails object
result.billing.tier_used  # Tier that succeeded
result.billing.cost_dollars  # Final cost in USD
result.screenshot_url     # Screenshot URL (if requested)
result.pdf_url            # PDF URL (if requested)
result.cached             # Whether result was from cache

Environment Variables

Variable	Description
`ALTERLAB_API_KEY`	Your API key (alternative to passing in constructor)
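Reading the key from the environment keeps it out of source control. The lookup can be sketched as below. `resolve_api_key` is a hypothetical helper, and the precedence it uses (explicit argument first, then `ALTERLAB_API_KEY`) is an assumption about how such a fallback typically works, not confirmed SDK behavior.

```python
import os


def resolve_api_key(explicit=None, environ=os.environ):
    """Prefer an explicit key, then ALTERLAB_API_KEY; fail loudly if neither is set."""
    key = explicit or environ.get("ALTERLAB_API_KEY")
    if not key:
        raise ValueError("Pass api_key or set the ALTERLAB_API_KEY environment variable")
    return key


# A plain dict stands in for the process environment in this demonstration.
print(resolve_api_key(environ={"ALTERLAB_API_KEY": "sk_live_from_env"}))
print(resolve_api_key("sk_live_explicit", environ={"ALTERLAB_API_KEY": "sk_live_from_env"}))
```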

Requirements

Python 3.8+
httpx >= 0.24.0

Support

Documentation: https://alterlab.io/docs
API Status: https://status.alterlab.io
Support: support@alterlab.io
Issues: GitHub Issues

License

MIT License - see LICENSE for details.

Release history

2.2.0: May 7, 2026
2.1.1: May 7, 2026
2.1.0: Mar 29, 2026
2.0.1: Jan 26, 2026
2.0.0: Jan 16, 2026 (this version)

Download files

Source Distribution

alterlab-2.0.0.tar.gz (15.7 kB)

Uploaded Jan 16, 2026 Source

Built Distribution

alterlab-2.0.0-py3-none-any.whl (14.2 kB)

Uploaded Jan 16, 2026 Python 3

File details

Details for the file alterlab-2.0.0.tar.gz.

File metadata

Download URL: alterlab-2.0.0.tar.gz
Upload date: Jan 16, 2026
Size: 15.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.2.1 CPython/3.12.3 Linux/5.15.153.1-microsoft-standard-WSL2

File hashes

Hashes for alterlab-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`7b189005eb89596d4614278c06f4cdecaf30bec09f53cecfae947c0de02f5865`
MD5	`8b71466fc789415716cb6e2e9d378c5a`
BLAKE2b-256	`375cdb2fc88a1a57d2f28932544976683869532da49b2f2895045f7d384e07fd`

File details

Details for the file alterlab-2.0.0-py3-none-any.whl.

File metadata

Download URL: alterlab-2.0.0-py3-none-any.whl
Upload date: Jan 16, 2026
Size: 14.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.2.1 CPython/3.12.3 Linux/5.15.153.1-microsoft-standard-WSL2

File hashes

Hashes for alterlab-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`701e72e0d183c4f1f5945bdbdbd1a088ee3658edb5ba72faa62c2250fca94c63`
MD5	`aad69d2d707d66b5e52b21ef43135c11`
BLAKE2b-256	`3c0e02a12dd46eaaa95b63f28c3e9e45cc163278730c36ce67d3a89313337b85`
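A downloaded file can be checked against the SHA256 digests listed above with the standard library alone. The sketch below uses hypothetical helper names (`sha256_of`, `matches`) and demonstrates the check on a throwaway temporary file rather than a real distribution.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hex SHA256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True when the file's digest equals the published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a temporary file; a real check would point at the downloaded
# archive and use the SHA256 value from the table above.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as handle:
        handle.write(b"abc")
    print(matches(sample, hashlib.sha256(b"abc").hexdigest()))
```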
