Python SDK for the MrScraper web-scraping API

MrScraper Python SDK

A clean, typed Python client for the MrScraper web-scraping API. Every method is a coroutine, so the client is used with async/await.


Installation

pip install mrscraper-sdk

Requires Python 3.9+.


Authentication

Every client is initialised with your MrScraper API token. Get yours at https://app.mrscraper.com.

from mrscraper import MrScraper

client = MrScraper(token="atk_your_token_here")
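
To keep the token out of source control, it can be read from the environment instead of being hard-coded. The sketch below is one way to do that; the helper name `load_token` and the variable name `MRSCRAPER_API_TOKEN` are hypothetical conventions chosen for illustration, not something the SDK is documented to read.

```python
import os


def load_token(env_var: str = "MRSCRAPER_API_TOKEN") -> str:
    """Return the API token stored in `env_var`, or raise if it is unset.

    The variable name is an assumption made for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your MrScraper token"
        )
    return token


# Usage (requires the SDK to be installed):
#   from mrscraper import MrScraper
#   client = MrScraper(token=load_token())
```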

Quick Start

Fetch raw HTML (stealth browser)

import asyncio
from mrscraper import MrScraper

async def main():
    client = MrScraper(token="atk_your_token_here")

    result = await client.fetch_html(
        "https://stockx.com/air-jordan-1-retro-low-og-chicago-2025",
        geo_code="US",
        timeout=120,
        block_resources=False,
    )
    print(result["data"])   # raw HTML string

asyncio.run(main())
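
Because the methods are coroutines, several pages can be fetched concurrently. The following is a minimal sketch of a bounded-concurrency helper; `fetch_many` and its `max_concurrency` default are hypothetical, and the README does not say whether one client may be shared across concurrent calls, so treat that as an assumption to verify.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def fetch_many(
    fetch: Callable[[str], Awaitable[Any]],
    urls: list[str],
    max_concurrency: int = 5,
) -> list[Any]:
    """Call `fetch(url)` for every URL, at most `max_concurrency` at a time.

    Results come back in the same order as `urls`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> Any:
        # The semaphore caps how many fetches are in flight at once.
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(*(fetch_one(url) for url in urls))
```

With the SDK this might be called as `await fetch_many(lambda u: client.fetch_html(u, geo_code="US"), urls)`.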

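Create an AI scraper

This snippet and the ones that follow must run inside an async function, with client created as in the previous example.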

result = await client.create_scraper(
    url="https://example.com/products",
    message="Extract all product names, prices, and ratings",
    agent="listing",          # "general" | "listing" | "map"
    proxy_country="US",
)
scraper_id = result["data"]["data"]["id"]
print("Scraper ID:", scraper_id)

Rerun a scraper on a new URL

result = await client.rerun_scraper(
    scraper_id=scraper_id,
    url="https://example.com/products?page=2",
)

Bulk rerun on multiple URLs (AI scraper)

result = await client.bulk_rerun_scraper(
    scraper_id=scraper_id,
    urls=[
        "https://example.com/products/item1",
        "https://example.com/products/item2",
        "https://example.com/products/item3",
    ],
)

Rerun a manually configured scraper

result = await client.rerun_manual_scraper(
    scraper_id="manual_scraper_67890",
    url="https://example.com/products/new-item",
)

Bulk rerun manual scraper on multiple URLs

result = await client.bulk_rerun_manual_scraper(
    scraper_id="scraper_12345",
    urls=[
        "https://www.example.com/products/item1",
        "https://www.example.com/products/item2",
        "https://www.example.com/products/item3",
    ],
)

Retrieve results

# All results (paginated)
page = await client.get_all_results(
    sort_field="updatedAt",
    sort_order="DESC",
    page_size=20,
    page=1,
    search="product",
    date_range_column="updatedAt",
    start_at="2024-01-01",
    end_at="2024-01-31",
)
print(page["data"])

# A specific result by ID
result = await client.get_result_by_id("result_12345")
print(result["data"])
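
To walk through every page rather than a single one, the `page` and `page_size` arguments can be driven from a loop. The sketch below assumes the listing ends when a page holds fewer than `page_size` items. The README does not document the layout of `page["data"]`, so the caller supplies an `extract_items` function, and `iter_results` itself is a hypothetical helper.

```python
from typing import Any, AsyncIterator, Awaitable, Callable


async def iter_results(
    get_page: Callable[..., Awaitable[dict]],
    extract_items: Callable[[Any], list],
    page_size: int = 20,
    max_pages: int = 100,
) -> AsyncIterator[Any]:
    """Yield items page by page until a short or empty page is returned."""
    for page in range(1, max_pages + 1):
        response = await get_page(page=page, page_size=page_size)
        # How items are nested inside "data" is an assumption left to the caller.
        items = extract_items(response["data"])
        for item in items:
            yield item
        if len(items) < page_size:
            break
```

With the SDK, `get_page` would be `client.get_all_results` (or a `functools.partial` of it carrying the sort and filter arguments), consumed with `async for item in iter_results(...)`.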

API Reference

MrScraper

All methods are coroutines and must be awaited.

| Method | Description |
| --- | --- |
| fetch_html(url, *, timeout, geo_code, block_resources) | Fetch rendered HTML via the MrScraper stealth browser |
| create_scraper(url, message, *, agent, proxy_country, ...) | Create & run an AI-powered scraper |
| rerun_scraper(scraper_id, url, *, max_depth, max_pages, limit, ...) | Rerun an AI scraper on a new URL |
| bulk_rerun_scraper(scraper_id, urls) | Rerun an AI scraper on multiple URLs in one batch |
| rerun_manual_scraper(scraper_id, url) | Rerun a manually configured scraper on a single URL |
| bulk_rerun_manual_scraper(scraper_id, urls) | Rerun a manual scraper on multiple URLs in one batch |
| get_all_results(*, sort_field, sort_order, page_size, page, search, ...) | List all results with filtering & pagination |
| get_result_by_id(result_id) | Fetch a single result by its ID |

All methods return a dict with the following keys:

| Key | Type | Description |
| --- | --- | --- |
| status_code | int | HTTP status code |
| data | Any | Parsed JSON body or raw text |
| headers | dict | Response headers |
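
For type checkers, the three keys listed above can be written down as a `TypedDict`. This is only a sketch: the name `SDKResponse` is hypothetical, the SDK may already export its own type, and the value type of `headers` is assumed.

```python
from typing import Any, Dict, TypedDict


class SDKResponse(TypedDict):
    """Shape of the dict returned by the client methods (name is hypothetical)."""

    status_code: int
    data: Any  # parsed JSON body or raw text
    headers: Dict[str, Any]  # value type is an assumption


def is_success(response: SDKResponse) -> bool:
    """Return True when the HTTP status code is in the 2xx range."""
    return 200 <= response["status_code"] < 300
```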

bulk_rerun_manual_scraper

Reruns a manually configured scraper on multiple URLs in a single batch operation. This is more efficient than calling rerun_manual_scraper once per URL, because all URLs are submitted together and processed in parallel. The call returns information about the batch job; per-URL results are retrieved separately. Ideal for scraping multiple pages, products, or articles with the same extraction logic.

| Argument | Description |
| --- | --- |
| scraper_id | The ID of the manual scraper to rerun (obtained from the MrScraper dashboard). Must be a scraper created manually through the web interface, not an AI scraper. Find it at https://app.mrscraper.com |
| urls | A list of target URLs to scrape (required, must contain at least one URL). Each URL will be processed independently using the scraper's extraction logic. Example: ["https://example.com/page1", "https://example.com/page2"] |

Returns: A dict with status_code, data, and headers. Here data holds the bulk job info (job ID, status, metadata); use get_all_results or get_result_by_id to fetch the per-URL results.

Example:

result = await client.bulk_rerun_manual_scraper(
    scraper_id="scraper_12345",
    urls=[
        "https://www.example.com/products/item1",
        "https://www.example.com/products/item2",
        "https://www.example.com/products/item3",
    ],
)
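
Since urls must contain at least one URL, it can help to clean the list before submitting a batch. The sketch below is a hypothetical pre-flight helper (`prepare_urls` is not part of the SDK) that removes duplicates and rejects anything that is not an absolute http(s) URL; what the API itself validates is not stated in this README.

```python
from urllib.parse import urlparse


def prepare_urls(urls):
    """Drop duplicates (keeping order) and reject non-http(s) URLs."""
    seen = set()
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {raw!r}")
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    if not cleaned:
        raise ValueError("urls must contain at least one URL")
    return cleaned
```

The cleaned list can then be passed as `urls=prepare_urls(candidates)`.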

create_scraper — agent types

| Agent | Best used for |
| --- | --- |
| "general" | Default; handles almost any page |
| "listing" | Product listings, job boards, search results |
| "map" | Crawling all sub-pages / sitemaps of a site |

The max_depth, max_pages, limit, include_patterns, and exclude_patterns parameters are only meaningful when agent="map".
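
One way to catch a crawl option passed with the wrong agent is to build the keyword arguments through a small checker. The sketch below is a hypothetical helper, not part of the SDK. It raises early on a mismatch, whereas the README does not say whether the SDK ignores or rejects such options.

```python
from typing import Any, Dict

# Parameters the README describes as meaningful only for agent="map".
MAP_ONLY_PARAMS = {
    "max_depth",
    "max_pages",
    "limit",
    "include_patterns",
    "exclude_patterns",
}


def scraper_kwargs(agent: str = "general", **options: Any) -> Dict[str, Any]:
    """Build keyword arguments, rejecting crawl options on non-map agents."""
    if agent not in ("general", "listing", "map"):
        raise ValueError(f"Unknown agent: {agent!r}")
    misplaced = sorted(MAP_ONLY_PARAMS.intersection(options))
    if agent != "map" and misplaced:
        raise ValueError(f"{', '.join(misplaced)} only apply when agent='map'")
    return {"agent": agent, **options}
```

The result could be splatted into a call such as `client.create_scraper(url=..., message=..., **scraper_kwargs("map", max_depth=2))`, assuming create_scraper accepts these options by keyword.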


Exceptions

| Exception | Raised when |
| --- | --- |
| MrScraperError | Base class for all SDK errors |
| AuthenticationError | API token is invalid or missing (HTTP 401) |
| APIError | API returned a non-2xx response; has .status_code attribute |
| NetworkError | Connection timeout or network-level failure |

from mrscraper.exceptions import AuthenticationError, APIError, NetworkError

try:
    result = await client.fetch_html("https://example.com")
except AuthenticationError:
    print("Check your API token at https://app.mrscraper.com")
except APIError as e:
    print(f"API error {e.status_code}: {e}")
except NetworkError as e:
    print(f"Network problem: {e}")
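
Network-level failures are often transient, so a caller may want to retry them. The sketch below shows a generic exponential-backoff wrapper; `with_retries` is a hypothetical helper, and it defaults to the built-in `ConnectionError` only so that it runs without the SDK installed. With the SDK, `retry_on=(NetworkError,)` would be the natural choice.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple, Type


async def with_retries(
    call: Callable[[], Awaitable[Any]],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    attempts: int = 3,
    base_delay: float = 1.0,
) -> Any:
    """Await `call()`, retrying with exponential backoff on `retry_on`."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

For example: `await with_retries(lambda: client.fetch_html("https://example.com"), retry_on=(NetworkError,))`.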

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Lint & format
ruff check .
ruff format .

# Type check
mypy src/mrscraper

License

MIT © MrScraper
