
ScrapeGraphAI Python SDK


Official Python SDK for the ScrapeGraphAI API.

Install

pip install scrapegraph-py
# or
uv add scrapegraph-py

Quick Start

from scrapegraph_py import ScrapeGraphAI, ScrapeRequest

# reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI(api_key="...")
sgai = ScrapeGraphAI()

result = sgai.scrape(ScrapeRequest(
    url="https://example.com",
))

if result.status == "success":
    print(result.data["results"]["markdown"]["data"])
else:
    print(result.error)

Every method returns ApiResult[T] — no exceptions to catch:

@dataclass
class ApiResult(Generic[T]):
    status: Literal["success", "error"]
    data: T | None
    error: str | None
    elapsed_ms: int

API

scrape

Scrape a webpage in one or more formats (markdown, html, screenshot, json, etc.).

from scrapegraph_py import (
    ScrapeGraphAI, ScrapeRequest, FetchConfig,
    MarkdownFormatConfig, ScreenshotFormatConfig, JsonFormatConfig
)

sgai = ScrapeGraphAI()

res = sgai.scrape(ScrapeRequest(
    url="https://example.com",
    formats=[
        MarkdownFormatConfig(mode="reader"),
        ScreenshotFormatConfig(full_page=True, width=1440, height=900),
        JsonFormatConfig(prompt="Extract product info"),
    ],
    content_type="text/html",           # optional, auto-detected
    fetch_config=FetchConfig(           # optional
        mode="js",                      # "auto" | "fast" | "js"
        stealth=True,
        timeout=30000,
        wait=2000,
        scrolls=3,
        headers={"Accept-Language": "en"},
        cookies={"session": "abc"},
        country="us",
    ),
))

Formats:

  • markdown — Clean markdown (modes: normal, reader, prune)
  • html — Raw HTML (modes: normal, reader, prune)
  • links — All links on the page
  • images — All image URLs
  • summary — AI-generated summary
  • json — Structured extraction with prompt/schema
  • branding — Brand colors, typography, logos
  • screenshot — Page screenshot (full_page, width, height, quality)
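When several formats are requested, each lands under its own key in the response. A small accessor sketch, assuming the `results[<format>]["data"]` shape seen in the Quick Start (this helper is illustrative, not part of the SDK):

```python
def format_data(payload: dict, fmt: str):
    """Return one format's data from a scrape payload, or None if absent.

    Assumes the response shape from the Quick Start example:
    payload["results"][<format>]["data"].
    """
    return (payload.get("results") or {}).get(fmt, {}).get("data")
```

For the request above: `md = format_data(res.data, "markdown")`, `shot = format_data(res.data, "screenshot")`.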

extract

Extract structured data from a URL, HTML, or markdown using AI.

from scrapegraph_py import ScrapeGraphAI, ExtractRequest, FetchConfig

sgai = ScrapeGraphAI()

res = sgai.extract(ExtractRequest(
    url="https://example.com",
    prompt="Extract product names and prices",
    schema={"type": "object", "properties": {...}},  # optional
    mode="reader",                                    # optional
    fetch_config=FetchConfig(...),                   # optional
))
# Or pass html/markdown directly instead of url

search

Search the web and optionally extract structured data.

from scrapegraph_py import ScrapeGraphAI, SearchRequest, FetchConfig

sgai = ScrapeGraphAI()

res = sgai.search(SearchRequest(
    query="best programming languages 2024",
    num_results=5,                      # 1-20, default 3
    format="markdown",                  # "markdown" | "html"
    prompt="Extract key points",        # optional, for AI extraction
    schema={...},                       # optional
    time_range="past_week",             # optional
    location_geo_code="us",             # optional
    fetch_config=FetchConfig(...),      # optional
))

crawl

Crawl a website and its linked pages.

from scrapegraph_py import ScrapeGraphAI, CrawlRequest, MarkdownFormatConfig, FetchConfig

sgai = ScrapeGraphAI()

# Start a crawl
start = sgai.crawl.start(CrawlRequest(
    url="https://example.com",
    formats=[MarkdownFormatConfig()],
    max_pages=50,
    max_depth=2,
    max_links_per_page=10,
    include_patterns=["/blog/*"],
    exclude_patterns=["/admin/*"],
    fetch_config=FetchConfig(...),
))

# Check status
crawl_id = start.data["id"]
status = sgai.crawl.get(crawl_id)

# Control a running crawl
sgai.crawl.stop(crawl_id)
sgai.crawl.resume(crawl_id)
sgai.crawl.delete(crawl_id)
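Crawls run asynchronously server-side, so the usual pattern is to poll until the job finishes. A sketch of such a loop; the `status` field inside the payload and its terminal values ("completed", "failed", "stopped") are assumptions, not confirmed by this README:

```python
import time

def wait_for_crawl(get_status, crawl_id, poll_s=2.0, max_polls=150):
    """Poll a crawl until it reaches a terminal state.

    `get_status` is any callable returning an ApiResult-like object,
    e.g. sgai.crawl.get. Returns the final result; raises on timeout.
    """
    for _ in range(max_polls):
        res = get_status(crawl_id)
        if res.status == "error":
            return res
        state = (res.data or {}).get("status")
        if state in ("completed", "failed", "stopped"):  # assumed terminal states
            return res
        time.sleep(poll_s)
    raise TimeoutError(f"crawl {crawl_id} did not finish")
```

Usage: `final = wait_for_crawl(sgai.crawl.get, crawl_id)`.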

monitor

Monitor a webpage for changes on a schedule.

from scrapegraph_py import (
    ScrapeGraphAI, MonitorCreateRequest, MonitorUpdateRequest,
    MarkdownFormatConfig, FetchConfig,
)

sgai = ScrapeGraphAI()

# Create a monitor
mon = sgai.monitor.create(MonitorCreateRequest(
    url="https://example.com",
    name="Price Monitor",
    interval="0 * * * *",               # cron expression
    formats=[MarkdownFormatConfig()],
    webhook_url="https://...",          # optional
    fetch_config=FetchConfig(...),
))

# Manage monitors (assuming the new monitor's id is in mon.data["id"])
cron_id = mon.data["id"]
sgai.monitor.list()
sgai.monitor.get(cron_id)
sgai.monitor.update(cron_id, MonitorUpdateRequest(interval="0 */6 * * *"))
sgai.monitor.pause(cron_id)
sgai.monitor.resume(cron_id)
sgai.monitor.delete(cron_id)
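The `interval` is a standard five-field cron expression (minute, hour, day of month, month, day of week). A few common schedules, plus a cheap sanity check you might run before submitting one (the helper is illustrative, not part of the SDK):

```python
# Common cron schedules for MonitorCreateRequest.interval
SCHEDULES = {
    "every hour":        "0 * * * *",
    "every 6 hours":     "0 */6 * * *",
    "daily at midnight": "0 0 * * *",
    "weekdays at 9am":   "0 9 * * 1-5",
}

def looks_like_cron(expr: str) -> bool:
    """Cheap sanity check: a cron expression has five whitespace-separated fields."""
    return len(expr.split()) == 5
```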

history

Fetch request history.

from scrapegraph_py import ScrapeGraphAI, HistoryFilter

sgai = ScrapeGraphAI()

history = sgai.history.list(HistoryFilter(
    service="scrape",                   # optional filter
    page=1,
    limit=20,
))

entry = sgai.history.get("request-id")
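Since `HistoryFilter` is paginated, walking the full history means looping over pages. A sketch of that loop; the `"entries"` key in the payload is an assumption about the response shape, and `list_page` stands in for something like `lambda p: sgai.history.list(HistoryFilter(page=p, limit=20))`:

```python
def iter_history(list_page, limit=20, max_pages=100):
    """Yield history entries page by page until a short page or an error.

    `list_page` is any callable taking a page number and returning an
    ApiResult-like object; the payload's "entries" key is assumed.
    """
    for page in range(1, max_pages + 1):
        res = list_page(page)
        if res.status != "success":
            break
        entries = (res.data or {}).get("entries", [])
        yield from entries
        if len(entries) < limit:  # short page means we've reached the end
            break
```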

credits / health

from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

credits = sgai.credits()
# credits.data → {"remaining": 1000, "used": 500, "plan": "pro", "jobs": {"crawl": {...}, "monitor": {...}}}

health = sgai.health()
# health.data → {"status": "ok", "uptime": 12345}
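Because both calls return `ApiResult`, a small pre-flight check can gate expensive jobs. A sketch (`has_credits` is not an SDK helper; the `"remaining"` key comes from the payload shown above):

```python
def has_credits(credits_result, minimum: int = 1) -> bool:
    """True if the account still has at least `minimum` credits.

    Expects an ApiResult-like object whose payload carries "remaining".
    """
    if credits_result.status != "success":
        return False
    return (credits_result.data or {}).get("remaining", 0) >= minimum
```

Usage: `if has_credits(sgai.credits(), minimum=10): ...`.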

Async Client

All methods have async equivalents via AsyncScrapeGraphAI:

import asyncio
from scrapegraph_py import AsyncScrapeGraphAI, ScrapeRequest

async def main():
    async with AsyncScrapeGraphAI() as sgai:
        result = await sgai.scrape(ScrapeRequest(url="https://example.com"))
        if result.status == "success":
            print(result.data["results"]["markdown"]["data"])
        else:
            print(result.error)

asyncio.run(main())
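The async client makes it easy to scrape many URLs concurrently. A sketch of a bounded fan-out; `scrape` is duck-typed here (pass `sgai.scrape` from an `AsyncScrapeGraphAI` context), and the semaphore cap is an illustrative choice, not an SDK limit:

```python
import asyncio

async def scrape_many(scrape, requests, concurrency: int = 5):
    """Run several scrape calls concurrently, capped by a semaphore.

    `scrape` is any async callable; `requests` is an iterable of
    request objects. Results come back in input order (asyncio.gather).
    """
    sem = asyncio.Semaphore(concurrency)

    async def one(req):
        async with sem:
            return await scrape(req)

    return await asyncio.gather(*(one(r) for r in requests))
```

Usage: `results = await scrape_many(sgai.scrape, [ScrapeRequest(url=u) for u in urls])`.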

Async Extract

async with AsyncScrapeGraphAI() as sgai:
    res = await sgai.extract(ExtractRequest(
        url="https://example.com",
        prompt="Extract product names and prices",
    ))

Async Search

async with AsyncScrapeGraphAI() as sgai:
    res = await sgai.search(SearchRequest(
        query="best programming languages 2024",
        num_results=5,
    ))

Async Crawl

async with AsyncScrapeGraphAI() as sgai:
    start = await sgai.crawl.start(CrawlRequest(
        url="https://example.com",
        max_pages=50,
    ))
    status = await sgai.crawl.get(start.data["id"])

Async Monitor

async with AsyncScrapeGraphAI() as sgai:
    mon = await sgai.monitor.create(MonitorCreateRequest(
        url="https://example.com",
        name="Price Monitor",
        interval="0 * * * *",
    ))

Examples

Sync Examples

| Service | Example | Description |
|---|---|---|
| scrape | scrape_basic.py | Basic markdown scraping |
| scrape | scrape_multi_format.py | Multiple formats |
| scrape | scrape_json_extraction.py | Structured JSON extraction |
| scrape | scrape_pdf.py | PDF document parsing |
| scrape | scrape_with_fetchconfig.py | JS rendering, stealth mode |
| extract | extract_basic.py | AI data extraction |
| extract | extract_with_schema.py | Extraction with JSON schema |
| search | search_basic.py | Web search |
| search | search_with_extraction.py | Search + AI extraction |
| crawl | crawl_basic.py | Start and monitor a crawl |
| crawl | crawl_with_formats.py | Crawl with formats |
| monitor | monitor_basic.py | Create a page monitor |
| monitor | monitor_with_webhook.py | Monitor with webhook |
| utilities | credits.py | Check credits and limits |
| utilities | health.py | API health check |
| utilities | history.py | Request history |

Async Examples

| Service | Example | Description |
|---|---|---|
| scrape | scrape_basic_async.py | Basic markdown scraping |
| scrape | scrape_multi_format_async.py | Multiple formats |
| scrape | scrape_json_extraction_async.py | Structured JSON extraction |
| scrape | scrape_pdf_async.py | PDF document parsing |
| scrape | scrape_with_fetchconfig_async.py | JS rendering, stealth mode |
| extract | extract_basic_async.py | AI data extraction |
| extract | extract_with_schema_async.py | Extraction with JSON schema |
| search | search_basic_async.py | Web search |
| search | search_with_extraction_async.py | Search + AI extraction |
| crawl | crawl_basic_async.py | Start and monitor a crawl |
| crawl | crawl_with_formats_async.py | Crawl with formats |
| monitor | monitor_basic_async.py | Create a page monitor |
| monitor | monitor_with_webhook_async.py | Monitor with webhook |
| utilities | credits_async.py | Check credits and limits |
| utilities | health_async.py | API health check |
| utilities | history_async.py | Request history |

Environment Variables

| Variable | Description | Default |
|---|---|---|
| SGAI_API_KEY | Your ScrapeGraphAI API key | (none) |
| SGAI_API_URL | Override API base URL | https://v2-api.scrapegraphai.com/api |
| SGAI_DEBUG | Enable debug logging ("1") | off |
| SGAI_TIMEOUT | Request timeout in seconds | 120 |
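A sketch of setting these from Python before constructing a client (values are illustrative; in production you would normally export them in the shell or a .env file instead):

```python
import os

# Configure the SDK via environment before constructing ScrapeGraphAI.
os.environ.setdefault("SGAI_DEBUG", "1")     # verbose logging
os.environ.setdefault("SGAI_TIMEOUT", "60")  # seconds
```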

Development

uv sync
uv run pytest tests/              # unit tests
uv run pytest tests/test_integration.py  # live API tests (requires SGAI_API_KEY)
uv run ruff check .               # lint

License

MIT - ScrapeGraphAI
