Skip to main content

ScrapeGraph Python SDK for API

Project description

🌐 ScrapeGraph Python SDK

PyPI version Python Support License Code style: black Documentation Status

Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.

📦 Installation

pip install scrapegraph-py

🚀 Features

  • 🤖 AI-powered web scraping
  • 🔄 Both sync and async clients
  • 📊 Structured output with Pydantic schemas
  • 🔍 Detailed logging
  • ⚡ Automatic retries
  • 🔐 Secure authentication

🎯 Quick Start

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

[!NOTE] You can set the SGAI_API_KEY environment variable and initialize the client without parameters: client = Client()

📚 Available Endpoints

🔍 SmartScraper

Scrapes any webpage using AI to extract specific information.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

# Basic usage
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

print(response)
Output Schema (Optional)
from pydantic import BaseModel, Field
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)

📝 Markdownify

Converts any webpage into clean, formatted markdown.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.markdownify(
    website_url="https://example.com"
)

print(response)

💻 LocalScraper

Extracts information from HTML content using AI.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
        <div class="contact">
            <p>Email: contact@example.com</p>
        </div>
    </body>
</html>
"""

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)

print(response)

⚡ Async Support

All endpoints support async operations:

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())

📖 Documentation

For detailed documentation, visit scrapegraphai.com/docs

🛠️ Development

For information about setting up the development environment and contributing to the project, see our Contributing Guide.

💬 Support & Feedback

  • 📧 Email: support@scrapegraphai.com
  • 💻 GitHub Issues: Create an issue
  • 🌟 Feature Requests: Request a feature
  • ⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:
    from scrapegraph_py import Client
    
    client = Client(api_key="your-api-key-here")
    
    client.submit_feedback(
        request_id="your-request-id",
        rating=5,
        feedback_text="Great results!"
    )
    

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Links


Made with ❤️ by ScrapeGraph AI

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scrapegraph_py-1.9.0b2.tar.gz (124.3 kB view details)

Uploaded Source

Built Distribution

scrapegraph_py-1.9.0b2-py3-none-any.whl (14.5 kB view details)

Uploaded Python 3

File details

Details for the file scrapegraph_py-1.9.0b2.tar.gz.

File metadata

  • Download URL: scrapegraph_py-1.9.0b2.tar.gz
  • Upload date:
  • Size: 124.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.12

File hashes

Hashes for scrapegraph_py-1.9.0b2.tar.gz
Algorithm Hash digest
SHA256 b49563c97bd241494986c1903996064bb5fe68ea9b9e8c924f3c1e033a412396
MD5 11df4c33b1aee745a61c46aa65ca4429
BLAKE2b-256 29493a5b2f5477201f03947f0c71ef304d051d7b7ace494a2db9b588dbe03118

See more details on using hashes here.

File details

Details for the file scrapegraph_py-1.9.0b2-py3-none-any.whl.

File metadata

File hashes

Hashes for scrapegraph_py-1.9.0b2-py3-none-any.whl
Algorithm Hash digest
SHA256 2b1b73010e135fc0a65620bb190ec31aec1f627114adceba7d23d510882a52bd
MD5 2552d1d39cb92c368c9d29071b211914
BLAKE2b-256 a50db945939849314e875fe14f71d39aa54578f27d80a045f27aca725c630a80

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page