Skip to main content

ScrapeGraph Python SDK for API

Project description

🌐 ScrapeGraph Python SDK

PyPI version Python Support License Code style: black Documentation Status

Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.

📦 Installation

pip install scrapegraph-py

🚀 Features

  • 🤖 AI-powered web scraping
  • 🔄 Both sync and async clients
  • 📊 Structured output with Pydantic schemas
  • 🔍 Detailed logging
  • ⚡ Automatic retries
  • 🔐 Secure authentication

🎯 Quick Start

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

[!NOTE] You can set the SGAI_API_KEY environment variable and initialize the client without parameters: client = Client()

📚 Available Endpoints

🔍 SmartScraper

Scrapes any webpage using AI to extract specific information.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

# Basic usage
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

print(response)
Output Schema (Optional)
from pydantic import BaseModel, Field
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)

📝 Markdownify

Converts any webpage into clean, formatted markdown.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.markdownify(
    website_url="https://example.com"
)

print(response)

💻 LocalScraper

Extracts information from HTML content using AI.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
        <div class="contact">
            <p>Email: contact@example.com</p>
        </div>
    </body>
</html>
"""

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)

print(response)

⚡ Async Support

All endpoints support async operations:

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())

📖 Documentation

For detailed documentation, visit scrapegraphai.com/docs

🛠️ Development

For information about setting up the development environment and contributing to the project, see our Contributing Guide.

💬 Support & Feedback

  • 📧 Email: support@scrapegraphai.com
  • 💻 GitHub Issues: Create an issue
  • 🌟 Feature Requests: Request a feature
  • ⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:
    from scrapegraph_py import Client
    
    client = Client(api_key="your-api-key-here")
    
    client.submit_feedback(
        request_id="your-request-id",
        rating=5,
        feedback_text="Great results!"
    )
    

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Links


Made with ❤️ by ScrapeGraph AI

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scrapegraph_py-1.9.0b5.tar.gz (108.7 kB view details)

Uploaded Source

Built Distribution

scrapegraph_py-1.9.0b5-py3-none-any.whl (14.5 kB view details)

Uploaded Python 3

File details

Details for the file scrapegraph_py-1.9.0b5.tar.gz.

File metadata

  • Download URL: scrapegraph_py-1.9.0b5.tar.gz
  • Upload date:
  • Size: 108.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for scrapegraph_py-1.9.0b5.tar.gz
Algorithm Hash digest
SHA256 2c5692fdfcbbddc4a2d4ea44d058ed00891378f117039f60767c2b1d30e3ef35
MD5 d0ccc6f06b3e9fd2a79040a76e2aba39
BLAKE2b-256 2717a877fec145d358e16e54f1e1321b1085cf7d3669178a7f140950bc347e52

See more details on using hashes here.

File details

Details for the file scrapegraph_py-1.9.0b5-py3-none-any.whl.

File metadata

File hashes

Hashes for scrapegraph_py-1.9.0b5-py3-none-any.whl
Algorithm Hash digest
SHA256 c32bb3728ac635a94afc17d06d25d37041063c6932b6f44098d5f9d954af1c3e
MD5 180db035a62e73c57a66cf13e257aa25
BLAKE2b-256 d14921dd439cf5a231cfdf6fdab9fed79a6cdc22e6fd477b67315a0e19303c13

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page