ScrapeGraph Python SDK for API
🌐 ScrapeGraph Python SDK
Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.
📦 Installation
pip install scrapegraph-py
🚀 Features
- 🤖 AI-powered web scraping
- 🔄 Both sync and async clients
- 📊 Structured output with Pydantic schemas
- 🔍 Detailed logging
- ⚡ Automatic retries
- 🔐 Secure authentication
🎯 Quick Start
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
[!NOTE] You can set the SGAI_API_KEY environment variable and initialize the client without parameters: client = Client()
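The note above describes two ways to supply the key: an explicit `api_key` argument or the `SGAI_API_KEY` environment variable. As a minimal sketch, assuming you want your own code to fail early with a clear message when neither is present, the choice can be made explicit. `resolve_api_key` is a hypothetical helper for illustration, not part of the SDK:

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the explicit key if given, else SGAI_API_KEY from `env`.

    Hypothetical helper (not part of scrapegraph_py).
    """
    key = explicit or env.get("SGAI_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set SGAI_API_KEY")
    return key


if __name__ == "__main__":
    # Demo with an in-memory mapping instead of the real environment.
    print(resolve_api_key(env={"SGAI_API_KEY": "demo-key"}))
```

The result could then be passed on as `Client(api_key=resolve_api_key())`.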
📚 Available Endpoints
🔍 SmartScraper
Scrapes any webpage using AI to extract specific information.
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
# Basic usage
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)
print(response)
Output Schema (Optional)
from pydantic import BaseModel, Field
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)
📝 Markdownify
Converts any webpage into clean, formatted markdown.
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
response = client.markdownify(
    website_url="https://example.com"
)
print(response)
💻 LocalScraper
Extracts information from HTML content using AI.
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
html_content = """
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: contact@example.com</p>
</div>
</body>
</html>
"""
response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)
print(response)
⚡ Async Support
All endpoints support async operations:
import asyncio
from scrapegraph_py import AsyncClient
async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
📖 Documentation
For detailed documentation, visit scrapegraphai.com/docs
🛠️ Development
For information about setting up the development environment and contributing to the project, see our Contributing Guide.
💬 Support & Feedback
- 📧 Email: support@scrapegraphai.com
- 💻 GitHub Issues: Create an issue
- 🌟 Feature Requests: Request a feature
- ⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by ScrapeGraph AI