ScrapeGraph Python SDK for API
🌐 ScrapeGraph Python SDK
Official Python SDK for the ScrapeGraph AI API - Smart web scraping powered by AI.
🚀 Features
- ✨ Smart web scraping with AI
- 🔄 Both sync and async clients
- 📊 Structured output with Pydantic schemas
- 🔍 Detailed logging with emojis
- ⚡ Automatic retries and error handling
- 🔐 Secure API authentication
📦 Installation
Using pip
pip install scrapegraph-py
Using Poetry (Recommended)
# Install poetry if you haven't already
pip install poetry
# Install dependencies
poetry install
# Install pre-commit hooks
poetry run pre-commit install
🔧 Quick Start
Note: If you prefer, you can store the API key in an environment variable (for example in a .env file) and load it with load_dotenv() instead of passing it in code.
from scrapegraph_py import SyncClient
from scrapegraph_py.logger import get_logger
# Enable debug logging
logger = get_logger(level="DEBUG")
# Initialize client
client = SyncClient(api_key="sgai-your-api-key")
# Make a request
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)
print(response)
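The note above mentions loading the API key from the environment. A minimal sketch of that pattern follows. It assumes the key is stored under a variable named `SGAI_API_KEY` and that the python-dotenv package is installed to provide `load_dotenv()`. The variable name and the helpers `read_api_key` and `scrape_with_env_key` are illustrative and not confirmed by this README.

```python
import os

# Assumed variable name; use whatever name your .env file defines.
API_KEY_ENV_VAR = "SGAI_API_KEY"


def read_api_key(env=None):
    """Return the API key from the given mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} in your environment or .env file")
    return key


def scrape_with_env_key(url, prompt):
    """Load .env, build a SyncClient, and run one smartscraper request."""
    # Imported lazily so read_api_key works without these packages installed.
    from dotenv import load_dotenv  # provided by the python-dotenv package
    from scrapegraph_py import SyncClient

    load_dotenv()  # copies entries from a local .env file into os.environ
    client = SyncClient(api_key=read_api_key())
    return client.smartscraper(website_url=url, user_prompt=prompt)
```

Calling `scrape_with_env_key("https://example.com", "Extract the main heading")` should then behave like the Quick Start request, without the key appearing in source code.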
🎯 Examples
Async Usage
import asyncio
from scrapegraph_py import AsyncClient
async def main():
    async with AsyncClient(api_key="sgai-your-api-key") as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main heading"
        )
        print(response)
asyncio.run(main())
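The async client is most useful when several pages are scraped at once. The sketch below shows one way to do that with `asyncio.gather` and a semaphore that caps concurrency. `scrape_many` and its `limit` parameter are hypothetical helpers, not part of the SDK; the only assumption about the client is that it exposes the `smartscraper` coroutine used above.

```python
import asyncio


async def scrape_many(client, urls, prompt, limit=3):
    """Scrape several URLs concurrently, at most `limit` requests at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def scrape_one(url):
        async with semaphore:
            return await client.smartscraper(website_url=url, user_prompt=prompt)

    # gather() preserves input order, so results line up with urls.
    results = await asyncio.gather(*(scrape_one(url) for url in urls))
    return dict(zip(urls, results))
```

Inside the `async with AsyncClient(...) as client:` block, this could be called as `await scrape_many(client, ["https://example.com"], "Extract the main heading")`.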
With Output Schema
from pydantic import BaseModel, Field
from scrapegraph_py import SyncClient
class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")
client = SyncClient(api_key="sgai-your-api-key")
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)
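Once a response comes back, it can be convenient to load it into the same Pydantic model so that missing or mistyped fields fail early. The sketch below assumes the response is a dict whose extracted data sits under a `"result"` key; that shape is an assumption, so check a real response before relying on it. `parse_result` is a hypothetical helper.

```python
from pydantic import BaseModel, Field, ValidationError


class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")


def parse_result(response):
    """Validate a response dict against WebsiteData."""
    # Assumption: extracted fields live under "result"; fall back to the dict itself.
    payload = response.get("result", response)
    try:
        return WebsiteData(**payload)
    except ValidationError as exc:
        raise ValueError(f"Response did not match WebsiteData: {exc}") from exc
```

With this helper, `parse_result(response).title` gives typed attribute access instead of dictionary lookups.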
📚 Documentation
For detailed documentation, visit docs.scrapegraphai.com
🛠️ Development
Setup
- Clone the repository:
git clone https://github.com/ScrapeGraphAI/scrapegraph-sdk.git
cd scrapegraph-sdk
- Install dependencies:
poetry install
- Install pre-commit hooks:
poetry run pre-commit install
Running Tests
# Run all tests
poetry run pytest
# Run with coverage
poetry run pytest --cov=scrapegraph_py
# Run specific test file
poetry run pytest tests/test_client.py
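When adding tests of your own, a stubbed client keeps them fast and independent of the network. The following is a sketch only: the repository's actual tests are not shown in this README, and `extract_heading` is a hypothetical function under test. It uses `unittest.mock.MagicMock` from the standard library in place of a real client.

```python
from unittest.mock import MagicMock


def extract_heading(client, url):
    """Hypothetical helper under test: asks the client for the main heading."""
    return client.smartscraper(
        website_url=url,
        user_prompt="Extract the main heading",
    )


def test_extract_heading_passes_url_and_prompt():
    # Stub client: no API key or network access needed.
    client = MagicMock()
    client.smartscraper.return_value = {"result": {"heading": "Example Domain"}}

    response = extract_heading(client, "https://example.com")

    assert response == {"result": {"heading": "Example Domain"}}
    client.smartscraper.assert_called_once_with(
        website_url="https://example.com",
        user_prompt="Extract the main heading",
    )
```

pytest collects any function whose name starts with `test_`, so saving this in a file under tests/ is enough for `poetry run pytest` to pick it up.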
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
💬 Support
- 📧 Email: support@scrapegraphai.com
- 💻 GitHub Issues: Create an issue
- 🌟 Feature Requests: Request a feature
Made with ❤️ by ScrapeGraph AI