ScrapeGraph Python SDK for API
Project description
ScrapeGraph Python SDK
The official Python SDK for interacting with the ScrapeGraph AI API - a powerful web scraping and data extraction service.
Installation
Install the package using pip:

```bash
pip install scrapegraph-py
```
Authentication
To use the ScrapeGraph API, you'll need an API key. You can provide it in two ways:

- Environment variable:

  ```bash
  export SCRAPEGRAPH_API_KEY="your-api-key-here"
  ```

- `.env` file:

  ```bash
  SCRAPEGRAPH_API_KEY="your-api-key-here"
  ```
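If you use a `.env` file, the key still has to be loaded into the environment before the client is created. A minimal sketch using python-dotenv, mirroring what the usage examples below do:

```python
import os

from dotenv import load_dotenv
from scrapegraph_py import ScrapeGraphClient

# Load SCRAPEGRAPH_API_KEY from a local .env file into the environment
load_dotenv()

api_key = os.getenv("SCRAPEGRAPH_API_KEY")
if not api_key:
    raise RuntimeError("SCRAPEGRAPH_API_KEY is not set")

client = ScrapeGraphClient(api_key)
```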
Features
The SDK provides four main functionalities:
- Web Scraping (basic and structured)
- Credits checking
- Feedback submission
- API status checking
Usage
Basic Web Scraping
```python
import os

from dotenv import load_dotenv
from scrapegraph_py import ScrapeGraphClient, scrape

load_dotenv()
api_key = os.getenv("SCRAPEGRAPH_API_KEY")
client = ScrapeGraphClient(api_key)

url = "https://scrapegraphai.com/"
prompt = "What does the company do?"

result = scrape(client, url, prompt)
print(result)
```
Local HTML Scraping
You can also scrape content from local HTML files:
```python
from scrapegraph_py import ScrapeGraphClient, scrape_text
from bs4 import BeautifulSoup


def scrape_local_html(client: ScrapeGraphClient, file_path: str, prompt: str):
    with open(file_path, 'r', encoding='utf-8') as file:
        html_content = file.read()

    # Use BeautifulSoup to extract the text content from the HTML
    soup = BeautifulSoup(html_content, 'html.parser')
    text_content = soup.get_text(separator='\n', strip=True)

    # Use ScrapeGraph AI to analyze the extracted text
    return scrape_text(client, text_content, prompt)


# Usage
client = ScrapeGraphClient(api_key)
result = scrape_local_html(
    client,
    'sample.html',
    "Extract main content and important information"
)
print("Extracted Data:", result)
```
Structured Data Extraction
For more structured data extraction, you can define a Pydantic schema:
```python
from pydantic import BaseModel, Field
from scrapegraph_py import scrape


class CompanyInfoSchema(BaseModel):
    company_name: str = Field(description="The name of the company")
    description: str = Field(description="A description of the company")
    main_products: list[str] = Field(description="The main products of the company")


# Scrape with schema
result = scrape(
    api_key=api_key,
    url="https://scrapegraphai.com/",
    prompt="What does the company do?",
    schema=CompanyInfoSchema
)
print(result)
```
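The return type when a schema is supplied is not spelled out here. If the call returns a plain dict matching the schema, it can be validated back into the Pydantic model; a hedged sketch, assuming Pydantic v2 and a dict result:

```python
# Hypothetical: assumes `result` is a dict whose keys match CompanyInfoSchema
company = CompanyInfoSchema.model_validate(result)
print(company.company_name)
print(company.main_products)
```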
Check Credits
Monitor your API usage:
```python
from scrapegraph_py import credits

response = credits(api_key)
print(response)
```
Provide Feedback and Check Status
You can provide feedback on scraping results and check the API status:
```python
from scrapegraph_py import feedback, status

# Check API status
status_response = status(api_key)
print(f"API Status: {status_response}")

# Submit feedback
feedback_response = feedback(
    api_key=api_key,
    request_id="your-request-id",  # UUID from your scraping request
    rating=5,                      # Rating from 1-5
    message="Great results!"
)
print(f"Feedback Response: {feedback_response}")
```
Development
Requirements
- Python 3.9+
- Rye for dependency management (optional)
Project Structure
```
scrapegraph_py/
├── __init__.py
├── credits.py    # Credits checking functionality
├── scrape.py     # Core scraping functionality
└── feedback.py   # Feedback submission functionality

examples/
├── credits_example.py
├── feedback_example.py
├── scrape_example.py
└── scrape_schema_example.py

tests/
├── test_credits.py
├── test_feedback.py
└── test_scrape.py
```
Setting up the Development Environment
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/scrapegraph-py.git
  cd scrapegraph-py
  ```

- Install dependencies:

  ```bash
  # If using Rye
  rye sync

  # If using pip
  pip install -r requirements-dev.lock
  ```

- Create a `.env` file in the root directory:

  ```bash
  SCRAPEGRAPH_API_KEY="your-api-key-here"
  ```
License
This project is licensed under the MIT License.
Support
For support:
- Visit ScrapeGraph AI
- Contact our support team
- Check the examples in the `examples/` directory
File details

Details for the file scrapegraph_py-0.0.2.tar.gz.

File metadata

- Download URL: scrapegraph_py-0.0.2.tar.gz
- Upload date:
- Size: 12.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.12.3

File hashes

Algorithm | Hash digest
---|---
SHA256 | a01957fffab7da6e41dac2a03177675456e872d60e9f58538a2d7acf3a691204
MD5 | f107f2e742f9f702c3461c7b047479c8
BLAKE2b-256 | 9e9b6b5973199079555df2238d8013b8b29495ab9fe88e002d4c6bb23abe015b
File details

Details for the file scrapegraph_py-0.0.2-py3-none-any.whl.

File metadata

- Download URL: scrapegraph_py-0.0.2-py3-none-any.whl
- Upload date:
- Size: 9.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.12.3

File hashes

Algorithm | Hash digest
---|---
SHA256 | ce94281b45a7e273d674efef6f426f4083cf264c5a9c69e7a7b530f59d4760a9
MD5 | c56460bfd56a64fd9cff7c9d84c6f5a0
BLAKE2b-256 | c34eae548202cd7072bcf2938b34c506f18ce15258afdd619a14a701e01d5717