llama-index tools integrating ScrapegraphAI

LlamaIndex Tool - Scrapegraph

This tool integrates Scrapegraph with LlamaIndex, providing intelligent web scraping capabilities with structured data extraction.

Installation

pip install llama-index-tools-scrapegraphai

Usage

First, import and initialize the ScrapegraphToolSpec:

from llama_index.tools.scrapegraph import ScrapegraphToolSpec

scrapegraph_tool = ScrapegraphToolSpec()

Available Functions

The tool provides the following capabilities:

  1. Smart Scraper

Extract structured data from a webpage using a prompt and an optional schema:

from pydantic import BaseModel


# Define your schema (optional)
class ProductSchema(BaseModel):
    name: str
    price: float
    description: str


schema = [ProductSchema]

# Perform the scraping
result = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract product information",
    url="https://example.com/product",
    api_key="your-api-key",
    schema=schema,  # Optional
)
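
Depending on the page and the prompt, an extraction may describe one item or several. The sketch below normalizes a result into a list of records and reports which ProductSchema fields are missing from each. It assumes the result is a dict or a list of dicts, which this README does not state; `normalize_records` and `missing_fields` are hypothetical helpers, not part of the package.

```python
from typing import Any, Dict, List

# Field names taken from ProductSchema above.
REQUIRED_FIELDS = ("name", "price", "description")


def normalize_records(result: Any) -> List[Dict[str, Any]]:
    """Return the result as a list of dicts (assumed shapes: dict or list of dicts)."""
    if isinstance(result, dict):
        return [result]
    if isinstance(result, list) and all(isinstance(item, dict) for item in result):
        return list(result)
    raise TypeError(f"Unexpected result type: {type(result).__name__}")


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """List the required fields that are absent from a record."""
    return [field for field in REQUIRED_FIELDS if field not in record]


# Hypothetical sample standing in for a real scraping result.
sample = {"name": "Widget", "price": 9.99}
records = normalize_records(sample)
problems = [missing_fields(record) for record in records]
```

In real use, pass the value returned by `scrapegraph_smartscraper` in place of `sample`.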

  2. Markdownify

Convert webpage content to markdown format:

markdown_content = scrapegraph_tool.scrapegraph_markdownify(
    url="https://example.com", api_key="your-api-key"
)
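
Calls that go over the network can fail transiently. A minimal sketch of a retry wrapper with exponential backoff follows. It assumes failures surface as exceptions, which this README does not confirm; `markdownify_with_retry` is a hypothetical helper, and `FlakyStub` only stands in for ScrapegraphToolSpec so the example runs without an API key.

```python
import time
from typing import Any


def markdownify_with_retry(
    tool: Any,
    url: str,
    api_key: str,
    attempts: int = 3,
    base_delay: float = 1.0,
) -> Any:
    """Call tool.scrapegraph_markdownify, retrying on any exception."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return tool.scrapegraph_markdownify(url=url, api_key=api_key)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # Delay doubles after each failed attempt.
            time.sleep(base_delay * 2**attempt)


class FlakyStub:
    """Stand-in for ScrapegraphToolSpec: fails once, then succeeds."""

    def __init__(self) -> None:
        self.calls = 0

    def scrapegraph_markdownify(self, url: str, api_key: str) -> str:
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("simulated transient failure")
        return f"# Content of {url}"


stub = FlakyStub()
markdown_content = markdownify_with_retry(
    stub, url="https://example.com", api_key="your-api-key", base_delay=0.0
)
```

To use it for real, pass `scrapegraph_tool` instead of `stub` and keep a nonzero `base_delay`.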

  3. Local Scrape

Extract structured data from raw text:

text = """
Your raw text content here...
"""

structured_data = scrapegraph_tool.scrapegraph_local_scrape(
    text=text, api_key="your-api-key"
)

Requirements

  • Python 3.8+
  • scrapegraph-py package
  • Valid Scrapegraph API key
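
The examples above pass the API key as a string literal. One way to keep the key out of source code is to read it from an environment variable, sketched below. The variable name `SGAI_API_KEY` and the helper `load_api_key` are assumptions made for illustration, not something this package defines.

```python
import os


def load_api_key(env_var: str = "SGAI_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is missing."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Scrapegraph API key"
        )
    return api_key
```

The returned value can then be passed as `api_key=load_api_key()` to any of the functions above.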
