llama-index tools integrating ScrapegraphAI

LlamaIndex Tool - Scrapegraph

This tool integrates Scrapegraph with LlamaIndex, providing intelligent web scraping capabilities with structured data extraction.

Installation

pip install llama-index-tools-scrapegraph

Usage

First, import and initialize the ScrapegraphToolSpec:

from llama_index.tools.scrapegraph import ScrapegraphToolSpec

scrapegraph_tool = ScrapegraphToolSpec()

Available Functions

The tool provides the following capabilities:

  1. Smart Scraper

Extract structured data from a webpage using a natural-language prompt and an optional schema:

from pydantic import BaseModel


# Define your schema (optional)
class ProductSchema(BaseModel):
    name: str
    price: float
    description: str


schema = [ProductSchema]

# Perform the scraping
result = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract product information",
    url="https://example.com/product",
    api_key="your-api-key",
    schema=schema,  # Optional
)

  2. Markdownify

Convert webpage content to markdown format:

markdown_content = scrapegraph_tool.scrapegraph_markdownify(
    url="https://example.com", api_key="your-api-key"
)
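
When several pages need converting, the same call can be wrapped in a small loop. The sketch below is an illustration, not part of the package: `markdownify_many` is a hypothetical helper, and it assumes only that the tool object exposes `scrapegraph_markdownify(url=..., api_key=...)` as in the call above. Failures are recorded per URL so that one bad page does not stop the batch.

```python
from typing import Any, Dict, Iterable, Tuple


def markdownify_many(
    tool: Any, urls: Iterable[str], api_key: str
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Convert each URL to markdown and return (results, errors), keyed by URL.

    Hypothetical helper: `tool` is any object that provides a
    `scrapegraph_markdownify(url=..., api_key=...)` method.
    """
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for url in urls:
        try:
            results[url] = tool.scrapegraph_markdownify(url=url, api_key=api_key)
        except Exception as exc:
            # Record the failure and continue with the remaining URLs.
            errors[url] = str(exc)
    return results, errors
```

With the tool initialized earlier, this could be called as `results, errors = markdownify_many(scrapegraph_tool, ["https://example.com"], api_key="your-api-key")`.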

  3. Local Scrape

Extract structured data from raw text:

text = """
Your raw text content here...
"""

structured_data = scrapegraph_tool.scrapegraph_local_scrape(
    text=text, api_key="your-api-key"
)
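
All three functions take the API key as an argument, so it is usually safer to read it from the environment than to hard-code it in source files. A minimal sketch, assuming the key is stored in an environment variable; the variable name `SCRAPEGRAPH_API_KEY` and the helper `load_api_key` are illustrative choices, not something the package defines:

```python
import os


def load_api_key(env_var: str = "SCRAPEGRAPH_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    The default variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Scrapegraph API key."
        )
    return key
```

The result can then be passed wherever the examples use a literal key, for instance `api_key=load_api_key()`.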

Requirements

  • Python 3.8+
  • scrapegraph-py package
  • Valid Scrapegraph API key
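
The first two requirements can be checked before running any scraping code. The following is a rough pre-flight sketch using only the standard library; `meets_requirements` is a hypothetical helper, and the import name `scrapegraph_py` for the scrapegraph-py package is an assumption here. It does not validate the API key.

```python
import importlib.util
import sys
from typing import Tuple


def meets_requirements(
    module_name: str = "scrapegraph_py",  # assumed import name for scrapegraph-py
    min_python: Tuple[int, int] = (3, 8),
) -> bool:
    """Return True if the interpreter is new enough and the module is importable."""
    python_ok = sys.version_info >= min_python
    module_ok = importlib.util.find_spec(module_name) is not None
    return python_ok and module_ok
```

Calling `meets_requirements()` at startup gives an early signal that the environment is missing a dependency, instead of an import error later on.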

