llama-index tools integrating ScrapegraphAI

LlamaIndex Tool - Scrapegraph

This tool integrates Scrapegraph with LlamaIndex, providing intelligent web scraping capabilities with structured data extraction.

Installation

pip install llama-index-tools-scrapegraphai

Usage

First, import and initialize the ScrapegraphToolSpec:

from llama_index.tools.scrapegraph import ScrapegraphToolSpec

scrapegraph_tool = ScrapegraphToolSpec()

Available Functions

The tool provides the following capabilities:

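  1. Smart Scraper

Extract structured data from a webpage using a natural-language prompt and an optional Pydantic schema:
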
from pydantic import BaseModel


# Define your schema (optional)
class ProductSchema(BaseModel):
    name: str
    price: float
    description: str


schema = [ProductSchema]

# Perform the scraping
result = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract product information",
    url="https://example.com/product",
    api_key="your-api-key",
    schema=schema,  # Optional
)
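
When the same prompt has to run against several pages, the call above can be wrapped in a small loop. The following is a minimal sketch, not part of the package: `scrape_many` is a hypothetical helper, and it assumes only the `scrapegraph_smartscraper` keyword arguments shown above. Because the exceptions the tool may raise are not documented here, the sketch catches `Exception` broadly and records the message per URL.

```python
from typing import Any, Dict, Iterable, Optional


def scrape_many(
    tool: Any,
    urls: Iterable[str],
    prompt: str,
    api_key: str,
    schema: Optional[list] = None,
) -> Dict[str, Any]:
    """Run the smart scraper over several URLs, one request per URL.

    `tool` is expected to be a ScrapegraphToolSpec, or any object exposing a
    compatible `scrapegraph_smartscraper` method. A failure on one URL is
    recorded and does not stop the remaining requests.
    """
    results: Dict[str, Any] = {}
    for url in urls:
        try:
            results[url] = tool.scrapegraph_smartscraper(
                prompt=prompt, url=url, api_key=api_key, schema=schema
            )
        except Exception as exc:  # broad on purpose: error types are an unknown here
            results[url] = {"error": str(exc)}
    return results
```

With the objects defined earlier, a call might look like `scrape_many(scrapegraph_tool, ["https://example.com/product"], "Extract product information", "your-api-key", schema)`. The result is a dictionary keyed by URL, so failed pages can be retried individually.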

  2. Markdownify

Convert webpage content to markdown format:

markdown_content = scrapegraph_tool.scrapegraph_markdownify(
    url="https://example.com", api_key="your-api-key"
)
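
A common next step is to store the converted page on disk. The sketch below is illustrative only: `save_markdown` is a hypothetical helper, and since the return type of `scrapegraph_markdownify` is not stated here, the result is passed through `str()` before writing.

```python
from pathlib import Path
from typing import Any


def save_markdown(tool: Any, url: str, api_key: str, out_path: str) -> Path:
    """Fetch a page as markdown via `tool` and write it to `out_path`.

    `tool` is expected to be a ScrapegraphToolSpec, or any object exposing a
    compatible `scrapegraph_markdownify` method.
    """
    content = tool.scrapegraph_markdownify(url=url, api_key=api_key)
    path = Path(out_path)
    # Create missing parent directories so nested output paths work.
    path.parent.mkdir(parents=True, exist_ok=True)
    # str() is a defensive assumption about the return value.
    path.write_text(str(content), encoding="utf-8")
    return path
```

For example, `save_markdown(scrapegraph_tool, "https://example.com", "your-api-key", "out/example.md")` would write the converted page to `out/example.md` and return that path.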

  3. Local Scrape

Extract structured data from raw text:

text = """
Your raw text content here...
"""

structured_data = scrapegraph_tool.scrapegraph_local_scrape(
    text=text, api_key="your-api-key"
)

Requirements

  • Python 3.8+
  • scrapegraph-py package
  • Valid Scrapegraph API key
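
Every function above takes an `api_key` argument, so it can help to read the key from the environment instead of hard-coding it. This is a minimal sketch under stated assumptions: `load_api_key` is a hypothetical helper, and the variable name `SCRAPEGRAPH_API_KEY` is an arbitrary choice for illustration, not a name the package is known to read.

```python
import os


def load_api_key(env_var: str = "SCRAPEGRAPH_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    The default variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Scrapegraph API key."
        )
    return key
```

The returned string can then be passed as `api_key=load_api_key()` to any of the calls in the Usage section.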
