Geonode Scraper Tools Core

Shared runtime, schemas, and operation registry for Geonode Scraper tool integrations.

Most users should install one of the framework packages instead:

  • geonode-scraper-langchain
  • geonode-scraper-crewai

Install the core package directly only if you are building your own wrapper layer on top of the shared service.

Installation

pip install geonode-scraper-tools-core

Public API

  • ScraperToolSettings
  • ScraperToolService
  • OperationSpec
  • OPERATIONS
  • get_operations()

Configuration

from geonode_scraper_tools_core import ScraperToolSettings, ScraperToolService

settings = ScraperToolSettings(
    host="https://api.example.com",
    api_key="your-api-key",
)

service = ScraperToolService(settings)
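
To keep the API key out of source code, the two fields shown above could be read from the environment. This is a minimal sketch, not part of the package: the variable names GEONODE_SCRAPER_HOST and GEONODE_SCRAPER_API_KEY and the helper settings_kwargs_from_env are hypothetical, and only the host and api_key fields are taken from the example above.

```python
import os
from typing import Mapping

# Hypothetical variable names; the package does not document any.
HOST_VAR = "GEONODE_SCRAPER_HOST"
API_KEY_VAR = "GEONODE_SCRAPER_API_KEY"


def settings_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the host and api_key fields from environment variables."""
    missing = [name for name in (HOST_VAR, API_KEY_VAR) if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of building broken settings.
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {"host": env[HOST_VAR], "api_key": env[API_KEY_VAR]}
```

With the package installed, the result can be unpacked into the settings class: ScraperToolSettings(**settings_kwargs_from_env()).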

Exposed Operations

The shared service normalizes SDK responses into JSON-friendly dictionaries and exposes the following 17 operations:

Extraction

  • extract — extract content from a single URL (sync or async)
  • get_job_result — fetch the current state or result of an async extraction job
  • wait_for_job — poll an async extraction job until it reaches a terminal state
  • list_jobs — list previously submitted extraction jobs with optional filters

Batch

  • create_batch — submit a list of URLs for asynchronous batch extraction
  • get_batch_status — poll the current status and partial results of a batch job
  • wait_for_batch — poll a batch job until it reaches a terminal state
  • list_batch_jobs — list previously submitted batch jobs with optional filters

Crawl

  • create_crawl — start a crawl job from a seed URL
  • get_crawl_status — poll the current status and results of a crawl job
  • wait_for_crawl — poll a crawl job until it reaches a terminal state
  • list_crawl_jobs — list previously submitted crawl jobs with optional filters

Map

  • map_urls — discover all URLs under a base URL via sitemap and HTML link extraction
  • list_map_jobs — list previously submitted map jobs with optional filters
  • get_map_job — fetch the status and discovered URLs for a single map job
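
To illustrate what "sitemap and HTML link extraction" can mean in practice, the sketch below combines URLs from a sitemap document with links found in a page and keeps only those under the base URL. It uses the Python standard library only and is an illustration of the general technique, not the service's implementation; discover_urls and its parameters are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover_urls(base_url: str, sitemap_xml: str, html: str) -> list:
    """Merge sitemap <loc> entries with page links, restricted to base_url."""
    found = set()
    for loc in ET.fromstring(sitemap_xml).iter(f"{SITEMAP_NS}loc"):
        if loc.text:
            found.add(loc.text.strip())
    collector = _LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        # Resolve relative links against the base URL.
        found.add(urljoin(base_url, href))
    return sorted(url for url in found if url.startswith(base_url))
```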

Statistics & Health

  • get_statistics — retrieve aggregated extraction statistics
  • health_check — check the scraper service health and version
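
The wait_for_job, wait_for_batch, and wait_for_crawl operations are described as polling until a terminal state. The general pattern can be sketched as follows. This is a generic illustration under stated assumptions, not the package's code: the status key, the terminal state names, and the wait_until_terminal helper are all hypothetical.

```python
import time
from typing import Callable

# Assumed terminal state names; the service's real status values may differ.
TERMINAL_STATES = frozenset({"completed", "failed", "cancelled"})


def wait_until_terminal(
    fetch: Callable[[], dict],
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() repeatedly until the job reports a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch()
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(interval)
```

Here fetch would wrap a status call such as get_job_result for one job ID. Passing sleep as a parameter keeps the loop easy to test without real delays.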

Selecting a Subset of Operations

Pass a list of operation names to get_operations(), or to any framework wrapper, to expose only the operations you need.

from geonode_scraper_tools_core import get_operations

ops = get_operations(["extract", "map_urls", "create_batch", "wait_for_batch"])
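
Conceptually, selecting a subset amounts to filtering a name-keyed registry. The sketch below shows one way such a filter could work. Spec, REGISTRY, and select_operations are hypothetical stand-ins rather than the package's OperationSpec, OPERATIONS, or get_operations(), and rejecting unknown names is an assumed design choice, not documented behavior.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for an operation specification."""
    name: str
    description: str


# A small illustrative registry keyed by operation name.
REGISTRY = {
    spec.name: spec
    for spec in (
        Spec("extract", "extract content from a single URL"),
        Spec("map_urls", "discover URLs under a base URL"),
        Spec("create_batch", "submit URLs for batch extraction"),
        Spec("wait_for_batch", "poll a batch job until it finishes"),
        Spec("health_check", "check service health and version"),
    )
}


def select_operations(names: Optional[Iterable[str]] = None) -> list:
    """Return all specs, or only the named ones in the order requested."""
    if names is None:
        return list(REGISTRY.values())
    names = list(names)
    unknown = [name for name in names if name not in REGISTRY]
    if unknown:
        # Assumed behavior: fail loudly rather than silently drop a typo.
        raise KeyError(f"unknown operations: {unknown}")
    return [REGISTRY[name] for name in names]
```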
