Shared runtime and schemas for Geonode Scraper framework tools
Geonode Scraper Tools Core
Shared runtime, schemas, and operation registry for Geonode Scraper tool integrations.
Most users should install one of the framework packages instead:
- geonode-scraper-langchain
- geonode-scraper-crewai
Install the core package directly only if you are building your own wrapper layer on top of the shared service.
Installation
pip install geonode-scraper-tools-core
Public API
- ScraperToolSettings
- ScraperToolService
- OperationSpec
- OPERATIONS
- get_operations()
Configuration
from geonode_scraper_tools_core import ScraperToolSettings, ScraperToolService
settings = ScraperToolSettings(
    host="https://api.example.com",
    api_key="your-api-key",
)
service = ScraperToolService(settings)
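To keep the API key out of source code, the two settings shown above can be read from the environment first. The following is a minimal sketch, not part of the package: the helper `settings_kwargs_from_env` and the variable names `GEONODE_SCRAPER_HOST` and `GEONODE_SCRAPER_API_KEY` are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def settings_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for ScraperToolSettings from environment variables.

    GEONODE_SCRAPER_HOST and GEONODE_SCRAPER_API_KEY are hypothetical names
    chosen for this sketch; the package itself does not document them.
    """
    env = os.environ if env is None else env
    api_key = env.get("GEONODE_SCRAPER_API_KEY")
    if not api_key:
        # Fail early instead of sending unauthenticated requests later.
        raise RuntimeError("GEONODE_SCRAPER_API_KEY is not set")
    return {
        # Fall back to the placeholder host used in the example above.
        "host": env.get("GEONODE_SCRAPER_HOST", "https://api.example.com"),
        "api_key": api_key,
    }


# Usage, assuming the package is installed:
#   from geonode_scraper_tools_core import ScraperToolSettings, ScraperToolService
#   service = ScraperToolService(ScraperToolSettings(**settings_kwargs_from_env()))
```

Passing an explicit mapping instead of relying on `os.environ` makes the helper easy to exercise in tests.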
Exposed Operations
The shared service normalizes SDK responses into JSON-friendly dictionaries and exposes the following 13 operations:
Extraction
- extract: extract content from a single URL (sync or async)
- get_job_result: fetch the current state or result of an async extraction job
- wait_for_job: poll an async extraction job until it reaches a terminal state
- list_jobs: list previously submitted extraction jobs with optional filters
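The "poll until it reaches a terminal state" behaviour of the wait_for_* operations can be pictured with a generic loop. This is an illustrative sketch only, not the package's implementation: the function `poll_until_terminal`, the `status` key, and the state names `completed` and `failed` are assumptions.

```python
import time
from typing import Any, Callable, Dict, Iterable


def poll_until_terminal(
    fetch: Callable[[], Dict[str, Any]],
    terminal_states: Iterable[str] = ("completed", "failed"),  # assumed state names
    interval: float = 2.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() repeatedly until its "status" is terminal or the timeout passes."""
    terminal = set(terminal_states)
    deadline = time.monotonic() + timeout
    while True:
        job = fetch()
        if job.get("status") in terminal:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        # Wait before asking again so the service is not hammered.
        sleep(interval)
```

Here `fetch` would wrap whichever status call applies (a job, batch, or crawl lookup); injecting `sleep` keeps the loop testable without real delays.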
Batch
- create_batch: submit a list of URLs for asynchronous batch extraction
- get_batch_status: poll the current status and partial results of a batch job
- wait_for_batch: poll a batch job until it reaches a terminal state
Crawl
- create_crawl: start a crawl job from a seed URL
- get_crawl_status: poll the current status and results of a crawl job
- wait_for_crawl: poll a crawl job until it reaches a terminal state
Map
- map_urls: discover all URLs under a base URL via sitemap and HTML link extraction
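To give a feel for the HTML link extraction half of URL mapping, here is a small standard-library sketch. It only illustrates the general idea and says nothing about how the scraper service does it; `links_under_base` and `_LinkCollector` are hypothetical names, and sitemap parsing is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collect raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_under_base(html: str, base_url: str) -> list:
    """Return sorted absolute URLs found in html that sit under base_url."""
    parser = _LinkCollector()
    parser.feed(html)
    base = urlparse(base_url)
    found = set()
    for href in parser.hrefs:
        # Resolve relative links, then drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        # Keep only same-host links whose path starts with the base path.
        if parsed.netloc == base.netloc and parsed.path.startswith(base.path):
            found.add(absolute)
    return sorted(found)
```

A real mapper would also fetch pages, follow discovered links, and merge in sitemap entries; this sketch covers a single page of HTML.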
Statistics & Health
- get_statistics: retrieve aggregated extraction statistics
- health_check: check the scraper service health and version
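As noted above the operation list, the service normalizes SDK responses into JSON-friendly dictionaries. What such normalization can look like is sketched below under stated assumptions: `to_jsonable`, `JobStatus`, and `JobResult` are hypothetical names invented for this example and do not describe the SDK's actual response types.

```python
import dataclasses
import json
from datetime import date, datetime
from enum import Enum
from typing import Any


def to_jsonable(value: Any) -> Any:
    """Recursively convert common Python objects into JSON-serializable values."""
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        return {f.name: to_jsonable(getattr(value, f.name)) for f in dataclasses.fields(value)}
    if isinstance(value, Enum):
        return to_jsonable(value.value)
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_jsonable(v) for v in value]
    return value


# Hypothetical response types, used only to exercise the helper.
class JobStatus(Enum):
    COMPLETED = "completed"


@dataclasses.dataclass
class JobResult:
    job_id: str
    status: JobStatus
    finished_at: datetime


payload = to_jsonable(JobResult("job-1", JobStatus.COMPLETED, datetime(2024, 1, 2, 3, 4, 5)))
print(json.dumps(payload))
```

Plain dictionaries like this are easy to hand to agent frameworks, which typically expect tool output that can be serialized as JSON.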
Selecting a Subset of Operations
Pass an operations list to get_operations() or to any framework wrapper to
expose only the operations you need.
from geonode_scraper_tools_core import get_operations
ops = get_operations(["extract", "map_urls", "create_batch", "wait_for_batch"])
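When the subset is assembled dynamically, it can help to validate names against the 13 documented operations before passing them on. The sketch below is a convenience layer only; the group labels and the helper `select_operation_names` are assumptions, while the operation names themselves come from the list above.

```python
from typing import Iterable, List

# Operation names as documented above; the group labels are this sketch's own.
OPERATION_GROUPS = {
    "extraction": ["extract", "get_job_result", "wait_for_job", "list_jobs"],
    "batch": ["create_batch", "get_batch_status", "wait_for_batch"],
    "crawl": ["create_crawl", "get_crawl_status", "wait_for_crawl"],
    "map": ["map_urls"],
    "statistics_health": ["get_statistics", "health_check"],
}
KNOWN_OPERATIONS = frozenset(
    name for names in OPERATION_GROUPS.values() for name in names
)


def select_operation_names(
    groups: Iterable[str] = (), extra: Iterable[str] = ()
) -> List[str]:
    """Expand group labels and extra names into a validated list of operations."""
    selected: List[str] = []
    for group in groups:
        if group not in OPERATION_GROUPS:
            raise ValueError(f"unknown group: {group!r}")
        selected.extend(OPERATION_GROUPS[group])
    for name in extra:
        if name not in KNOWN_OPERATIONS:
            raise ValueError(f"unknown operation: {name!r}")
        selected.append(name)
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(selected))


# Usage, assuming the package is installed:
#   from geonode_scraper_tools_core import get_operations
#   ops = get_operations(select_operation_names(["batch"], extra=["extract", "map_urls"]))
```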