Python SDK for the Geonode Scraper API

Geonode Scraper SDK

Python SDK for the Geonode Scraper API. It supports synchronous and asynchronous content extraction, job polling, usage statistics, and service health checks.

Requirements

Python 3.10+

Installation

pip install geonode-scraper-sdk

Configuration And Authentication

Create a client configuration with your API base URL and API key.

from geonode_scraper_sdk import Configuration

configuration = Configuration(
    host="https://api.example.com",
    api_key={"APIKeyHeader": "your-api-key"},
)

If you do not set host, the generated client defaults to http://localhost. You normally do not need api_key_prefix for this API.
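
To keep the API key out of source code, the configuration arguments can be read from the environment. The helper below is a minimal sketch: the variable names GEONODE_SCRAPER_API_KEY and GEONODE_SCRAPER_HOST are assumptions made for this example and are not read by the SDK itself.

```python
import os
from typing import Mapping


def config_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for Configuration from environment variables.

    GEONODE_SCRAPER_API_KEY and GEONODE_SCRAPER_HOST are hypothetical names
    chosen for this sketch; the SDK does not define them.
    """
    api_key = env.get("GEONODE_SCRAPER_API_KEY")
    if not api_key:
        raise RuntimeError("GEONODE_SCRAPER_API_KEY is not set")

    kwargs = {"api_key": {"APIKeyHeader": api_key}}

    host = env.get("GEONODE_SCRAPER_HOST")
    if host:
        # Normalize the base URL by dropping any trailing slash.
        kwargs["host"] = host.rstrip("/")
    # When no host is given, the key is omitted and the client default applies.
    return kwargs
```

With the SDK installed, the result can be passed straight through: Configuration(**config_kwargs_from_env()).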

Quick Start

This example performs a synchronous extraction, then prints the markdown result and the number of tokens charged.

from geonode_scraper_sdk import (
    ApiClient,
    ApiException,
    Configuration,
    ExtractRequest,
    ExtractionApi,
    OutputFormat,
    ProcessingMode,
)

configuration = Configuration(
    host="https://api.example.com",
    api_key={"APIKeyHeader": "your-api-key"},
)

with ApiClient(configuration) as api_client:
    api = ExtractionApi(api_client)

    try:
        response = api.extract_v1_extract_post(
            ExtractRequest(
                url="https://example.com",
                formats=[OutputFormat.MARKDOWN],
                processing_mode=ProcessingMode.SYNC,
            )
        )
        print(response.data.markdown)
        print(response.tokens_charged)
    except ApiException as exc:
        print(exc.status)
        print(exc.body)

Async Workflow

When processing_mode=ProcessingMode.ASYNC, the extract call returns an async job response with a job ID and status URL.

from geonode_scraper_sdk import (
    ApiClient,
    Configuration,
    ExtractRequest,
    ExtractionApi,
    OutputFormat,
    ProcessingMode,
)

configuration = Configuration(
    host="https://api.example.com",
    api_key={"APIKeyHeader": "your-api-key"},
)

with ApiClient(configuration) as api_client:
    api = ExtractionApi(api_client)

    # Request markdown explicitly; formats defaults to [OutputFormat.HTML].
    submit = api.extract_v1_extract_post(
        ExtractRequest(
            url="https://example.com",
            formats=[OutputFormat.MARKDOWN],
            processing_mode=ProcessingMode.ASYNC,
        )
    )

    # Fetch the job once. It may still be in progress, so check the status
    # and guard against missing data before reading the result.
    job = api.get_job_result_v1_extract_job_id_get(submit.job_id)
    print(job.status)
    if job.data and job.data.markdown:
        print(job.data.markdown)

Use get_job_result_v1_extract_job_id_get(job_id) to poll a single job, or list_jobs_v1_extract_jobs_get(...) to inspect and filter job history.
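
A single call to the job endpoint may return before the job has finished, so callers usually poll until a terminal status appears. The loop below is a sketch under stated assumptions: the terminal status names "completed" and "failed" are guesses that should be checked against the API's actual status values, and wait_for_job is a hypothetical helper, not part of the SDK.

```python
import time
from typing import Any, Callable, Collection


def wait_for_job(
    fetch_job: Callable[[], Any],
    terminal_statuses: Collection[str] = ("completed", "failed"),  # assumed names
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Call fetch_job() until the job reaches a terminal status or time runs out."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        # Accept either a plain string or an enum member with a .value.
        status = getattr(job.status, "value", job.status)
        if str(status).lower() in terminal_statuses:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)
```

With the client from the example above, this could be called as wait_for_job(lambda: api.get_job_result_v1_extract_job_id_get(submit.job_id)). The sleep and clock parameters exist only so the loop can be exercised without real delays.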

Error Handling

Non-2xx responses raise ApiException or one of its subclasses. The exception includes the HTTP status, response body, and any deserialized error model in exc.data.

from geonode_scraper_sdk import ApiClient, ApiException, Configuration, ExtractionApi, ExtractRequest

configuration = Configuration(
    host="https://api.example.com",
    api_key={"APIKeyHeader": "your-api-key"},
)

with ApiClient(configuration) as api_client:
    api = ExtractionApi(api_client)

    try:
        api.extract_v1_extract_post(ExtractRequest(url="https://example.com"))
    except ApiException as exc:
        print(exc.status)
        print(exc.body)
        print(exc.data)
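
Because the exception carries the HTTP status, transient failures can be retried while other errors propagate immediately. The following is a sketch rather than SDK functionality: call_with_retries is a hypothetical helper, and the set of retryable statuses (429, 502, 503, 504) is an assumption about which errors are worth retrying, not documented behavior of this API.

```python
import time
from typing import Callable, Collection, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    retry_statuses: Collection[int] = (429, 502, 503, 504),  # assumed retryable
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying with exponential backoff on selected HTTP statuses."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:
            # ApiException exposes .status; exceptions without it are not retried.
            status = getattr(exc, "status", None)
            if status not in retry_statuses or attempt == attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

A call from the example above could then be wrapped as call_with_retries(lambda: api.extract_v1_extract_post(ExtractRequest(url="https://example.com"))).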

Request Options

ExtractRequest supports the main extraction controls:

- formats: output formats to return; defaults to [OutputFormat.HTML]
- render_js: use a headless browser for JavaScript-rendered pages; defaults to False
- processing_mode: ProcessingMode.SYNC or ProcessingMode.ASYNC; defaults to ProcessingMode.SYNC
- proxy: optional ProxySettings for country and proxy type selection
- headers: optional dictionary of request headers

Example with additional options:

from geonode_scraper_sdk import ExtractRequest, OutputFormat, ProcessingMode, ProxySettings, ProxyType

request = ExtractRequest(
    url="https://example.com",
    formats=[OutputFormat.HTML, OutputFormat.MARKDOWN],
    render_js=True,
    processing_mode=ProcessingMode.SYNC,
    proxy=ProxySettings(country="US", type=ProxyType.RESIDENTIAL),
    headers={"User-Agent": "geonode-scraper-sdk-demo"},
)

API Reference

- ExtractionApi.extract_v1_extract_post(extract_request)
- ExtractionApi.get_job_result_v1_extract_job_id_get(job_id)
- ExtractionApi.list_jobs_v1_extract_jobs_get(job_id=None, url=None, status=None, output=None, start_date=None, end_date=None, page=None, page_size=None)
- StatisticsApi.get_statistics_v1_statistics_get(start_date=None, end_date=None)
- SystemApi.health_check_health_get()
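
The page and page_size parameters of list_jobs_v1_extract_jobs_get suggest that job history is paginated. The generator below sketches one way to walk all pages. It rests on assumptions not confirmed here: pages are numbered from 1, and a page shorter than page_size marks the end. iter_pages is a hypothetical helper, and the caller supplies a function that returns the list of items for one page.

```python
from typing import Any, Callable, Iterator, Sequence


def iter_pages(
    fetch_page: Callable[[int, int], Sequence[Any]],
    page_size: int = 50,
    first_page: int = 1,  # assumed 1-based numbering
) -> Iterator[Any]:
    """Yield items page by page until a short (or empty) page is returned."""
    page = first_page
    while True:
        items = fetch_page(page, page_size)
        yield from items
        # A page with fewer items than requested is treated as the last one.
        if len(items) < page_size:
            return
        page += 1
```

The fetch_page callable would wrap list_jobs_v1_extract_jobs_get(page=page, page_size=page_size) and pull the job list out of the response; which response attribute holds that list is not documented here, so inspect the response model before relying on it.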

Advanced Usage

Each generated API method also exposes:

- *_with_http_info() to get the deserialized payload together with the status and headers
- *_without_preload_content() to work with the raw HTTP response directly
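
When using the *_with_http_info() variants, it can help to split the result into its parts once. The sketch below assumes the returned object exposes status_code, headers, and data attributes, as is common in generated OpenAPI clients; these attribute names are an assumption, and unpack_http_info is a hypothetical helper.

```python
from typing import Any, Dict, Optional, Tuple


def unpack_http_info(response: Any) -> Tuple[Optional[int], Dict[str, str], Any]:
    """Split a *_with_http_info() result into (status, headers, payload).

    Assumes status_code, headers, and data attributes; missing ones become
    None or an empty dict instead of raising.
    """
    status = getattr(response, "status_code", None)
    raw_headers = getattr(response, "headers", None) or {}
    # Lower-case header names so lookups do not depend on the server's casing.
    headers = {str(name).lower(): str(value) for name, value in dict(raw_headers).items()}
    data = getattr(response, "data", None)
    return status, headers, data
```

For example: status, headers, payload = unpack_http_info(api.extract_v1_extract_post_with_http_info(request)).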

Release history

- 0.3.0: Jun 12, 2026
- 0.2.0: Jun 5, 2026
- 0.1.0 (this version): Apr 10, 2026

Download files

Source Distribution

geonode_scraper_sdk-0.1.0.tar.gz (32.1 kB)

Uploaded Apr 10, 2026 Source

Built Distribution

geonode_scraper_sdk-0.1.0-py3-none-any.whl (62.7 kB)

Uploaded Apr 10, 2026 Python 3

File details

Details for the file geonode_scraper_sdk-0.1.0.tar.gz.

File metadata

Download URL: geonode_scraper_sdk-0.1.0.tar.gz
Upload date: Apr 10, 2026
Size: 32.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for geonode_scraper_sdk-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`8b04e39ec3c04c5754f6fe15f1023c58ba6e41f8d7121af47d6743572289da79`
MD5	`8c25730685262752ffa9342b099c2f91`
BLAKE2b-256	`c85c81d0ec68cd8084f26ed49521d6b9442a0437b68dc0af1f504652ca173b65`
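
A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; the function names are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from typing import Iterable


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Return the hex SHA256 digest of a sequence of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def file_matches_sha256(path: str, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Hash the file at path in chunks and compare with the expected digest."""
    with open(path, "rb") as handle:
        # iter() with a sentinel reads fixed-size chunks until EOF (b"").
        actual = sha256_hex(iter(lambda: handle.read(chunk_size), b""))
    return actual == expected_hex.strip().lower()
```

For the source distribution, the expected value is the SHA256 digest listed in the table above, for example file_matches_sha256("geonode_scraper_sdk-0.1.0.tar.gz", "8b04e39ec3c04c5754f6fe15f1023c58ba6e41f8d7121af47d6743572289da79").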

Provenance

The following attestation bundles were made for geonode_scraper_sdk-0.1.0.tar.gz:

Publisher: python-sdk-publish.yml on geonodecom/scraper-api-sdks

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: geonode_scraper_sdk-0.1.0.tar.gz
- Subject digest: 8b04e39ec3c04c5754f6fe15f1023c58ba6e41f8d7121af47d6743572289da79
- Sigstore transparency entry: 1270753991
- Sigstore integration time: Apr 10, 2026
Source repository:
- Permalink: geonodecom/scraper-api-sdks@b1cda306deeff17df9d21abef600f8712367f0f9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/geonodecom
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-sdk-publish.yml@b1cda306deeff17df9d21abef600f8712367f0f9
- Trigger Event: push

File details

Details for the file geonode_scraper_sdk-0.1.0-py3-none-any.whl.

File metadata

Download URL: geonode_scraper_sdk-0.1.0-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 62.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for geonode_scraper_sdk-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b8ab23b24ef3d803d166cea8f55508ff6a382076fe99ee674df5936e13bd4aaa`
MD5	`349a5194f48e1be50a389cb04dde3582`
BLAKE2b-256	`40925520f7e35df02901018bacf87fbc12afd89610e67d109bbc85955a07cec2`

Provenance

The following attestation bundles were made for geonode_scraper_sdk-0.1.0-py3-none-any.whl:

Publisher: python-sdk-publish.yml on geonodecom/scraper-api-sdks

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: geonode_scraper_sdk-0.1.0-py3-none-any.whl
- Subject digest: b8ab23b24ef3d803d166cea8f55508ff6a382076fe99ee674df5936e13bd4aaa
- Sigstore transparency entry: 1270754035
- Sigstore integration time: Apr 10, 2026
Source repository:
- Permalink: geonodecom/scraper-api-sdks@b1cda306deeff17df9d21abef600f8712367f0f9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/geonodecom
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-sdk-publish.yml@b1cda306deeff17df9d21abef600f8712367f0f9
- Trigger Event: push
