Official Kadoa SDK for Python - Web data extraction and automation
Kadoa SDK for Python
Official Python SDK for the Kadoa API, providing easy integration with Kadoa's web data extraction platform.
Installation
We recommend using uv, a fast and modern Python package manager:
uv add kadoa-sdk
# or
uv pip install kadoa-sdk
Alternatively, you can use traditional pip:
pip install kadoa-sdk
Requirements: Python 3.11 or higher
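If you want a script to fail early on an unsupported interpreter, a version check along these lines may help. This is a minimal sketch using only the standard library; `meets_minimum` and `MINIMUM_PYTHON` are names made up for this example and are not part of the SDK.

```python
import sys

# Minimum interpreter version stated in the requirements above.
MINIMUM_PYTHON = (3, 11)


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor; the patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


print("supported" if meets_minimum() else "Python 3.11 or higher is required")
```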
Quick Start
from kadoa_sdk import KadoaClient, KadoaClientConfig
from kadoa_sdk.extraction.types import ExtractionOptions
client = KadoaClient(
    KadoaClientConfig(
        api_key='your-api-key'
    )
)
# AI automatically detects and extracts data
result = client.extraction.run(
    ExtractionOptions(
        urls=['https://sandbox.kadoa.com/ecommerce'],
        name='My First Extraction'
    )
)
print(f"Extracted {len(result.data)} items")
That's it! The SDK detects and extracts the data automatically. For more control, specify exactly which fields you want using the builder API.
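The Quick Start passes the API key as a string literal. A common alternative is to read it from the environment so it stays out of source control. The sketch below assumes an environment variable named `KADOA_API_KEY`; that name and the helper `load_api_key` are chosen here for illustration and are not something the SDK is documented to read on its own.

```python
import os


def load_api_key(env_var="KADOA_API_KEY", environ=None):
    """Return the API key stored in `env_var`, or raise if it is missing or blank.

    `environ` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The returned value can then be passed as `api_key` when building `KadoaClientConfig`, in place of `'your-api-key'`.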
Advanced Examples
Builder API with Custom Schema
Define exactly what fields to extract using the fluent builder API:
from kadoa_sdk import KadoaClient, KadoaClientConfig
from kadoa_sdk.extraction.types import ExtractOptions
from kadoa_sdk.schemas.schema_builder import SchemaBuilder, FieldOptions
client = KadoaClient(KadoaClientConfig(api_key='your-api-key'))
# Define custom schema
extraction = client.extract(
    ExtractOptions(
        urls=['https://example.com/products'],
        name='Product Extraction',
        extraction=lambda schema: (
            schema.entity('Product')
            .field('title', 'Product title', 'STRING')
            .field('price', 'Product price', 'MONEY', FieldOptions(example='$99.99'))
            .field('description', 'Product description', 'STRING')
            .field('image', 'Product image URL', 'IMAGE', FieldOptions(example='https://example.com/image.jpg'))
        )
    )
).create()
# Run and wait for completion
finished = extraction.run()
print(f"Extracted {len(finished.fetch_data().data)} products")
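For readers new to fluent builders, the chained `.entity(...).field(...)` calls work because each method returns the builder itself. The sketch below illustrates that pattern with stand-in classes. `SketchSchema` and `SketchField` are hypothetical and do not reflect how the SDK's schema builder is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchField:
    """Hypothetical record of one field definition."""
    name: str
    description: str
    data_type: str
    example: Optional[str] = None


class SketchSchema:
    """Hypothetical stand-in for the object handed to the `extraction` lambda."""

    def __init__(self):
        self.entity_name = None
        self.fields = []

    def entity(self, name):
        self.entity_name = name
        return self  # returning self is what allows the next call to chain

    def field(self, name, description, data_type, example=None):
        self.fields.append(SketchField(name, description, data_type, example))
        return self


# Same shape as the lambda in the example above, applied to the stand-in.
define = lambda schema: (
    schema.entity("Product")
    .field("title", "Product title", "STRING")
    .field("price", "Product price", "MONEY", example="$99.99")
)

built = define(SketchSchema())
print(built.entity_name, [f.name for f in built.fields])
```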
Notifications Setup
Configure notifications to be alerted when workflows complete:
from kadoa_sdk.notifications import NotificationOptions
# Reuses `client` and `ExtractOptions` from the previous example
extraction = client.extract(
    ExtractOptions(
        urls=['https://example.com'],
        name='Monitored Extraction'
    )
).with_notifications(
    NotificationOptions(
        events=['workflow_finished', 'workflow_failed'],
        channels={'email': True}
    )
).create()
finished = extraction.run()
Error Handling
Handle errors gracefully with proper exception types:
from kadoa_sdk import KadoaClient, KadoaClientConfig
from kadoa_sdk.core import KadoaSdkError, KadoaHttpError
from kadoa_sdk.extraction.types import ExtractionOptions
client = KadoaClient(KadoaClientConfig(api_key='your-api-key'))
try:
    result = client.extraction.run(
        ExtractionOptions(
            urls=['https://example.com'],
            name='My Extraction'
        )
    )
# Handle the HTTP-specific error first, in case KadoaHttpError subclasses KadoaSdkError
except KadoaHttpError as e:
    print(f"HTTP Error: {e.message}")
    print(f"Status: {e.http_status}")
    print(f"Endpoint: {e.endpoint}")
except KadoaSdkError as e:
    print(f"SDK Error: {e.message}")
    print(f"Error Code: {e.code}")
    if e.details:
        print(f"Details: {e.details}")
except Exception as e:
    print(f"Unexpected error: {e}")
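Some failures are worth retrying rather than just reporting. The sketch below shows one generic way to wrap any callable with retries and exponential backoff. `call_with_retry` and `TransientError` are hypothetical names for this example. Which SDK errors are actually transient is not stated here, so the caller chooses the exception types through `retry_on`.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for an error that is worth retrying."""


def call_with_retry(operation, attempts=3, base_delay=0.5,
                    retry_on=(TransientError,), sleep=time.sleep):
    """Call `operation()` up to `attempts` times, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a function that fails twice before succeeding.
state = {"calls": 0}


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("temporary failure")
    return "ok"


print(call_with_retry(flaky, base_delay=0.01))
```

Exceptions that are not listed in `retry_on` propagate immediately, so the `except` clauses in the example above still apply to the final outcome.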
Paginated Data Fetching
Fetch data in pages for large datasets:
from kadoa_sdk.extraction.types import FetchDataOptions
# Fetch first page
result = client.extraction.fetch_data(
    FetchDataOptions(
        workflow_id='workflow-123',
        page=1,
        limit=50
    )
)
print(f"Page {result.pagination.page} of {result.pagination.total_pages}")
print(f"Total records: {result.pagination.total_count}")
# Fetch all data automatically
all_data = client.extraction.fetch_all_data(
    FetchDataOptions(workflow_id='workflow-123', limit=100)
)
print(f"Fetched {len(all_data)} total records")
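To make the relationship between page-by-page fetching and fetching everything concrete, the sketch below walks pages until `total_pages` is reached and concatenates the records. It runs against an in-memory stand-in. `SketchPage`, `SketchPagination`, `make_fake_fetch`, and `collect_all` are hypothetical, and this is not necessarily how `fetch_all_data` is implemented.

```python
from dataclasses import dataclass


@dataclass
class SketchPagination:
    page: int
    total_pages: int
    total_count: int


@dataclass
class SketchPage:
    data: list
    pagination: SketchPagination


def make_fake_fetch(records):
    """Return a fetch(page, limit) function that serves slices of `records`."""
    def fetch(page, limit):
        start = (page - 1) * limit
        total_pages = max(1, -(-len(records) // limit))  # ceiling division
        return SketchPage(
            data=records[start:start + limit],
            pagination=SketchPagination(page, total_pages, len(records)),
        )
    return fetch


def collect_all(fetch_page, limit=100):
    """Request page 1, 2, ... until the reported last page, gathering records."""
    collected = []
    page = 1
    while True:
        result = fetch_page(page, limit)
        collected.extend(result.data)
        if page >= result.pagination.total_pages:
            break
        page += 1
    return collected


records = list(range(250))
print(len(collect_all(make_fake_fetch(records), limit=100)))
```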
Async Data Fetching
Process large datasets efficiently with async generators:
import asyncio
from kadoa_sdk.extraction.types import FetchDataOptions
def process_record(record):
    # Placeholder: replace with your own per-record handling
    print(record)
async def process_all_pages():
    async for page in client.extraction.fetch_data_pages(
        FetchDataOptions(workflow_id='workflow-123', limit=100)
    ):
        print(f"Processing page {page.pagination.page}")
        for record in page.data:
            # Process each record
            process_record(record)
asyncio.run(process_all_pages())
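The `async for` pattern above can be tried without network access by swapping in a local async generator. In this sketch, `fake_pages` and `count_records` are hypothetical helpers. Pages are yielded one at a time, so only the current page needs to be held in memory.

```python
import asyncio


async def fake_pages(records, limit):
    """Hypothetical async generator that yields `records` in chunks of `limit`."""
    for start in range(0, len(records), limit):
        await asyncio.sleep(0)  # yield control, as a real network call would
        yield records[start:start + limit]


async def count_records(pages):
    """Consume an async iterator of pages and count the records seen."""
    total = 0
    async for page in pages:
        total += len(page)
    return total


total = asyncio.run(count_records(fake_pages(list(range(250)), limit=100)))
print(total)
```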
Documentation
For comprehensive documentation, examples, and API reference, visit:
- Full Documentation - Complete guide with examples
- API Reference - Detailed API documentation
- GitHub Examples - Working code examples
Requirements
- Python 3.11 or higher
- Dependencies are automatically installed
Support
- Documentation: docs.kadoa.com
- Support: support@kadoa.com
- Issues: GitHub Issues
License
MIT
Download files
File details
Details for the file kadoa_sdk-0.8.0rc7.tar.gz.
File metadata
- Download URL: kadoa_sdk-0.8.0rc7.tar.gz
- Upload date:
- Size: 233.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5d94c8eb3e062ff94df6edc660c9da1a96c61ff80085af737ee201003bce4de0 |
| MD5 | 2a44dce4cf0bbcba33257d38a6061701 |
| BLAKE2b-256 | 2db8a74a2bb79ed27f2633697a41e44478127d95201520b0611d2aea7bdd3ebd |
File details
Details for the file kadoa_sdk-0.8.0rc7-py3-none-any.whl.
File metadata
- Download URL: kadoa_sdk-0.8.0rc7-py3-none-any.whl
- Upload date:
- Size: 776.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2ba298db2c98de420936f4b453112535339895bcbf4d2621ae27f0a8c943d77b |
| MD5 | 35ad7a693849f8c7cf7947664bc0b56a |
| BLAKE2b-256 | b83a55b7a119246b474d27b913d3a1a2b8798a2458aa0a3fb5a3b899e11996e7 |