Python SDK for Grabba API
Grabba Python SDK
Grabba Python SDK provides a simple and intuitive interface for scheduling web data extraction jobs, retrieving job results, and managing your extraction workflows.
Installation
Install the SDK using pip:
pip install grabba
Basic Setup
Import the client and required types
from grabba import Grabba, Job, JobNavigationType, JobSchedulePolicy, JobTaskType
Initialize a client instance
grabba = Grabba(api_key="your-api-key", region="US")  # region is optional and defaults to "US"
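To keep the API key out of source code, the constructor arguments can be read from the environment. The following is a minimal sketch, not part of the SDK: `client_settings` is a hypothetical helper, and the variable names `GRABBA_API_KEY` and `GRABBA_REGION` are assumptions chosen for illustration.

```python
import os
from typing import Dict, Mapping, Optional


def client_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for Grabba(...) from environment variables.

    GRABBA_API_KEY and GRABBA_REGION are hypothetical names used only in
    this sketch; the SDK itself does not define them.
    """
    env = os.environ if env is None else env
    api_key = env.get("GRABBA_API_KEY", "").strip()
    if not api_key:
        raise ValueError("GRABBA_API_KEY is not set")
    # The README states that region is optional and defaults to "US".
    region = env.get("GRABBA_REGION", "US").strip() or "US"
    return {"api_key": api_key, "region": region}


# Usage (requires the grabba package to be installed):
#   from grabba import Grabba
#   grabba = Grabba(**client_settings())
```

Passing a mapping explicitly instead of relying on `os.environ` makes the helper easy to exercise in unit tests.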
Methods
extract
extract(job: Job) -> Dict
Schedules a new web data extraction job.
Parameters:
job: A Job object containing the extraction configuration.
Returns:
- A dictionary containing the response status, message, and job_result.
Example:
job = Job(
    url="https://docs.grabba.dev/home",
    schedule={"policy": JobSchedulePolicy.IMMEDIATELY.value},
    navigation={"type": JobNavigationType.NONE},
    tasks=[{
        "type": JobTaskType.WEB_PAGE_AS_MARKDOWN.value,
        "options": {"onlyMainContent": True}
    }],
)
response = grabba.extract(job)
print(f"Job completed with status: {response['status']}")
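Since the documented return value is a dictionary with status, message, and job_result, callers may want to read those keys defensively. A minimal sketch follows; `describe_response` is a hypothetical helper, and the "success" status string is taken from the Error Handling example in this README rather than from a confirmed list of status values.

```python
from typing import Any, Dict


def describe_response(response: Dict[str, Any]) -> str:
    """Return a one-line summary of an extract() or schedule_job() response."""
    # .get() avoids a KeyError if a key is absent from the response.
    status = response.get("status", "unknown")
    message = response.get("message", "")
    has_result = response.get("job_result") is not None
    if status == "success":
        return f"success (job_result present: {has_result})"
    return f"{status}: {message}" if message else str(status)
```

With the client from Basic Setup, this could be called as `print(describe_response(grabba.extract(job)))`.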
schedule_job
schedule_job(job_id: str) -> Dict
Schedules an existing job for execution.
Parameters:
job_id: The ID of the job to schedule.
Returns:
- A dictionary containing the response status, message, and job_result.
Example:
response = grabba.schedule_job("12345")
print(f"Job completed with status: {response['status']}")
get_jobs
get_jobs() -> GetJobsResponse
Retrieves a list of all jobs associated with the API key.
Returns:
- A list of Job objects.
Example:
jobs = grabba.get_jobs()
for job in jobs:
    print(job)
get_job
get_job(job_id: str) -> GetJobResponse
Retrieves details of a specific job by its ID.
Parameters:
job_id: The ID of the job to retrieve.
Returns:
- A JobDetail object containing job details.
Example:
job = grabba.get_job("12345")
print(job)
get_job_result
get_job_result(job_result_id: str) -> JobResult
Retrieves the results of a specific job by its result ID.
Parameters:
job_result_id: The ID of the job result to retrieve.
Returns:
- A JobResult object.
Example:
result = grabba.get_job_result("67890")
print(result)
get_available_regions
get_available_regions() -> List[Dict[str, PuppetRegion]]
Retrieves a list of available regions for Web Agent execution.
Returns:
- A list of region objects.
Example:
regions = grabba.get_available_regions()
print(regions)
Types
Job
Represents a web data extraction job.
@dataclass
class Job:
    url: str
    tasks: List[JobTask]
    schedule: Optional[JobSchedule] = None
    navigation: Optional[JobNavigation] = None
    puppet_config: Optional[WebAgentConfig] = None
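In the definition above, only url and tasks are required; the other fields default to None. The sketch below illustrates this with a local stand-in, `JobSketch`, that mirrors the documented field names but uses plain dictionaries for the nested types. Both `JobSketch` and `to_payload` are hypothetical, and how the SDK actually serializes a Job is not described here.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class JobSketch:
    """Local stand-in mirroring the documented Job fields (not the SDK class)."""
    url: str
    tasks: List[Dict[str, Any]]
    schedule: Optional[Dict[str, Any]] = None
    navigation: Optional[Dict[str, Any]] = None
    puppet_config: Optional[Dict[str, Any]] = None


def to_payload(job: JobSketch) -> Dict[str, Any]:
    """Convert to a plain dict, dropping optional fields left as None."""
    return {key: value for key, value in asdict(job).items() if value is not None}


# Only the two required fields are supplied; the optional ones stay None.
job = JobSketch(
    url="https://docs.grabba.dev/home",
    tasks=[{"type": "webPageAsMarkdown", "options": {"onlyMainContent": True}}],
)
payload = to_payload(job)
```

Here `payload` contains only the url and tasks keys, because the unset optional fields are filtered out.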
JobTask
Represents a single task in an extraction job.
@dataclass
class JobTask:
    type: JobTaskType
    options: Optional[Union[SpecificDataExtractionOptions, WebpageAsMarkdownOptions, WebScreenCaptureOptions]] = None
JobTaskType
Enumeration of available job task types.
class JobTaskType(str, Enum):
    WEB_PAGE_AS_HTML = "webPageAsHTML"
    WEB_PAGE_METADATA = "webPageMetadata"
    WEB_SCREEN_CAPTURE = "webScreenCapture"
    WEB_PAGE_AS_MARKDOWN = "webPageAsMarkdown"
    SPECIFIC_DATA_EXTRACTION = "specificDataExtraction"
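Because JobTaskType mixes in str, its members behave like plain strings in comparisons and JSON encoding. The sketch below repeats the enum definition locally so that it runs without the SDK installed; the behaviour shown comes from Python's standard enum and json modules, not from anything specific to Grabba.

```python
import json
from enum import Enum


class JobTaskType(str, Enum):
    # Copied from the definition above so the snippet runs without the SDK.
    WEB_PAGE_AS_HTML = "webPageAsHTML"
    WEB_PAGE_METADATA = "webPageMetadata"
    WEB_SCREEN_CAPTURE = "webScreenCapture"
    WEB_PAGE_AS_MARKDOWN = "webPageAsMarkdown"
    SPECIFIC_DATA_EXTRACTION = "specificDataExtraction"


markdown = JobTaskType.WEB_PAGE_AS_MARKDOWN

# Members compare equal to their string values because of the str mixin.
is_plain_string = markdown == "webPageAsMarkdown"

# Looking a value up returns the matching member; unknown values raise ValueError.
same_member = JobTaskType("webPageAsMarkdown") is markdown

# json.dumps encodes a str-mixin member as its underlying value.
encoded = json.dumps({"type": markdown})
```

Calling `.value` explicitly, as the extract example does, yields the same string and makes the conversion visible to the reader.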
JobResult
Represents the result of a job.
@dataclass
class JobResult:
    id: str
    output: Dict[str, Dict]
    start_time: datetime
    stop_time: datetime
    duration: str
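Since duration is a string whose format is not specified here, elapsed time can instead be computed from the two datetime fields. The sketch below uses a local copy of the dataclass so that it runs without the SDK; `elapsed` is a hypothetical helper and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict


@dataclass
class JobResult:
    # Local copy of the documented fields so the snippet runs without the SDK.
    id: str
    output: Dict[str, Dict]
    start_time: datetime
    stop_time: datetime
    duration: str


def elapsed(result: JobResult) -> timedelta:
    """Return the run time computed from the start and stop timestamps."""
    return result.stop_time - result.start_time


# Invented sample values; the "42s" duration format is an assumption.
sample = JobResult(
    id="67890",
    output={},
    start_time=datetime(2025, 1, 1, 12, 0, 0),
    stop_time=datetime(2025, 1, 1, 12, 0, 42),
    duration="42s",
)
run_time = elapsed(sample)
```

A timedelta supports arithmetic and comparison directly, which is convenient when aggregating run times across several results.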
WebAgentConfig
Configuration for Web Agent.
@dataclass
class WebAgentConfig:
    region: PuppetRegion
    device_type: Optional[PuppetDeviceType] = None
    viewport: Optional[Dict] = None
Error Handling
The SDK raises exceptions for:
- Invalid API keys
- Failed API requests
- Missing or invalid parameters
Example:
try:
    response = grabba.extract(job)
    if response["status"] == "success":
        # job_result is the key documented in the extract return value
        print("Job result:", response["job_result"])
    else:
        print("Error message:", response["message"])
except Exception as err:
    print("Error:", err)
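For transient failures such as a failed API request, one common approach is to retry with exponential backoff. The following is a minimal sketch, not part of the SDK: `call_with_retries` is a hypothetical helper, and because this README does not name the SDK's exception classes, it catches Exception broadly. In practice, errors such as an invalid API key are not worth retrying.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on any exception with exponential backoff.

    The delay doubles after each failure: base_delay, 2 * base_delay, ...
    The last exception is re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")


# Usage with the client from Basic Setup:
#   response = call_with_retries(lambda: grabba.extract(job))
```

The sleep function is injectable so that tests can record the delays instead of actually waiting.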
Contributing
Contributions are welcome! Please open an issue or submit a pull request on GitHub.
License
This project is licensed under the MIT License. See the LICENSE file for details.