Bodo Platform SDK
Project description
Bodo Platform SDK
A simple SDK for Bodo Cloud Platform.
List of contents:
- Getting Started
- Job resource
- Cluster resource
- Workspace resource
- Cloud config
- Instance Role Manager
- Billing module
Getting started
The first step is to create an API Token in the Bodo Platform for Bodo SDK authentication. Navigate to API Tokens in the Admin Console to generate a token. Copy and save the token's Client ID and Secret Key and use them for BodoClient definition:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
Alternatively, set BODO_CLIENT_ID
and BODO_SECRET_KEY
environment variables
to avoid requiring keys:
from bodosdk.client import get_bodo_client
client = get_bodo_client()
Other bodo client options
- print_logs - default False, if enabled all API calls will be printed
from bodosdk.client import get_bodo_client
from bodosdk.models import WorkspaceKeys
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys, print_logs=True)
Job resource
Module responsible for managing jobs in a workspace
- Bodo Platform Batch Jobs
- Create batch job definition
- List batch job definitions
- Get batch job definition by id
- Get batch job definition by name
- Remove batch job definition
- Submit a batch job run
- List batch job runs
- Get batch job run
- Cancel batch job run
- Cancel all job runs on a cluster UUIDs
- Check batch job run status
- Submit SQL job run
- Job Run waiter
- get job logs
Bodo Platform Batch Jobs
Create batch job definition
BodoClient.job.create_batch_job_definition(job_definition: CreateBatchJobDefinition)
Creates batch job definition in the given workspace.
- Example 1: Create batch job definition for a workspace source script
from bodosdk.models import WorkspaceKeys,BatchJobDefinition
from bodosdk.client import get_bodo_client
from bodosdk.models.job import CreateBatchJobDefinition, JobConfig, JobSource, JobSourceType, SourceCodeType, \
WorkspaceDef, RetryStrategy
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
workspace_source_def = JobSource(
type=JobSourceType.WORKSPACE,
definition=WorkspaceDef(
path="Example-path/batch-job-defs",
),
)
retry_strategy = RetryStrategy(
num_retries=1,
retry_on_timeout=False,
delay_between_retries=2,
)
jobConfig = JobConfig(
source=workspace_source_def,
source_code_type=SourceCodeType.PYTHON,
sourceLocation="test.py",
args=None,
retry_strategy=retry_strategy,
timeout=10000,
env_vars=None,
)
createBatchJobDef = CreateBatchJobDefinition(
name="test-job",
config=jobConfig,
description="test-batch-job-description-attempt",
cluster_config={
"bodoVersion": "2023.1.3",
"instance_type": "c5.2xlarge",
"workers_quantity": 2,
"accelerated_networking": False,
}, )
jobdef = client.job.create_batch_job_definition(createBatchJobDef)
-
Example 2: Create batch job definition for a git source script
from bodosdk.models import WorkspaceKeys, CreateBatchJobDefinition, BatchJobDefinition from bodosdk.client import get_bodo_client from bodosdk.models.job import CreateBatchJobDefinition, JobConfig, JobSource, JobSourceType, SourceCodeType, \ WorkspaceDef, RetryStrategy keys = WorkspaceKeys( client_id='XYZ', secret_key='XYZ' ) client = get_bodo_client(keys) git_source_def = JobSource( type=JobSourceType.GIT, definition=GitDef( repo_url='https://github.com/Bodo-inc/Bodo-examples.git', username='XYZ', token='XYZ' ), ) retry_strategy = RetryStrategy( num_retries=1, retry_on_timeout=False, delay_between_retries=2, ) jobConfig = JobConfig( source=git_source_def, source_code_type=SourceCodeType.PYTHON, sourceLocation="test.py", args=None, retry_strategy=retry_strategy, timeout=10000, env_vars=None, ) createBatchJobDef = CreateBatchJobDefinition( name="test-job", config=jobConfig, description="test-batch-job-description-attempt", cluster_config={ "bodoVersion": "2023.1.3", "instance_type": "c5.2xlarge", "workers_quantity": 2, "accelerated_networking": False, }, ) jobdef = client.job.create_batch_job_definition(createBatchJobDef)
-
**Example 3. Create batch job definition for a S3 source script
To run a script file located on an S3 bucket, the cluster must have the required permissions to read the files from S3. This can be provided by creating an Instance Role with access to the required S3 bucket. Please make sure to specify an Instance Role that should be attached to the Job Cluster. The policy attached to the roles should provide access to both the bucket and its contents.
from bodosdk.client import get_bodo_client
from bodosdk.models.job import RetryStrategy
from bodosdk.models import (CreateRoleDefinition, InstanceRole, JobConfig, JobSource, JobSourceType, S3Source, SourceCodeType, WorkspaceKeys, CreateBatchJobDefinition)
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
git_source_def = JobSource
keys = WorkspaceKeys(client_id='XYZ', secret_key='XYZ')
role_definition = CreateRoleDefinition(
name="test-sdk-role-creation",
description="testing",
data=InstanceRole(role_arn="arn:aws:iam::accountID:role/name")
)
list_of_instance_roles = client.instance_role.list()
role_to_use = None
for role in list_of_instance_roles:
if role.name == 'role_i_want_to_use':
role_to_use = role
break
s3_job_source = JobSource(
type=JobSourceType.S3,
definition=S3Source(
bucket_path='s3://path-to-my-bucket/my_job_script_folder/',
type=JobSourceType.S3,
bucket_region='region',
),
)
createBatchJobDef = CreateBatchJobDefinition(
name="test-job",
config=JobConfig(
source=s3_job_source,
source_code_type=SourceCodeType.PYTHON,
sourceLocation="to_sql.py",
args=None,
timeout=10000,
env_vars=None),
description="test-batch-job-description-attempt",
cluster_config={
"bodo_version": "2023.1.3",
"instance_type": "c5.2xlarge",
"workers_quantity": 2,
"accelerated_networking": False,
"instance_role_uuid": role_to_use.uuid,
})
jobdef = client.job.create_batch_job_definition(createBatchJobDef)
List batch job definitions
BodoClient.job.list_batch_job_definitions(page: int, size: int, order: PaginationOrder)
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
page |
int |
The page number has a lower limit of 1, default value 1 | No |
size |
int |
The size has a maximum allowed value of 100 and default value of 5 | No |
order |
PaginationOrder |
The order in which the elements would be displayed. Default is created_at ASC | No |
Returns: Return lists all batch job runs in the given workspace filtered by given parameters - List[JobRunResponse]
Lists all batch job definitions in the given workspace.
Example:
from typing import List
from bodosdk.models import PersonalKeys, WorkspaceKeys, JobConfig, SourceCodeType, RetryStrategy, JobSourceType, \
WorkspaceDef, CreateBatchJobDefinition
from bodosdk.client import get_bodo_client
from bodosdk.models.job import CreateJobRun, JobSource, JobRunStatus, BatchJobDefinitionResponse
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
jobdefs: List[BatchJobDefinitionResponse] = client.job.list_batch_job_definitions()
Get batch job definition by id
BodoClient.job.get_batch_job_definition(job_definition_id: str)
Gets specific batch job definition by id.
Example:
from typing import List
from bodosdk.models import PersonalKeys, WorkspaceKeys, JobConfig, SourceCodeType, RetryStrategy, JobSourceType, \
WorkspaceDef, CreateBatchJobDefinition
from bodosdk.client import get_bodo_client
from bodosdk.models.job import CreateJobRun, JobSource, JobRunStatus, BatchJobDefinitionResponse
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
jobdef: BatchJobDefinitionResponse = client.job.get_batch_job_definition('04412S5b-300e-42db-84d4-5f22f7506594')
Get batch job definition by name
BodoClient.job.get_batch_job_definition_by_name(name: str)
Return the batch job definition based on the name provided
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import BatchJobDefinitionResponse
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
jobdef: BatchJobDefinitionResponse = client.job.get_batch_job_definition('batch-job-1')
Remove batch job definition
BodoClient.job.remove_batch_job_definition(job_definition_id: str)
Removes specific batch job definition by id.
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
client.job.remove_batch_job_definition('04412S5b-300e-42db-84d4-5f22f7506594')
Submit a batch job run
BodoClient.job.submit_batch_job_run(job_run: CreateJobRun)
Submits a job run for a given batch job definition.
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import CreateJobRun
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
client.job.submit_batch_job_run(CreateJobRun(batchJobDefinitionUUID='04412S5b-300e-42db-84d4-5f22f7506594', clusterUUID='12936Q5z-109d-89yi-23c4-3d91u1219823'))
List batch job runs
BodoClient.job.list_batch_job_runs()
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
batch_job_ids |
List[UUID] |
List of Ids of the batch job definitions | No |
statuses |
List[JobRunStatus] |
List of Job Run status as filters | No |
cluster_ids |
List[UUID] |
List of cluster ids as filters | No |
started_at |
List[UUID] |
started at time stamp filter | No |
finished_at |
List[UUID] |
finished at timestamp filter | No |
page |
int |
The page number as integer. Default page 1 | No |
page_size |
int |
The number of elements in a page. Default page_size = 10 | No |
order |
PaginationOrder |
The order of listing for the job run ordered by created_at | No |
Returns: Return lists all batch job runs in the given workspace filtered by given parameters - List[JobRunResponse]
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import JobRunStatus
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
jobruns = client.job.list_batch_job_runs(statuses=[JobRunStatus.FAILED],
cluster_ids=['ba62e653-312a-490e-9457-71d7bc096959'])
List batch job runs by batch job name
BodoClient.job.list_job_runs_by_batch_job_name()
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
batch_job_names |
List[str] |
List of Ids of the batch job definitions | No |
statuses |
List[JobRunStatus] |
List of Job Run Statuses | No |
cluster_ids |
List[str] |
Cluster IDs filter for the batch job run | No |
started_at |
List[str] |
Started at time filter | No |
finished_at |
List[str] |
Finished at time filter | No |
page |
List[str] |
Page no to fetch. Default value 1 | No |
page_size |
int |
No of elements in a page. Default value 10 | No |
order |
PaginationOrder |
Job run sorting order by created_at. Default value ASC | No |
Lists all batch job runs in the given workspace filtered by given parameters.
Returns: Returns a list of JobRunResponse - List[JobRunResponse]
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import JobRunStatus
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
jobruns = client.job.list_job_runs_by_batch_job_name(batch_job_names=['production-job-1'], statuses=[JobRunStatus.FAILED], cluster_ids=['ba62e653-312a-490e-9457-71d7bc096959'])
Get batch job run
BodoClient.job.get_batch_job_run(uuid: str)
Returns batch job run based on the job_run_id provided.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Returns batch job run for the id | Yes |
Returns:
Parameter | Type | Description |
---|---|---|
name |
str |
The name of the job run |
clusterUUID |
Optional[Union[UUID, None]] |
The cluster UUID on which the job run was executed |
cluster |
Optional[JobCluster] |
The cluster on which the job run was executed |
type |
JobRunType |
The type of Job Run |
config |
JobConfig |
The JobConfig of the cluster |
submittedAt |
datetime |
The time at which the job was submitted |
finishedAt |
Optional[Union[datetime, None]] |
The time at which the job finished |
startedAt |
Optional[Union[datetime, None]] |
The time at which job started |
status |
JobRunStatus |
Job run status |
batchJobDefinitionConfigOverrides |
Optional[JobConfigOverride] |
The job run config override |
batch_job_definition |
Optional[BatchJobDefinitionUUID] |
UUID of batch job definition |
reason |
Optional[str] |
The reason for job statue change |
submitter |
Optional[str] |
Email of the person who submitted the job |
Gets specific batch job run by job_run_id
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
jobrun = client.job.get_batch_job_run('04412S5b-300e-42db-84d4-5f22f7506594')
Cancel batch job run
BodoClient.job.cancel_batch_job_run(uuid: Union[str, UUID])
Cancels specific batch job run by job_run_id
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
uuid: Union[str, UUID] |
UUID of the job run | Yes |
Returns:
None returned on success
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
client.job.cancel_batch_job_run('04412S5b-300e-42db-84d4-5f22f7506594')
Cancel all job runs on a cluster UUIDs
BodoClient.job.cancel_all_job_runs(cluster_uuids: Union[List[str], List[UUID]])
Cancels all the job runs for a set of cluster UUIDs provided as a function parameter
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
cluster_uuids |
Union[List[str], List[UUID]] |
Cancels all job runs on a list of cluster UUIDs | Yes |
Returns:
None returned on success
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
client.job.cancel_all_job_runs(['04412S5b-300e-42db-84d4-5f22f7506594'])
Check batch job run status
BodoClient.job.check_job_run_status(batch_job_run_uuid: Union[str, UUID])->JobRunStatus
Checks status of specific batch job run by id.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
batch_job_run_uuid |
Union[str, UUID] |
Fetches the | Yes |
Returns:
JobRunStatus returned on success
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
status = client.job.check_job_run_status('04412S5b-300e-42db-84d4-5f22f7506594')
Submit SQL job run
BodoClient.job.submit_sql_job_run(create_sql_job_run: CreateSQLJobRun)
Submits a SQL query as a job run. The SQL job run contains the SQL query text and the cluster UUID on which the query will be executed along with catalog and the query tags. The query tags field accepts a JSON which returns the query tags associated with a query.
!!! note This needs a database [catalog][catalog] to be configured in the workspace.
Parameters(CreateSQLJobRun):
Parameter | Type | Description | Required | Default |
---|---|---|---|---|
Job Run Type |
JobRunType |
Type of Job Run | Yes | None |
clusterUUID |
str |
The cluster UUID on which the job run will be executed | No | None |
catalog |
str |
The catalog which the query use | Yes | None |
sql_query_text |
str |
The SQL query text | Yes | None |
query_tags |
Dict[str, str] |
The query tags associated with the query | No | {} |
timeout |
int |
The timeout for the query in minutes | No | 60 |
env_vars |
Dict[str, str] |
The environment variables for the query | No | {} |
retry_strategy |
RetryStrategy |
The retry strategy for the query | No | None |
cluster_config |
JobClusterDefinition |
Job Cluster Configuration | No | None |
args |
Union[str, Dict] |
The arguments for the query | No | {} |
Returns:
Parameter | Type | Description |
---|---|---|
name |
str |
The name of the job run |
clusterUUID |
Optional[Union[UUID, None]] |
The cluster UUID on which the job run was executed |
cluster |
Optional[JobCluster] |
The cluster on which the job run was executed |
type |
JobRunType |
The type of Job Run |
config |
JobConfig |
The JobConfig of the cluster |
submittedAt |
datetime |
The time at which the job was submitted |
finishedAt |
Optional[Union[datetime, None]] |
The time at which the job finished |
startedAt |
Optional[Union[datetime, None]] |
The time at which job started |
status |
JobRunStatus |
Job run status |
batchJobDefinitionConfigOverrides |
Optional[JobConfigOverride] |
The job run config override |
batch_job_definition |
Optional[BatchJobDefinitionUUID] |
UUID of batch job definition |
reason |
Optional[str] |
The reason for job statue change |
submitter |
Optional[str] |
Email of the person who submitted the job |
Example:
from bodosdk.models import WorkspaceKeys, CreateSQLJobRun
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_client(keys)
job_run = client.job.submit_sql_job_run(CreateSQLJobRun(
clusterUUID='04412S5b-300e-42db-84d4-5f22f7506594',
catalog="SNOWFLAKE_CATALOG",
sql_query_text="SELECT * FROM PUBLIC.TABLE LIMIT 10",
query_tags={"DAG_ID":"398482", "MACHINE_ID": "1934"}))
Job Run waiter
BodoClient.job.get_job_run_waiter()
Returns a waiter object that waits until the job run uuid specified finishes.
waiter.wait()
To wait for job run to be finished, invoke the waiter.wait() function, which can take the following parameters.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
str |
Create a SQL Job Run based on the parameters that are passed | Yes |
on_success |
Callable |
Callable executed on success with job object passed as argument | No |
on_failure |
Callable |
Callable executed on failure with job object passed as argument | No |
on_timeout |
Callable |
Callable executed on failure with job_uuid passed as argument | No |
check_period |
int |
Time in seconds between status checks for the wait function. Default is 10 seconds | No |
timeout |
int |
Time in seconds after which timeout error will be raised. Default is none | No |
Returns:
Parameter | Type | Description |
---|---|---|
name |
str |
The name of the job run |
clusterUUID |
Optional[Union[UUID, None]] |
The cluster UUID on which the job run was executed |
cluster |
Optional[JobCluster] |
The cluster on which the job run was executed |
type |
JobRunType |
The type of Job Run |
config |
JobConfig |
The JobConfig of the cluster |
submittedAt |
datetime |
The time at which the job was submitted |
finishedAt |
Optional[Union[datetime, None]] |
The time at which the job finished |
startedAt |
Optional[Union[datetime, None]] |
The time at which job started |
status |
JobRunStatus |
Job run status |
batchJobDefinitionConfigOverrides |
Optional[JobConfigOverride] |
The job run config override |
batch_job_definition |
Optional[BatchJobDefinitionUUID] |
UUID of batch job definition |
reason |
Optional[str] |
The reason for job statue change |
submitter |
Optional[str] |
Email of the person who submitted the job |
from typing import Callable
def wait(
self,
uuid,
on_success: Callable = None,
on_failure: Callable = None,
on_timeout: Callable = None,
check_period=10,
timeout=None
):
pass
By default, returns job model if no callbacks is provided. There is option to pass callable objects as following parameters:
Example 1. Success/Failure callbacks:
from bodosdk.models import WorkspaceKeys, CreateJobRun
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
input_job = CreateJobRun(clusterUUID='<cluster-uuid>', batchJobDefinitionUUID='<batch-job-definition-uuid>')
job_run = client.job.submit_batch_job_run(input_job)
waiter = client.job.get_job_run_waiter()
def success_callback(job):
print("in success callback", job.status)
def failure_callback(job):
print('in failure callback', job.status)
result = waiter.wait(job_run.uuid, on_success=success_callback, on_failure=failure_callback)
Example 2. Timeout callback:
from bodosdk.models import WorkspaceKeys, CreateJobRun
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
input_job = CreateJobRun(clusterUUID='<cluster-uuid>', batchJobDefinitionUUID='<batch-job-definition-uuid>')
job_run = client.job.submit_batch_job_run(input_job)
waiter = client.job.get_job_run_waiter()
def timeout_callback(job_uuid):
print(f'Waiter timeout for {job_uuid}')
return job_uuid
result = waiter.wait(job_run.status, on_timeout=timeout_callback, timeout=1)
Get job logs
BodoClient.job.get_job_logs(job_uuid)
Returns specific stdout and stderr urls along with expiration timestamp in workspace.
Also, downloads specific stdout and stderr logs in workspace and additionally provides links as well
- stdout: Standard output of the program execution
- stderr: Standard error messages of the program execution
Downloads files as below and overrides if they already exist
- stdout_{uuid}.txt
- stderr_{uuid}.txt
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.models.job import JobRunLogsResponse
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
logs: JobRunLogsResponse = client.job.get_job_logs('8c32aec5-7181-45cc-9e17-8aff35fd269e')
Cluster resource
Module responsible for managing clusters in workspace.
- get available instance types
- get available images
- create cluster
- list clusters
- get cluster
- remove cluster
- scale cluster
- stop cluster
- restart cluster
- pause cluster
- resume cluster
- modify cluster
- get tasks for cluster
Availability Zone Selection
When creating a cluster, you can specify the availability zone in which the cluster will be created. However, cluster creation might fail if the availability zone does not have sufficient capacity to create the cluster. Even after the cluster is created, resuming or scaling it might fail if the availability zone does not have sufficient capacity to resume or scale the cluster.
Bodo supports an auto_az
flag in cluster creation which is by default set to True
. When enabled
create, scale and resume tasks attempt to automatically select an availability zone with sufficient capacity for said cluster. If you want to disable this behavior, set auto_az
to False
in the ClusterDefinition
object.
Available instance types
BodoClient.cluster.get_available_instance_types(region:str) -> Dict[str, InstanceCategory]
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
region |
str |
Azure / AWS region | Yes |
Returns: Return dictionary of instance types available for given region grouped by instance category.
InstanceCategory
:
Field | Type | Description |
---|---|---|
name |
str |
Name of instance instance category |
instance_types |
Dict[str, InstanceType] |
Dict with all instances in the category |
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
instance_types = client.cluster.get_available_instance_types('us-west-2')
Available images
BodoClient.cluster.get_available_images(region:str) -> Dict[str, BodoImage]
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
region |
str |
Azure / AWS region | Yes |
Returns:
Return dictionary of images available for given region where key is bodo version and value is BodoImage
:
BodoImage
:
Field | Type | Description |
---|---|---|
image_id |
str |
Image id |
bodo_version |
str |
Bodo version |
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
images = client.cluster.get_available_images('us-west-2')
Create cluster
BodoClient.cluster.create(cluster_definition: ClusterDefinition) -> ClusterResponse
Creates a cluster in the workspace based on the instance type, no of workers and whether the instance is a spot instance. The cluster can be configured to have an auto-pause and auto-stop time in minutes to pause and stop the cluster when there is no activity.
If you choose to create a cluster with spot instances, please note:
- Spot instance clusters are only supported on AWS at this moment.
- Spot instance has lower cost at the expense of reliability. We recommend to use instance types with lower reclaim rate according to AWS spot instance advisor.
- Spot instance clusters cannot be paused/resumed. Please use stop/restart instead
- Auto pause on spot instance clusters is not allowed. Please use auto stop instead
- Spot instance clusters will have a 60-minute auto stop by default to avoid accidental long-running clusters
Parameter: cluster_definition: ClusterDefinition
Field | Type | Description | Required |
---|---|---|---|
name |
str |
Cluster name | Yes |
instance_type |
str |
Instance type of cluster nodes | Yes |
workers_quantity |
int |
Number of nodes in the cluster | Yes |
bodo_version |
str |
Bodo version available on the cluster | Yes |
description |
Optional[str] |
Cluster description | No |
image_id |
Optional[str] |
Image id used on cluster nodes | No |
auto_stop |
Optional[int] |
Cluster auto stop value [min] | No |
auto_pause |
Optional[int] |
Cluster auto pause value [min] | No |
availability_zone |
Optional[str] |
Availability zone for cluster | No |
aws_deployment_subnet_id |
Optional[str] |
[AWS] Subnet id for cluster | No |
instance_role_uuid |
Optional[str] |
[AWS] Instance role | No |
auto_az |
Optional[bool] |
[AWS] Whether the cluster has Auto AZ enabled | No |
use_spot_instance |
Optional[bool] |
[AWS] Whether the cluster uses spot instance | No |
is_job_dedicated |
Optional[bool] |
whether the cluster is a job dedicated cluster | No |
accelerated_networking |
Optional[bool] |
Whether the cluster uses accelerated networking (e.g. EFA on AWS) | No |
Returns:
Object ClusterResponse:
Field | Type | Description |
---|---|---|
name |
str |
Cluster name |
uuid |
Union[str, UUID] |
Cluster UUID |
status |
ClusterStatus |
Cluster status |
description |
Optional[str] |
Cluster description |
instance_type |
str |
Instance type of cluster nodes |
workers_quantity |
int |
Number workers in clusters |
auto_stop |
Optional[int] |
Cluster auto stop value [min] |
auto_pause |
Optional[int] |
Cluster auto pause value [min] |
bodo_version |
Optional[str] |
Bodo version used in cluster VMs |
image_id |
Optional[str] |
Image id used cluster VMs |
cores_per_worker |
Optional[int] |
Number of cores per worker |
accelerated_networking |
Optional[bool] |
Whether the cluster uses accelerated networking |
created_at |
str |
Date when cluster was created |
last_known_activity |
Optional[str] |
Date of last known cluster activity |
is_job_dedicated |
Optional[bool] |
whether the cluster is a job dedicated cluster |
node_metadata |
Optional[object] |
Metada data about nodes (IP, instance id) |
asg_metadata |
Optional[object] |
[AWS] Auto scaling group metadata |
aws_deployment_subnet_id |
Optional[str] |
[AWS] Subnet id for cluster |
auto_az |
Optional[bool] |
[AWS] Whether the cluster has Auto AZ enabled |
Example: Create a regular cluster
from bodosdk.models import WorkspaceKeys, ClusterDefinition
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
cluster_definition = ClusterDefinition(
name="test",
instance_type="c5.large",
workers_quantity=2,
use_spot_instance=False,
auto_pause=100,
image_id="ami-038d89f8d9470c862",
bodo_version="2022.4",
description="my desc here",
auto_az=False,
)
result_create = client.cluster.create(cluster_definition)
Example: Create a spot instance cluster:
from bodosdk.models import WorkspaceKeys, ClusterDefinition
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
cluster_definition = ClusterDefinition(
name="test-spot",
instance_type="c5.large",
workers_quantity=2,
use_spot_instance=True,
auto_stop=100,
image_id="ami-038d89f8d9470c862",
bodo_version="2022.4",
description="my desc here",
auto_az=False,
)
result_create = client.cluster.create(cluster_definition)
List clusters
BodoClient.cluster.list(page: int, page_size: int) -> ClusterList
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
page |
int |
The page number has a lower limit of 1, default value 1 | False |
size |
int |
The size has a maximum allowed value of 100 and default value of 10 | False |
Returns: ClusterList
Fields | Type | Description |
---|---|---|
data |
List[ClusterResponse] |
Contains a list of Cluster Response |
metadata |
PageMetadata |
Contains fields page , size , total_pages , total_items , and order |
Object ClusterResponse:
Field | Type | Description |
---|---|---|
name |
str |
Cluster name |
uuid |
Union[str, UUID] |
Cluster UUID |
status |
ClusterStatus |
Cluster status |
description |
Optional[str] |
Cluster description |
instance_type |
str |
Instance type of cluster nodes |
workers_quantity |
int |
Number workers in clusters |
auto_stop |
Optional[int] |
Cluster auto stop value [min] |
auto_pause |
Optional[int] |
Cluster auto pause value [min] |
bodo_version |
Optional[str] |
Bodo version used in cluster VMs |
image_id |
Optional[str] |
Image id used cluster VMs |
cores_per_worker |
Optional[int] |
Number of cores per worker |
accelerated_networking |
Optional[bool] |
Whether the cluster uses accelerated networking |
created_at |
str |
Date when cluster was created |
last_known_activity |
Optional[str] |
Date of last known cluster activity |
is_job_dedicated |
Optional[bool] |
whether the cluster is a job dedicated cluster |
node_metadata |
Optional[object] |
Metada data about nodes (IP, instance id) |
asg_metadata |
Optional[object] |
[AWS] Auto scaling group metadata |
aws_deployment_subnet_id |
Optional[str] |
[AWS] Subnet id for cluster |
auto_az |
Optional[bool] |
[AWS] Whether the cluster has Auto AZ enabled |
Example: Get the first 10 clusters in the workspace using the default parameter values.
from bodosdk.models import WorkspaceKeys, ClusterResponse
from bodosdk.client import get_bodo_client
from typing import List
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
clusters: List[ClusterResponse] = client.cluster.list().data
Example: Get all clusters with chunck of 10 clusters.
from bodosdk.client import get_bodo_client
from bodosdk.models import (
WorkspaceKeys,
)
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
cluster_list = []
page = 1
size = 10
clusters = client.cluster.list(page=page, page_size=size)
print('Total Clusters', clusters.metadata.total_items)
total_items = clusters.metadata.total_items
cluster_list = clusters.data
while len(cluster_list) < total_items:
page += 1
clusters = client.cluster.list(page=page, page_size=size)
cluster_list += clusters.data
Get cluster
BodoClient.cluster.get(uuid : str) -> ClusterResponse
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | True |
Returns:
Object ClusterResponse:
Field | Type | Description |
---|---|---|
name |
str |
Cluster name |
uuid |
Union[str, UUID] |
Cluster UUID |
status |
ClusterStatus |
Cluster status |
description |
Optional[str] |
Cluster description |
instance_type |
str |
Instance type of cluster nodes |
workers_quantity |
int |
Number workers in clusters |
auto_stop |
Optional[int] |
Cluster auto stop value [min] |
auto_pause |
Optional[int] |
Cluster auto pause value [min] |
bodo_version |
Optional[str] |
Bodo version used in cluster VMs |
image_id |
Optional[str] |
Image id used cluster VMs |
cores_per_worker |
Optional[int] |
Number of cores per worker |
accelerated_networking |
Optional[bool] |
Whether the cluster uses accelerated networking |
created_at |
str |
Date when cluster was created |
last_known_activity |
Optional[str] |
Date of last known cluster activity |
is_job_dedicated |
Optional[bool] |
whether the cluster is a job dedicated cluster |
node_metadata |
Optional[object] |
Metada data about nodes (IP, instance id) |
asg_metadata |
Optional[object] |
[AWS] Auto scaling group metadata |
aws_deployment_subnet_id |
Optional[str] |
[AWS] Subnet id for cluster |
auto_az |
Optional[bool] |
[AWS] Whether the cluster has Auto AZ enabled |
Example:
from bodosdk.models import WorkspaceKeys, ClusterResponse
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
clusters: ClusterResponse = client.cluster.get('<CLUSTER-UUID>')
Remove cluster
BodoClient.client.remove(uuid: Union[str, UUID], force_remove: bool = False, mark_as_terminated: bool = False)
Method removing cluster from platform
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | True |
force_remove |
Union[str, UUID] |
Try to remove cluster even if there is cluster activity | False |
mark_as_terminated |
Union[str, UUID] |
Mark cluster as removed without removing resources, may be useful if cluster creation failed and normal removal is failing | False |
Returns:
Returns None
if successful. Otherwise, raises exception.
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from typing import List
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.cluster.remove('<CLUSTER-UUID>')
Scale cluster
BodoClient.cluster.scale(scale_cluster: ScaleCluster) -> ClusterResponse
Changes number of nodes in cluster (AWS only)
Parameter: scale_cluster: ScaleCluster
Field | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | Yes |
workers_quantity |
int |
Number of nodes in the cluster | Yes |
Returns:
Object ClusterResponse:
Field | Type | Description |
---|---|---|
name |
str |
Cluster name |
uuid |
Union[str, UUID] |
Cluster UUID |
status |
ClusterStatus |
Cluster status |
description |
Optional[str] |
Cluster description |
instance_type |
str |
Instance type of cluster nodes |
workers_quantity |
int |
Number workers in clusters |
auto_stop |
Optional[int] |
Cluster auto stop value [min] |
auto_pause |
Optional[int] |
Cluster auto pause value [min] |
bodo_version |
Optional[str] |
Bodo version used in cluster VMs |
image_id |
Optional[str] |
Image id used cluster VMs |
cores_per_worker |
Optional[int] |
Number of cores per worker |
accelerated_networking |
Optional[bool] |
Whether the cluster uses accelerated networking |
created_at |
str |
Date when cluster was created |
last_known_activity |
Optional[str] |
Date of last known cluster activity |
is_job_dedicated |
Optional[bool] |
whether the cluster is a job dedicated cluster |
node_metadata |
Optional[object] |
Metada data about nodes (IP, instance id) |
asg_metadata |
Optional[object] |
[AWS] Auto scaling group metadata |
aws_deployment_subnet_id |
Optional[str] |
[AWS] Subnet id for cluster |
auto_az |
Optional[bool] |
[AWS] Whether the cluster has Auto AZ enabled |
Example:
from bodosdk.models import WorkspaceKeys, ScaleCluster, ClusterResponse
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
NEW_WORKERS_QUANTITY = 3
scale_cluster = ScaleCluster(
uuid='<CLUSTER-UUID>',
workers_quantity=NEW_WORKERS_QUANTITY
)
cluster: ClusterResponse = client.cluster.scale(scale_cluster)
Stop cluster
BodoClient.cluster.stop(uuid: Union[str, UUID])
Stops any cluster activity. You will not incur any charges for stopped cluster. You can restart it again at any time.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | Yes |
Returns:
Returns None
if successful. Otherwise, raises exception.
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.cluster.stop('<CLUSTER-UUID>')
Restart cluster
BodoClient.cluster.restart(uuid: Union[str, UUID])
Restarts cluster. You can restart cluster only if it is stopped.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | Yes |
Returns:
Returns None
if successful. Otherwise, raises exception.
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.cluster.stop('<CLUSTER-UUID>')
Pause cluster
BodoClient.cluster.pause(uuid: Union[str, UUID])
Pause cluster. You can pause cluster only if it is running.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | Yes |
Returns:
Returns None
if successful. Otherwise, raises exception.
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.cluster.pause('<CLUSTER-UUID>')
Resume cluster
BodoClient.cluster.resume(uuid: Union[str, UUID])
Resume cluster. You can resume cluster only if it is paused.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | Yes |
Returns:
Returns None
if successful. Otherwise, raises exception.
Example:
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.cluster.resume('<CLUSTER-UUID>')
Modify Cluster metadata
BodoClient.cluster.modify(modify_cluster: ModifyCluster) -> ClusterResponse
This function can be used to edit cluster metadata for a given cluster. The properties that we can edit are
description, autopause time, autostop time, bodo-version, instance type, instance role, flag for auto availability zone selection and the number of workers.
Changing the number of workers will kick off a scaling event on the cluster, which will resume the cluster if it is in paused state.
The modify function also supports modifying a subset of property part if the ModifyCluster object like listed in the example below.
The cluster modification can only happen when the cluster is in stopped state. The fields that aren't required to
be modified are optional and don't necessarily have to be passed during the call to the API.
Note: Disabling the auto_az
flag without specifying an availability_zone
in the same request might result in the cluster failing.
So make sure to provide a fallback zone to avoid failures.
Parameter: modify_cluster: ModifyCluster
Field | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | Yes |
auto_stop |
Optional[int] |
Cluster auto stop value [min] | No |
auto_pause |
Optional[int] |
Cluster auto pause value [min] | No |
description |
Optional[str] |
Cluster description | No |
workers_quantity |
Optional[int] |
Number of nodes in the cluster | No |
instance_role_uuid |
Optional[str] |
[AWS] Instance role | No |
instance_type |
Optional[str] |
Instance type used in cluster | No |
bodo_version |
Optional[str] |
Bodo version available on the cluster | No |
auto_az |
Optional[bool] |
[AWS] Whether the cluster is Auto AZ enabled | No |
availability_zone |
Optional[str] |
Availability zone for the cluster | No |
Returns:
Object ClusterResponse:
Field | Type | Description |
---|---|---|
name |
str |
Cluster name |
uuid |
Union[str, UUID] |
Cluster UUID |
status |
ClusterStatus |
Cluster status |
description |
Optional[str] |
Cluster description |
instance_type |
str |
Instance type of cluster nodes |
workers_quantity |
int |
Number workers in clusters |
auto_stop |
Optional[int] |
Cluster auto stop value [min] |
auto_pause |
Optional[int] |
Cluster auto pause value [min] |
bodo_version |
Optional[str] |
Bodo version used in cluster VMs |
image_id |
Optional[str] |
Image id used cluster VMs |
cores_per_worker |
Optional[int] |
Number of cores per worker |
accelerated_networking |
Optional[bool] |
Whether the cluster uses accelerated networking |
created_at |
str |
Date when cluster was created |
last_known_activity |
Optional[str] |
Date of last known cluster activity |
is_job_dedicated |
Optional[bool] |
whether the cluster is a job dedicated cluster |
node_metadata |
Optional[object] |
Metada data about nodes (IP, instance id) |
asg_metadata |
Optional[object] |
[AWS] Auto scaling group metadata |
aws_deployment_subnet_id |
Optional[str] |
[AWS] Subnet id for cluster |
auto_az |
Optional[bool] |
[AWS] Whether the cluster has Auto AZ enabled |
Example:
from bodosdk.models import WorkspaceKeys, ModifyCluster, ClusterResponse
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
role_definition = CreateRoleDefinition(
name="test-sdk-role-creation",
description="testing-instance-role-creation",
data=InstanceRole(role_arn="arn:aws:iam::427443013497:role/testing_bucket_with_my_script")
)
result_create_role: CreateRoleResponse = client.instance_role.create(role_definition)
client = get_bodo_client(keys)
modify_cluster = ModifyCluster(
uuid="<cluster-uuid>",
auto_pause=60,
auto_stop=0,
workers_quantity=4,
description="using the SDK",
instance_type="c5.large",
instance_role_uuid=result_create_role.uuid,
bodo_version="2022.4",
auto_az=True,
)
partial_modify_cluster = ModifyCluster(
uuid="<cluster-uuid>",
autopause=120,
)
new_cluster: List[ClusterResponse] = client.cluster.modify(modify_cluster)
new_cluster_partial: List[ClusterResponse] = client.cluster.modify(partial_modify_cluster)
Example: Detach Custom Instance Role . Replace the custom instance role with default role which is automatically created for a cluster
detach_custom_instance_role = ModifyCluster(
uuid="<cluster-uuid>",
instance_role_uuid="default",
)
new_cluster_partial: ClusterResponse = client.cluster.modify(detach_custom_instance_role)
List tasks for a cluster
BodoClient.cluster.list_tasks(uuid: Union[str, UUID]) -> List[TaskInfo]
Gets all taks for cluster.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Cluster UUID | Yes |
Returns: List[TaskInfo]
Object TaskInfo:
Field | Type | Description |
---|---|---|
uuid |
str |
Cluster task UUID |
status |
TaskStatus |
Cluster task status |
task_type |
str |
Task type |
logs |
str |
Cluster description |
Example:
from bodosdk.models import WorkspaceKeys, TaskInfo
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
tasks: List[TaskInfo] = client.cluster.list_tasks(uuid)
Workspace resource
Module responsible for managing workspaces in an organization.
- workspace getting started
- create workspace
- list workspaces
- get workspace
- remove workspace
- assign user
- list workspace tasks
Workspace getting started
In order to work with Workspace, users need to generate Personal Tokens, under Admin Console, from the Bodo Platform Dashboard.
Then instantiate a PersonalKeys object with the generated client_id
and secret_id
. Then pass in this personal key while
instantiating a client object
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
client_id='<CLIENT-ID>',
secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
Create Workspace
BodoClient.workspace.create(workspace_definition: WorkspaceDefinition) -> WorkspaceCreatedResponse
Creates a workspace with the specifications passed in through a WorkspaceDefinition object under the
user's organization
`BodoClient.workspace.create(workspace_definition: WorkspaceDefinition) -> WorkspaceCreatedResponse`
Creates a workspace with the specifications passed in through a `WorkspaceDefinition` object under the
user's organization.
**Parameter:** `workspace_definition: WorkspaceDefinition`
| Field | Type | Description | Required |
| ------------------- | -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- |
| `name` | `str` | Workspace name | Yes |
| `region` | `str` | Region where workspace should be placed | Yes |
| `cloud_config_uuid` | `Union[UUID, str]` | Existing AWS or Azure cloud config uuid | Yes |
| `storage_endpoint` | `Optional[bool]` | Enable additonal endpoints [Microsoft.Storage](https://learn.microsoft.com/en-us/azure/virtual-network/virtual-network-service-endpoints-overview) or [S3 Gateway](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html). <br/>There's no additional charge for using service endpoints. | No |
| `aws_network_data` | `Optional[AWSNetworkData]` | Specific for network data for AWS | No |
**Returns:**
Object `WorkspaceCreatedResponse:`
| Field | Type | Description |
| ------------------- | --------------- | ----------------------------------------- |
| `name` | `str` | Workspace name |
| `uuid` | `str` | Workspace UUID |
| `status` | `str` | Workspace status |
| `region` | `str` | Region where workspace should be placed. |
| `organization_uuid` | `str` | Organization UUID |
| `data` | `Optional[Any]` | Specific configuration data for workspace |
| `created_by` | `Optional[str]` | Email address of the workspace creator |
| `cloud_config` | `Optional[Any]` | Cloud config data |
**Example: Create new Workspace**
```python
from bodosdk.models import PersonalKeys
from bodosdk.models import WorkspaceDefinition
personal_keys = PersonalKeys(
client_id='<CLIENT-ID>',
secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
wd = WorkspaceDefinition(
name="<WORSPACE-NAME>",
cloud_config_uuid="<CONFIG-UUID>",
region="<WORKSPACE-REGION>"
)
resp = client.workspace.create(wd)
List Workspaces
BodoClient.workspace.list()
Returns:
Returns a list of all workspaces defined under this organization as List[WorkspaceListItem]
Field | Type | Description |
---|---|---|
name |
str |
Workspace name |
uuid |
str |
Workspace UUID |
status |
WorkspaceStatus |
Workspace status |
provider |
str |
Workspace provider (AWS or AZURE) |
region |
str |
Region where workspace should be placed. |
organization_uuid |
str |
Organization UUID |
data |
Optional[Any] |
Specific configuration data for workspace |
server_time |
Optional[str] |
Current server time |
cloud_config |
Optional[Any] |
Cloud config data |
created_by |
Optional[str] |
Email address of the workspace creator |
Example:
`BodoClient.workspace.list()`
Returns a list of all workspaces defined under this organization. The with_task boolean controls printing out
tasks running in the workspaces. The returned list is a list of GetWorkspaceResponse object
from bodosdk.models import PersonalKeys personal_keys = PersonalKeys( client_id='', secret_id='', ) client = get_bodo_organization_client(personal_keys) resp = client.workspace.list()
### Get Workspace<a id="get-workspace"></a>
`BodoClient.workspace.get(uuid: str)`
Returns information about the workspace with the given uuid. Returns a GetWorkspaceResponse object with details about the workspace uuid mentioned.
BodoClient.workspace.get(uuid: Union[str, UUID]) -> WorkspaceResponse
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Workspace UUID | Yes |
Returns:
Returns the workspace details provided the workspace uuid
- WorkspaceResponse
Field | Type | Description |
---|---|---|
name |
str |
Workspace name |
uuid |
str |
Workspace UUID |
status |
WorkspaceStatus |
Workspace status |
region |
str |
Region where workspace is placed. |
organization_uuid |
str |
Organization UUID |
data |
Optional[Any] |
Specific configuration data for workspace |
cloud_config |
Optional[Any] |
Cloud config data |
created_by |
Optional[str] |
Email address of the workspace creator |
Example:
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
client_id='<CLIENT-ID>',
secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
resp = client.workspace.get("<WORKSPACE-UUID>")
Remove Workspace
BodoClient.workspace.remove(uuid: Union[str, UUID], mark_as_terminated: bool = False)
Removes the workspace with the provided uuid. The operation is only successful if all resources within the workspaces(jobs, clusters, notebooks) are already terminated.
Parameters:
Field | Type | Description | Required |
---|---|---|---|
uuid |
Union[str, UUID] |
Workspace UUID | Yes |
mark_as_terminated | bool |
Mark role as terminated without removing resources, may be useful if role creation failed and deletion is failing |
No |
Returns: Returns None if successful. Otherwise, raises exception.
Example:
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
client_id='<CLIENT-ID>',
secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
resp = client.workspace.remove("<WORKSPACE-UUID>")
Assign user
BodoClient.workspace.assign_users(workspace_uuid: Union[str, UUID], users: List[UserAssignment])
Assign user to workspace.
Parameters:
Parameter | Type | Description | Required |
---|---|---|---|
workspace_uuid |
Union[str, UUID] |
Workspace UUID | Yes |
users |
List[UserAssignment] |
List of users that will be assigned to a given workspace | Yes |
Returns: Returns None if successful. Otherwise, raises exception.
Example:
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
client_id='<CLIENT-ID>',
secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
workspace_uuid = "<some uuid>"
users: List[UserAssignment] = [
UserAssignment(
email="example@example.com",
skip_email=True,
bodo_role=BodoRole.ADMIN
)
]
client.workspace.assign_users(workspace_uuid, users)
List Workspace tasks
BodoClient.workspace.get_tasks(workspace_uuid: Union[str, UUID]) -> List[TaskInfo]
Returns:
Return a list of workspace tasks in the workspace - List[TaskInfo]
Field | Type | Description |
---|---|---|
uuid |
str |
Workspace task uuid |
status |
TaskStatus |
Status of workspace task |
task_type |
WorkspaceStatus |
Type of workspace task |
logs |
str |
Logs from specific task |
Example
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
client_id='<CLIENT-ID>',
secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
resp = client.workspace.get_tasks("<WORKSPACE-UUID>")
Cloud Config
Module responsible for creating cloud configurations for organization.
Create config
BodoClient.cloud_config.create(config: Union[CreateAwsCloudConfig, CreateAzureCloudConfig])
Create cloud configuration for cloud
AWS example
from bodosdk.models import OrganizationKeys, CreateAwsProviderData, CreateAwsCloudConfig, AwsCloudConfig
from bodosdk.client import get_bodo_client
keys = OrganizationKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
config = CreateAwsCloudConfig(
name='test',
aws_provider_data=CreateAwsProviderData(
tf_backend_region='us-west-1',
access_key_id='xyz',
secret_access_key='xyz'
)
)
config: AwsCloudConfig = client.cloud_config.create(config)
Azure example
from bodosdk.models import OrganizationKeys, CreateAzureProviderData, CreateAzureCloudConfig, AzureCloudConfig
from bodosdk.client import get_bodo_client
keys = OrganizationKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
config = CreateAzureCloudConfig(
name='test',
azure_provider_data=CreateAzureProviderData(
tf_backend_region='eastus',
tenant_id='xyz',
subscription_id='xyz',
resource_group='MyResourceGroup'
)
)
config: AzureCloudConfig = client.cloud_config.create(config)
Get config
BodoClient.cloud_config.list()
Get list of cloud configs.
from bodosdk.models import OrganizationKeys, AzureCloudConfig, AwsCloudConfig
from bodosdk.client import get_bodo_client
from typing import Union, List
keys = OrganizationKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
configs: List[Union[AwsCloudConfig, AzureCloudConfig]] = client.cloud_config.list()
Get config
BodoClient.cloud_config.get(uuid: Union[str, UUID])
Get cloud config by uuid.
from bodosdk.models import OrganizationKeys, AzureCloudConfig, AwsCloudConfig
from bodosdk.client import get_bodo_client
from typing import Union
keys = OrganizationKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
config: Union[AwsCloudConfig, AzureCloudConfig] = client.cloud_config.get('8c32aec5-7181-45cc-9e17-8aff35fd269e')
Instance Role Manager
Module responsible for managing AWS roles in workspace.
Create role
BodoClient.instance_role.create()
Creates an AWS role with the specified role definition with a given AWS role arn.
from bodosdk.models import WorkspaceKeys, CreateRoleDefinition, CreateRoleResponse
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
role_definition = CreateRoleDefinition(
name="test-sdk-role-creation",
description="testing",
data=InstanceRole(role_arn="arn:aws:iam::1234567890:role/testing")
)
result_create:CreateRoleResponse = client.instance_role.create(role_definition)
List roles
BodoClient.instance_role.list()
Returns list of all roles in workspace
from bodosdk.models import WorkspaceKeys, InstanceRoleItem
from bodosdk.client import get_bodo_client
from typing import List
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
result_list:List[InstanceRoleItem] = client.instance_role.list()
Get role
BodoClient.instance_role.get(cluster_uuid)
Returns role by uuid
from bodosdk.models import WorkspaceKeys, InstanceRoleItem
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
clusters: InstanceRoleItem = client.instance_role.get('<CLUSTER-UUID>')
Remove role
BodoClient.instance_role.remove(cluster_uuid, mark_as_terminated=False)
Parameters | Type | Description | Required |
---|---|---|---|
mark_as_terminated | Boolean | Mark role as terminated without removing resources, may be useful if role creation failed and deletion is failing |
No |
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.instance_role.remove('<ROLE-UUID>')
Catalog
Module responsible for storing database catalogs
- create catalog
- get catalog by uuid
- get catalog by name
- list catalogs
- update catalog
- remove catalog
- remove all catalogs
Create Catalog
BodoClient.catalog.create()
Stores the Database Catalog
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogDefinition, SnowflakeConnectionDefinition
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
# Type Support for Snowflake
snowflake_definition = SnowflakeConnectionDefinition(
host="test.snowflake.com",
port=443,
username="test-username",
password="password",
database="test-db",
warehouse="test-wh",
role="test-role"
)
# For other databases, need to defined as JSON
connection_data = {
"host": "test.db.com",
"username": "test-username",
"password": "*****",
"database": "test-db",
}
catalog_definition = CatalogDefinition(
name="catalog-1",
description="catalog description",
catalogType="SNOWFLAKE", # Currently Support Snowflake
data=snowflake_definition
)
client.catalog.create(catalog_definition)
Get Catalog by UUID
BodoClient.catalog.get_catalog()
Retrieves the Catalog details by UUID
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogInfo
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
catalog_info: CatalogInfo = client.catalog.get("<CATALOG-UUID>")
Get Catalog by Name
BodoClient.catalog.get_by_name()
Retrieves the Catalog details by UUID
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogInfo
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
catalog_info: CatalogInfo = client.catalog.get_by_name("test-catalog")
List Catalogs
BodoClient.catalog.list()
Retrieves all catalogs in a workspace.
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogInfo
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
catalog_info: CatalogInfo = client.catalog.list()
Update Catalog
BodoClient.catalog.update()
Updates the Database Catalog
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogDefinition, SnowflakeConnectionDefinition
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
# Type Support for Snowflake
snowflake_definition = SnowflakeConnectionDefinition(
host="update.snowflake.com",
port=443,
username="test-username",
password="password",
database="test-db",
warehouse="test-wh",
role="test-role"
)
new_catalog_def = CatalogDefinition(
name="catalog-1",
description="catalog description",
catalogType="SNOWFLAKE", # Currently Support Snowflake
data=snowflake_definition
)
client.catalog.update("<CATALOG-UUID>", new_catalog_def)
Remove Catalog by UUID
BodoClient.catalog.remove()
Deletes a Database Catalog by UUID
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.catalog.remove("<CATALOG-UUID>")
Remove all Catalogs
BodoClient.catalog.remove()
Deletes a Database Catalog by UUID
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.catalog.remove_all()
Secret Groups
Module responsible for separating secrets into multiple groups.
A default secret group will be created at the time of workspace creation. Users can define custom secret groups using the following functions.
Create Secret Group
BodoClient.secret_group.create()
Create a secret group
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secret_group import SecretGroupDefinition
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
secret_group_definition = SecretGroupDefinition(
name="sg-1", # Name should be unique to that workspace
description="secret group description",
)
client.secret_group.create(secret_group_definition)
List Secret Groups
BodoClient.secret_group.list()
List all the secret groups in a workspace.
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secret_group import SecretGroupInfo
from typing import List
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
groups_list: List[SecretGroupInfo] = client.secret_group.list()
Update Secret Group
BodoClient.secret_group.update()
Updates the secret group description
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secret_group import SecretGroupInfo, SecretGroupDefinition
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
update_secret_group_def = SecretGroupDefinition(
name="sg-1", # Cannot modify the name in the group
description="secret group description",
)
groups_data: SecretGroupInfo = client.secret_group.update(update_secret_group_def)
Delete Secret Group
BodoClient.secret_group.remove()
Removes the secret group.
Note: Can only remove if all the secrets in the group are deleted
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
client.secret_group.remove("<secret-group-uuid>")
Secrets
Module responsible for creating secrets.
Create Secret
BodoClient.secrets.create()
Create the secret in a secret group.
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretDefinition
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
secret_definition = SecretDefinition(
name="secret-1",
data={
"key": "value"
},
secret_group="<secret-group-name>" #If not defined, defaults to default to secret group
)
client.secrets.create(secret_definition)
Get Secrets by UUID
BodoClient.secrets.get()
Retrieves the Secrets by UUID
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
secret_info: SecretInfo = client.secrets.get("<secret-uuid>")
List Secrets by Workspace
BodoClient.secrets.list()
List the secrets in a workspace
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
from typing import List
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
secrets_info: List[SecretInfo] = client.secrets.list()
List Secrets by Secret Group
BodoClient.secrets.list_by_group()
List the Secrets by Secret Group
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
from typing import List
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
secrets_info: List[SecretInfo] = client.secrets.list_by_group("<secret-group-name>")
Update Secret
BodoClient.secrets.update()
Updates the secret.
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretDefinition
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
update_secret_def = SecretDefinition(
data={
"key": "value"
}
)
client.secrets.update("<secret-uuid>", update_secret_def)
Delete Secrets by UUID
BodoClient.secrets.remove()
Delete the Secret by UUID
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
keys = WorkspaceKeys(
client_id='XYZ',
secret_key='XYZ'
)
client = get_bodo_client(keys)
secret_info: SecretInfo = client.secrets.remove("<secret-uuid>")
Billing module
Billing module provides access to billing information related to a particular workspace.
Get job run billing report CSV
BodoClient.billing.get_job_run_price_export(started_at: str, finished_at: str, workspace_uuid: Union[str, UUID])
Provides a CSV download link for the billing report, specifically on the job run level, displaying EC2 costs per job run
within the defined startTime and endTime range for all workspaces.
For a particular workspace, provide the workspaceUUID which can be obtained from list-workspaces
The billing report includes essential fields such as start time, end time, duration, worker count, instance type, and associated costs. This link remains active for a duration of 7 days and is exclusively available to AWS customers. It's important to note that reports can only be generated for a 30-day time period.
Parameters:
Parameters | Type | Description | Required |
---|---|---|---|
started_at |
Union[str, date] |
Start date of the report | Yes |
finished_at |
Union[str, date] |
End date of the report | Yes |
workspace_uuid |
Union[str, UUID] |
Workspace UUID, returns all workspaces in organization when null | No |
Returns:
Fields | Type | Description |
---|---|---|
url | string | S3 Pre-signed URL which contains the report |
- CSV report for date range
The following Python code generates a CSV report for a specified date range, starting from September 5th to
September 6th for a given workspace UUID. .
You can also provide the same date range with a timestamp in the ISO8601 format, such as
2023-09-05T11:15:00Z
. After running the code, it will display a link to the CSV report that can be clicked to
initiate the download:
from bodosdk.models import OrganizationKeys
from bodosdk.client import get_bodo_organization_client
keys = OrganizationKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_organization_client(keys)
print(client.billing.get_job_run_price_export('2023-09-05', '2023-09-06', 'WORKSPACE-UUID'))
Get cluster level billing report CSV
BodoClient.billing.get_cluster_price_export(started_at: str, finished_at: str, workspace_uuid: Union[str, UUID])
Provides a CSV download link for the billing report, specifically at the cluster level, displaying EC2 costs per cluster run
within the defined startTime and endTime range among all workspaces.
For a particular workspace, provide the workspaceUUID which can be obtained from list-workspaces
The billing report includes essential fields such as start time, end time, duration, worker count, instance type, and associated costs. This link remains active for a duration of 7 days and is exclusively available to AWS customers. It's important to note that reports can only be generated for a 30-day time period.
Parameters:
Parameters | Type | Description | Required |
---|---|---|---|
started_at |
Union[str, date] |
Start date of the report | Yes |
finished_at |
Union[str, date] |
End date of the report | Yes |
workspace_uuid |
Union[str, UUID] |
Workspace UUID, returns all workspaces in organization when null |
No |
Returns:
Fields | Type | Description |
---|---|---|
url | string | S3 Pre-signed URL which contains the report |
- CSV report for date range
The following Python code generates a CSV report for a specified date range, starting from September 5th to
September 6th for a given workspace UUID.
You can also provide the same date range with a timestamp in the ISO8601 format, such as 2023-09-05T11:15:00Z
.
After running the code, it will display a link to the CSV report that can be clicked to initiate the download:
from bodosdk.models import OrganizationKeys
from bodosdk.client import get_bodo_organization_client
keys = OrganizationKeys(
client_id="XYZ",
secret_key="XYZ"
)
client = get_bodo_organization_client(keys)
print(client.billing.get_cluster_price_export('2023-09-05', '2023-09-06', 'WORKSPACE-UUID'))
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.