MLOps Python SDK for XCloud Service API
SDK
Software Development Kits for integrating with the XCloud Service API.
[!NOTE] SDK Support: SDKs provide type-safe, high-level interfaces for interacting with the platform API. They handle authentication, errors, and request retries automatically.
Available SDKs
Python SDK
Installation
Install the Python SDK from PyPI with pip:
pip install mlops-python-sdk
Configuration
The SDK reads configuration from environment variables by default:
- MLOPS_API_KEY: API key (required)
- MLOPS_DOMAIN: API domain, e.g. localhost:8090 or https://example.com
- MLOPS_API_PATH: API path prefix (default: /api/v1)
- MLOPS_DEBUG: true|false (default: false)
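To make the variables and their defaults concrete, here is a minimal sketch that collects them into a plain dictionary. The helper `load_mlops_settings` is hypothetical and is not part of the SDK; how the SDK itself parses these values (for example, whether `MLOPS_DEBUG` is case-sensitive) is an assumption here.

```python
import os


def load_mlops_settings(env=None):
    """Collect the documented MLOPS_* variables into a dict.

    Hypothetical helper for illustration only; the SDK's own parsing may differ.
    """
    env = os.environ if env is None else env

    api_key = env.get("MLOPS_API_KEY")
    if not api_key:
        # The README marks MLOPS_API_KEY as required.
        raise ValueError("MLOPS_API_KEY is required")

    return {
        "api_key": api_key,
        # No default is documented for the domain, so it may be None here.
        "domain": env.get("MLOPS_DOMAIN"),
        "api_path": env.get("MLOPS_API_PATH", "/api/v1"),
        # Assumption: any casing of "true" enables debug mode.
        "debug": env.get("MLOPS_DEBUG", "false").strip().lower() == "true",
    }
```

A check like this can be useful at application startup to fail early when the API key is missing, before any request is made.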
Or configure in code:
from mlops import ConnectionConfig, Task

config = ConnectionConfig(
    api_key="xck_...",
    domain="https://example.com",
    api_path="/api/v1",
    debug=False,
)
task = Task(config=config)
Usage
from mlops import Task
from mlops.api.client.models.task_status import TaskStatus
from pathlib import Path
# Initialize Task client (uses environment variables by default)
task = Task()
# Submit a GPU task
try:
    result = task.submit(
        name="gpu-task-from-sdk",
        image="/mnt/minio/images/01ai-registry.cn-shanghai.cr.aliyuncs.com+public+llamafactory+0.9.3.sqsh",
        entry_command="llamafactory-cli train /workspace/config/test_lora.yaml",
        resources={
            "partition": "gpu",
            "nodes": 2,
            "ntasks": 2,
            "cpus_per_task": 2,
            "memory": "4G",
            "time": "01:00:00",
            "gres": "gpu:nvidia_a10:1",
            "qos": "qos_xcloud",
        },
        cluster_name="slurm-cn",
        team_id=1,
        file_path="your file path",  # optional; supports .zip, .tar.gz, .tgz
    )
    if result is not None:
        print("==== gpu task submitted successfully ====")
        job_id = result.job_id
    else:
        print("==== gpu task submission failed ====")
except Exception as e:
    print("==== gpu task submission failed with error ====", e)
# Submit a CPU task, using the contents of a local script as the entry command
try:
    entry_content = Path("entry.sh").read_text(encoding="utf-8")
    result = task.submit(
        name="cpu-task-from-sdk",
        image="docker://01ai-registry.cn-shanghai.cr.aliyuncs.com/01-ai/xcs/v2/alpine:3.23.0",
        entry_command=entry_content,
        resources={
            "partition": "cpu",
            "nodes": 1,
            "ntasks": 1,
            "cpus_per_task": 1,
            "memory": "1G",
            "time": "01:00:00",
            "qos": "qos_xcloud",
        },
        cluster_name="slurm-cn",
        team_id=1,
    )
    if result is not None:
        print("==== cpu task submitted successfully ====")
        job_id = result.job_id
    else:
        print("==== cpu task submission failed ====")
except Exception as e:
    print("==== cpu task submission failed with error ====", e)
# List tasks with filters
try:
    completed_tasks = task.list(
        status=TaskStatus.COMPLETED,
        cluster_name="slurm-cn",
        page=1,
        page_size=20,
    )
    # Get task details
    if completed_tasks is not None and len(completed_tasks.tasks) > 0:
        print("==== completed_tasks number ====", len(completed_tasks.tasks))
        task_info = task.get(task_id=completed_tasks.tasks[0].job_id, cluster_name="slurm-cn")
        print("==== task_info ====", task_info)
    else:
        print("==== no completed tasks to get details ====")
except Exception as e:
    print("==== get task details failed error ====", e)
# Cancel a running task
try:
    running_tasks = task.list(
        status=TaskStatus.RUNNING,
        cluster_name="slurm-cn",
        page=1,
        page_size=20,
    )
    if running_tasks is not None and len(running_tasks.tasks) > 0:
        print("==== running_tasks number ====", len(running_tasks.tasks))
        # Cancel the first running task
        result = task.cancel(task_id=running_tasks.tasks[0].job_id, cluster_name="slurm-cn")
        print("==== task cancelled ====", running_tasks.tasks[0].job_id, result)
    else:
        print("==== no running tasks to cancel ====")
except Exception as e:
    print("==== cancel running task failed error ====", e)
# Delete a task
try:
    completed_tasks = task.list(
        status=TaskStatus.COMPLETED,
        cluster_name="slurm-cn",
        page=1,
        page_size=20,
    )
    if completed_tasks is not None and len(completed_tasks.tasks) > 0:
        print("==== completed_tasks number ====", len(completed_tasks.tasks))
        # Delete the first completed task
        result = task.delete(task_id=completed_tasks.tasks[0].job_id, cluster_name="slurm-cn")
        print("==== task deleted ====", completed_tasks.tasks[0].job_id, result)
    else:
        print("==== no completed tasks to delete ====")
except Exception as e:
    print("==== delete completed task failed error ====", e)
Task Management Methods:
- submit() - Submit a new task with container image and entry command
- get() - Get task details by task ID
- list() - List tasks with optional filters (status, cluster_name, team_id, user_id)
- cancel() - Cancel a running task
- delete() - Delete a task record
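Because list() is paginated (the usage examples pass page and page_size), collecting every matching task means walking the pages. The following is a minimal sketch under stated assumptions: `iter_all_tasks` is a hypothetical helper, not an SDK function; it assumes the response exposes a `tasks` sequence as in the examples, and that a page shorter than `page_size` marks the last page. That stopping rule is a guess, so `max_pages` is there as a safety limit.

```python
def iter_all_tasks(list_page, page_size=20, max_pages=100, **filters):
    """Yield tasks from successive pages of a paginated list call.

    list_page is any callable accepting page, page_size and filter keyword
    arguments, for example the SDK's task.list (hypothetical helper).
    """
    page = 1
    while page <= max_pages:
        response = list_page(page=page, page_size=page_size, **filters)
        # The usage examples guard against a None response, so do the same.
        tasks = [] if response is None else list(response.tasks)
        yield from tasks
        # Assumption: a short (or empty) page means there is nothing left.
        if len(tasks) < page_size:
            break
        page += 1
```

With a configured client this might be called as `iter_all_tasks(task.list, status=TaskStatus.COMPLETED, cluster_name="slurm-cn")`.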
Task Status Values:
from mlops.api.client.models.task_status import TaskStatus
TaskStatus.PENDING # Task is pending
TaskStatus.QUEUED # Task is queued
TaskStatus.RUNNING # Task is running
TaskStatus.COMPLETED # Task completed successfully
TaskStatus.SUCCEEDED # Task succeeded
TaskStatus.FAILED # Task failed
TaskStatus.CANCELLED # Task was cancelled
TaskStatus.CREATED # Task was created
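A common use of these status values is waiting for a submitted task to finish. The sketch below is one way to do that, not an SDK feature: `wait_for_terminal_status` and `TERMINAL_STATUS_NAMES` are hypothetical, and treating COMPLETED, SUCCEEDED, FAILED, and CANCELLED as terminal is inferred from the comments above. The README does not show how to read a status from the object returned by get(), so the caller supplies a function that returns the current status name.

```python
import time

# Assumption: these are the states after which a task no longer changes.
TERMINAL_STATUS_NAMES = {"COMPLETED", "SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_terminal_status(get_status_name, interval=5.0, timeout=3600.0,
                             sleep=time.sleep, clock=time.monotonic):
    """Poll get_status_name() until it returns a terminal status name.

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        name = get_status_name()
        if name in TERMINAL_STATUS_NAMES:
            return name
        if clock() >= deadline:
            raise TimeoutError(f"task still in state {name!r} after {timeout} seconds")
        sleep(interval)
```

Keeping the status lookup in a caller-supplied function avoids guessing at the shape of the task detail object.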
Error Handling:
from mlops.exceptions import (
    APIException,
    AuthenticationException,
    NotFoundException,
    RateLimitException,
    TimeoutException,
    InvalidArgumentException,
    NotEnoughSpaceException,
)
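# Catch the more specific exceptions first, then the general API error.
# entry_command matches the parameter name used in the Usage examples.
try:
    result = task.submit(name="test", cluster_name="slurm-cn", entry_command="echo hello")
except AuthenticationException as e:
    print(f"Authentication failed: {e}")
except NotFoundException as e:
    print(f"Resource not found: {e}")
except APIException as e:
    print(f"API error: {e}")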
[!TIP] Error Handling: The SDK automatically handles common errors and retries failed requests. Check the SDK documentation for error handling best practices.
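The SDK's built-in retry behavior is not specified here (which errors it retries, or how many times). If an application needs its own retry policy on top of it, a generic wrapper with exponential backoff is one option. The sketch below is an assumption-laden illustration: `call_with_retry` is a hypothetical helper, and the choice of which exceptions are safe to retry is left to the caller.

```python
import time


def call_with_retry(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on the given exception type(s) with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    Exceptions not listed in retry_on propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error reach the caller.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

For example, a read-only call could be wrapped as `call_with_retry(lambda: task.get(task_id=job_id, cluster_name="slurm-cn"), retry_on=(RateLimitException, TimeoutException))`. Retrying submit() this way may create duplicate tasks, so it is probably best reserved for calls that are safe to repeat.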
Features
- Type-safe API clients
- Automatic authentication
- Error handling
- Request retry logic
- Response validation
Resources
Download files
File details
Details for the file mlops_python_sdk-1.0.2.tar.gz.
File metadata
- Download URL: mlops_python_sdk-1.0.2.tar.gz
- Upload date:
- Size: 33.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.12.8 Darwin/23.2.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab9aaa7c036492edce240434987150b33a34c11f99c8de0053efe70eb022bda5 |
| MD5 | ae99a52334e53404f5490d8a18ba4c16 |
| BLAKE2b-256 | 4d6c07a4f5024af0aa0a7beffaecf18bc8c5a170da609df914e280836bc208b1 |
File details
Details for the file mlops_python_sdk-1.0.2-py3-none-any.whl.
File metadata
- Download URL: mlops_python_sdk-1.0.2-py3-none-any.whl
- Upload date:
- Size: 58.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.12.8 Darwin/23.2.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 042c12d78faaada47fc2cf0172f717d7bb4f5adc210cd119a1a5733f8295d6e7 |
| MD5 | c1ecfb8211b6ed836a8159757f50982c |
| BLAKE2b-256 | b8ee1712f6243de4245b81f5102f45d10380fbd6b6ecc517c21677e60d8f29ea |
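The published digests can be used to check a manually downloaded file. The following sketch uses only the standard library's `hashlib`; the function names `sha256_of_file` and `matches_published_sha256` are illustrative and not part of the SDK or of pip.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution listed above, the call would be `matches_published_sha256("mlops_python_sdk-1.0.2.tar.gz", "ab9aaa7c036492edce240434987150b33a34c11f99c8de0053efe70eb022bda5")`, assuming the file has been downloaded to the current directory.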