
MLOps Python SDK for XCloud Service API


SDK

Software Development Kits for integrating with the XCloud Service API.

[!NOTE] SDK Support: SDKs provide type-safe, high-level interfaces for interacting with the platform API. They handle authentication, error handling, and request retries automatically.

Installation

Install the Python SDK with pip:

pip install mlops-python-sdk

Configuration

The SDK reads configuration from environment variables by default:

  • MLOPS_API_KEY: API key (required)
  • MLOPS_DOMAIN: API domain, e.g. localhost:8090 or https://example.com
  • MLOPS_API_PATH: API path prefix (default: /api/v1)
  • MLOPS_DEBUG: true|false (default: false)

Or configure in code:

from mlops import ConnectionConfig, Task

config = ConnectionConfig(
    api_key="xck_...",
    domain="https://example.com",
    api_path="/api/v1",
    debug=False,
)
task = Task(config=config)
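
The SDK's own environment-resolution logic is not shown in this README. As a rough sketch of what the documented variables and defaults amount to, the helper below (a hypothetical name, not part of the SDK) collects them into a plain dictionary. Treating any MLOPS_DEBUG value other than "true" as false is an assumption.

```python
import os


def read_mlops_env(env=None):
    """Collect the documented MLOPS_* settings, applying the documented defaults.

    Hypothetical helper for illustration; the SDK performs its own lookup.
    """
    env = os.environ if env is None else env
    api_key = env.get("MLOPS_API_KEY")
    if not api_key:
        # MLOPS_API_KEY is the only variable documented as required.
        raise ValueError("MLOPS_API_KEY is required")
    return {
        "api_key": api_key,
        "domain": env.get("MLOPS_DOMAIN"),
        "api_path": env.get("MLOPS_API_PATH", "/api/v1"),
        # Assumption: only the literal "true" (any case) enables debug mode.
        "debug": env.get("MLOPS_DEBUG", "false").strip().lower() == "true",
    }
```

The keys mirror the ConnectionConfig arguments shown above, so passing the result as ConnectionConfig(**settings) is one plausible way to use it once a domain is set.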

SDK Usage

Initialize client

from mlops import Task

task = Task()  # uses environment variables by default

Submit a GPU task

from mlops import Task

task = Task()
resp = task.submit(
    name="gpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="/mnt/minio/images/01ai-registry.cn-shanghai.cr.aliyuncs.com+public+llamafactory+0.9.3.sqsh",
    entry_command="llamafactory-cli train /workspace/config/test_lora.yaml",
    resources={
        "partition": "gpu",
        "nodes": 2,
        "ntasks": 2,
        "cpus_per_task": 2,
        "memory": "4G",
        "time": "01:00:00",
        "gres": "gpu:nvidia_a10:1",
        "qos": "qos_xcloud",
    },
    file_path="/path/to/xservice.zip",  # optional: .zip/.tar.gz/.tgz
)
print(resp.job_id)

Submit a CPU task

from mlops import Task

task = Task()
resp = task.submit(
    name="cpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="docker://01ai-registry.cn-shanghai.cr.aliyuncs.com/01-ai/xcs/v2/alpine:3.23.0",
    entry_command="echo hello",
    resources={
        "partition": "cpu",
        "nodes": 1,
        "ntasks": 1,
        "cpus_per_task": 1,
        "memory": "1G",
        "time": "01:00:00",
        "qos": "qos_xcloud",
    },
)
print(resp.job_id)
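
The GPU and CPU examples pass nearly the same resources dictionary and differ mainly in the partition and the gres entry. A small builder can make that explicit. This is a sketch only: slurm_resources is a hypothetical helper, its default values are copied from the CPU example above, and whether the API accepts further keys is not covered here.

```python
def slurm_resources(partition, nodes=1, ntasks=1, cpus_per_task=1,
                    memory="1G", time="01:00:00", qos="qos_xcloud", gres=None):
    """Build a resources dict shaped like the ones in the submit examples."""
    resources = {
        "partition": partition,
        "nodes": nodes,
        "ntasks": ntasks,
        "cpus_per_task": cpus_per_task,
        "memory": memory,
        "time": time,
        "qos": qos,
    }
    # The CPU example omits "gres" entirely, so only add it when requested.
    if gres is not None:
        resources["gres"] = gres
    return resources


gpu = slurm_resources("gpu", nodes=2, ntasks=2, cpus_per_task=2,
                      memory="4G", gres="gpu:nvidia_a10:1")
cpu = slurm_resources("cpu")
```

Either result can then be passed as the resources argument of task.submit().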

List tasks

from mlops import Task
from mlops.api.client.models.task_status import TaskStatus

task = Task()
resp = task.list(status=TaskStatus.COMPLETED, cluster_name="slurm-cn", page=1, page_size=20)
print(len(resp.tasks or []))
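
Because list() takes page and page_size, collecting every task means walking the pages. The generator below is a minimal sketch of that loop. It assumes pages are numbered from 1 (as in the call above) and that a page shorter than page_size marks the end; the README does not say whether the response carries a total count. iter_all and fetch_page are hypothetical names.

```python
def iter_all(fetch_page, page_size=20, first_page=1):
    """Yield items from successive pages until a short (or empty) page appears.

    fetch_page(page, page_size) must return a list of items or None.
    """
    page = first_page
    while True:
        items = fetch_page(page, page_size) or []
        yield from items
        if len(items) < page_size:
            break
        page += 1
```

With the SDK, fetch_page could be something like lambda page, size: task.list(cluster_name="slurm-cn", page=page, page_size=size).tasks.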

Get task details

from mlops import Task

task = Task()
task_info = task.get(task_id=12345, cluster_name="slurm-cn")
print(task_info)
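
A common follow-up to get() is polling until the task finishes. The README does not show which field of the returned object holds the status, so the sketch below stays generic: it takes a fetch callable and an is_done predicate supplied by the caller. wait_until is a hypothetical helper, not an SDK function.

```python
import time


def wait_until(fetch, is_done, interval=5.0, timeout=3600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_done(value) is true, then return value.

    Raises TimeoutError if the condition is not met within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        value = fetch()
        if is_done(value):
            return value
        if clock() >= deadline:
            raise TimeoutError("condition not met before the timeout elapsed")
        sleep(interval)
```

For example, fetch could be lambda: task.get(task_id=12345, cluster_name="slurm-cn"), with is_done inspecting whichever status attribute the returned model exposes.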

Cancel a task

from mlops import Task

task = Task()
task.cancel(task_id=12345, cluster_name="slurm-cn")

Delete a task

from mlops import Task

task = Task()
task.delete(task_id=12345, cluster_name="slurm-cn")

Task Management Methods:

  • submit() - Submit a new task with container image and entry command
  • get() - Get task details by task ID
  • list() - List tasks with optional filters (status, cluster_name, team_id, user_id)
  • cancel() - Cancel a running task
  • delete() - Delete a task record

Task Status Values:

from mlops.api.client.models.task_status import TaskStatus

TaskStatus.PENDING      # Task is pending
TaskStatus.QUEUED       # Task is queued
TaskStatus.RUNNING      # Task is running
TaskStatus.COMPLETED    # Task completed successfully
TaskStatus.SUCCEEDED    # Task succeeded
TaskStatus.FAILED       # Task failed
TaskStatus.CANCELLED    # Task was cancelled
TaskStatus.CREATED      # Task was created
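
When reacting to a status, it often helps to separate states that can still change from states that are final. The grouping below is an assumption inferred from the comments above, not something the SDK defines, and it works on member names so it accepts either a TaskStatus member or a plain string.

```python
# Assumed grouping, inferred from the status descriptions above.
TERMINAL_STATUSES = frozenset({"COMPLETED", "SUCCEEDED", "FAILED", "CANCELLED"})
ACTIVE_STATUSES = frozenset({"CREATED", "PENDING", "QUEUED", "RUNNING"})


def status_name(status):
    """Return the upper-case name of an enum member or of a plain string."""
    return str(getattr(status, "name", status)).upper()


def is_terminal(status):
    """True if the status is one of the assumed final states."""
    return status_name(status) in TERMINAL_STATUSES
```

is_terminal(TaskStatus.FAILED) and is_terminal("failed") would both be expected to return True under this grouping.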

Error Handling:

from mlops.exceptions import (
    APIException,
    AuthenticationException,
    NotFoundException,
    RateLimitException,
    TimeoutException,
    InvalidArgumentException,
    NotEnoughSpaceException
)
from mlops import Task

task = Task()

try:
    result = task.submit(
        name="test",
        cluster_name="slurm-cn",
        image="docker://alpine:3.23.0",
        entry_command="echo hello",
    )
except AuthenticationException as e:
    print(f"Authentication failed: {e}")
except NotFoundException as e:
    print(f"Resource not found: {e}")
except APIException as e:
    print(f"API error: {e}")

[!TIP] Error Handling: SDKs automatically parse typed responses and raise structured exceptions.
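
The import list above includes RateLimitException and TimeoutException, which the example does not catch. The SDK is described as retrying requests on its own, so an application-level wrapper is only a fallback for cases where those exceptions still surface. The sketch below is one such wrapper with exponential backoff; call_with_retry is a hypothetical helper, and the attempt count and delays are illustrative.

```python
import time


def call_with_retry(fn, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exception types with exponential backoff.

    The final failure is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A call might look like call_with_retry(lambda: task.get(task_id=12345, cluster_name="slurm-cn"), (RateLimitException, TimeoutException)). Wrapping read calls such as get() or list() is the safer choice, since retrying submit() after a timeout could create a duplicate task.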

Features

  • Type-safe API clients
  • Automatic authentication
  • Error handling
  • Typed response parsing (generated models)
  • Unexpected-status guard (optional)


