MLOps Python SDK for XCloud Service API

SDK

Software Development Kits for integrating with the XCloud Service API.

Note (SDK Support): SDKs provide type-safe, high-level interfaces for interacting with the platform API. They handle authentication, error handling, and request retries automatically.

Installation

Install the Python SDK with pip:

pip install mlops-python-sdk

Configuration

The SDK reads configuration from environment variables by default:

  • MLOPS_API_KEY: API key (required)
  • MLOPS_DOMAIN: API domain, e.g. localhost:8090 or https://example.com
  • MLOPS_API_PATH: API path prefix (default: /api/v1)
  • MLOPS_DEBUG: true|false (default: false)

Or configure in code:

from mlops import ConnectionConfig, Task

config = ConnectionConfig(
    api_key="xck_...",
    domain="https://example.com",
    api_path="/api/v1",
    debug=False,
)
task = Task(config=config)
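
The environment variables and defaults listed above can be mirrored in a few lines of plain Python, which is handy for checking a deployment's environment before creating a client. This is only a sketch: `EnvSettings` and `load_settings` are hypothetical names, not part of the SDK, and the SDK's own parsing (for example, how it interprets `MLOPS_DEBUG`) may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical stand-in for the SDK's ConnectionConfig (illustration only)."""
    api_key: str
    domain: str
    api_path: str = "/api/v1"
    debug: bool = False


def load_settings(env: Optional[Mapping[str, str]] = None) -> EnvSettings:
    """Read the documented MLOPS_* variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("MLOPS_API_KEY", "")
    if not api_key:
        # MLOPS_API_KEY is the only variable documented as required.
        raise ValueError("MLOPS_API_KEY is required")
    return EnvSettings(
        api_key=api_key,
        domain=env.get("MLOPS_DOMAIN", ""),
        api_path=env.get("MLOPS_API_PATH", "/api/v1"),
        # Assumption: only the literal string "true" enables debug mode.
        debug=env.get("MLOPS_DEBUG", "false").strip().lower() == "true",
    )
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to exercise in tests without touching the real environment.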

SDK Usage

Initialize client

from mlops import Task

task = Task()  # uses environment variables by default

Submit a GPU task

from mlops import Task

task = Task()
resp = task.submit(
    name="gpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="/mnt/minio/images/01ai-registry.cn-shanghai.cr.aliyuncs.com+public+llamafactory+0.9.3.sqsh",
    entry_command="llamafactory-cli train /workspace/config/test_lora.yaml",
    resources={
        "partition": "gpu",
        "nodes": 2,
        "ntasks": 2,
        "cpus_per_task": 2,
        "memory": "4G",
        "time": "01:00:00",
        "gres": "gpu:nvidia_a10:1",
        "qos": "qos_xcloud",
    },
    file_path="/path/to/xservice.zip",  # optional: .zip/.tar.gz/.tgz
)
print(resp.job_id)
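
The `file_path` comment above names three archive extensions (.zip, .tar.gz, .tgz). A minimal sketch of preparing such an archive with the standard library follows; `is_supported_archive` and `zip_directory` are hypothetical helpers, and whether the SDK performs its own extension check is not stated in this document.

```python
import shutil

# Extensions taken from the file_path comment in the example above.
ALLOWED_SUFFIXES = (".zip", ".tar.gz", ".tgz")


def is_supported_archive(path: str) -> bool:
    """Return True if the path ends with one of the documented extensions."""
    return path.lower().endswith(ALLOWED_SUFFIXES)


def zip_directory(src_dir: str, dest_stem: str) -> str:
    """Zip the contents of src_dir into '<dest_stem>.zip' and return that path."""
    # shutil.make_archive appends the extension and returns the archive's path.
    return shutil.make_archive(dest_stem, "zip", root_dir=src_dir)
```

The returned path could then be passed as `file_path` to `task.submit()`.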

Submit a CPU task

from mlops import Task

task = Task()
resp = task.submit(
    name="cpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="docker://01ai-registry.cn-shanghai.cr.aliyuncs.com/01-ai/xcs/v2/alpine:3.23.0",
    entry_command="echo hello",
    resources={
        "partition": "cpu",
        "nodes": 1,
        "ntasks": 1,
        "cpus_per_task": 1,
        "memory": "1G",
        "time": "01:00:00",
        "qos": "qos_xcloud",
    },
)
print(resp.job_id)
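
Both submit examples pass a `resources` dictionary with the same keys. A small builder can keep those dictionaries consistent and catch obvious mistakes before submission. This is a sketch under assumptions: `build_resources` is a hypothetical helper, its defaults are copied from the CPU example, and the `HH:MM:SS` check only covers the time format shown in this document (the server may accept other forms).

```python
import re
from typing import Any, Dict, Optional

# Matches the HH:MM:SS form used in the examples; other formats are not covered.
_TIME_RE = re.compile(r"^\d{1,3}:[0-5]\d:[0-5]\d$")


def build_resources(
    partition: str,
    nodes: int = 1,
    ntasks: int = 1,
    cpus_per_task: int = 1,
    memory: str = "1G",
    time: str = "01:00:00",
    qos: str = "qos_xcloud",
    gres: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a resources dict using the keys shown in the submit examples."""
    for label, value in (("nodes", nodes), ("ntasks", ntasks), ("cpus_per_task", cpus_per_task)):
        if value < 1:
            raise ValueError(f"{label} must be at least 1, got {value}")
    if not _TIME_RE.match(time):
        raise ValueError(f"time must look like HH:MM:SS, got {time!r}")
    resources: Dict[str, Any] = {
        "partition": partition,
        "nodes": nodes,
        "ntasks": ntasks,
        "cpus_per_task": cpus_per_task,
        "memory": memory,
        "time": time,
        "qos": qos,
    }
    if gres is not None:
        # Only GPU jobs in the examples carry a "gres" entry.
        resources["gres"] = gres
    return resources
```

For instance, `build_resources("gpu", nodes=2, ntasks=2, cpus_per_task=2, memory="4G", gres="gpu:nvidia_a10:1")` reproduces the dictionary from the GPU example.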

List tasks

from mlops import Task
from mlops.api.client.models.task_status import TaskStatus

task = Task()
resp = task.list(status=TaskStatus.COMPLETED, cluster_name="slurm-cn", page=1, page_size=20)
print(len(resp.tasks or []))
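
Because `list()` takes `page` and `page_size`, collecting every task means walking the pages. The generator below sketches that loop without depending on the SDK: `iter_all_tasks` is a hypothetical helper, and it assumes pages are 1-based (as in the example above) and that a page shorter than `page_size` marks the end. If the response exposes a total count, that would be a more reliable stop condition.

```python
from typing import Any, Callable, Iterator, List, Optional


def iter_all_tasks(
    fetch_page: Callable[[int, int], Optional[List[Any]]],
    page_size: int = 20,
) -> Iterator[Any]:
    """Yield items from successive pages until a short (or empty) page appears."""
    page = 1  # assumption: the API's pages start at 1
    while True:
        tasks = fetch_page(page, page_size) or []  # tolerate None, like `resp.tasks or []`
        yield from tasks
        if len(tasks) < page_size:
            return
        page += 1


# Possible wiring with the SDK (not executed here):
#   fetch = lambda page, size: task.list(cluster_name="slurm-cn", page=page, page_size=size).tasks
#   for t in iter_all_tasks(fetch):
#       print(t)
```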

Get task details

from mlops import Task

task = Task()
task_info = task.get(task_id=12345, cluster_name="slurm-cn")
print(task_info)

Cancel a task

from mlops import Task

task = Task()
task.cancel(task_id=12345, cluster_name="slurm-cn")

Delete a task

from mlops import Task

task = Task()
task.delete(task_id=12345, cluster_name="slurm-cn")

Task Management Methods:

  • submit() - Submit a new task with container image and entry command
  • get() - Get task details by task ID
  • list() - List tasks with optional filters (status, cluster_name, team_id, user_id)
  • cancel() - Cancel a running task
  • delete() - Delete a task record

Task Status Values:

from mlops.api.client.models.task_status import TaskStatus

TaskStatus.PENDING      # Task is pending
TaskStatus.QUEUED       # Task is queued
TaskStatus.RUNNING      # Task is running
TaskStatus.COMPLETED    # Task completed successfully
TaskStatus.SUCCEEDED    # Task succeeded
TaskStatus.FAILED       # Task failed
TaskStatus.CANCELLED    # Task was cancelled
TaskStatus.CREATED      # Task was created
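
A common use of these status values is waiting for a task to finish. The polling loop below is a sketch only: `wait_for_terminal` is a hypothetical helper, the choice of COMPLETED, SUCCEEDED, FAILED, and CANCELLED as terminal states is an assumption based on their descriptions above, and the caller supplies `get_status` because this document does not show which field of the `task.get()` result holds the status.

```python
import time
from typing import Callable

# Assumption: these four values mean the task will not change state again.
TERMINAL_STATUSES = frozenset({"COMPLETED", "SUCCEEDED", "FAILED", "CANCELLED"})


def wait_for_terminal(
    get_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status() until it returns a terminal status name or time runs out."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still {status} after {timeout} seconds")
        sleep(interval)
```

The `sleep` and `clock` parameters exist so the loop can be tested without real delays.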

Error Handling:

from mlops.exceptions import (
    APIException,
    AuthenticationException,
    NotFoundException,
    RateLimitException,
    TimeoutException,
    InvalidArgumentException,
    NotEnoughSpaceException
)
from mlops import Task

task = Task()

try:
    result = task.submit(
        name="test",
        cluster_name="slurm-cn",
        image="docker://alpine:3.23.0",
        entry_command="echo hello",
    )
except AuthenticationException as e:
    print(f"Authentication failed: {e}")
except NotFoundException as e:
    print(f"Resource not found: {e}")
except APIException as e:
    print(f"API error: {e}")

Tip (Error Handling): SDKs automatically parse typed responses and raise structured exceptions.

Features

  • Type-safe API clients
  • Automatic authentication
  • Error handling
  • Typed response parsing (generated models)
  • Unexpected-status guard (optional)

