MLOps Python SDK for XCloud Service API

SDK

Software Development Kits for integrating with the XCloud Service API.

Note (SDK Support): SDKs provide type-safe, high-level interfaces for interacting with the platform API. They handle authentication, error handling, and request retries automatically.

Installation

Install the Python SDK with pip:

pip install mlops-python-sdk

Configuration

The SDK reads configuration from environment variables by default:

  • MLOPS_API_KEY: API key (required)
  • MLOPS_DOMAIN: API domain, e.g. localhost:8090 or https://example.com
  • MLOPS_API_PATH: API path prefix (default: /api/v1)
  • MLOPS_DEBUG: true|false (default: false)

Or configure in code:

from mlops import ConnectionConfig, Task

config = ConnectionConfig(
    api_key="xck_...",
    domain="https://example.com",
    api_path="/api/v1",
    debug=False,
)
task = Task(config=config)
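
To make the environment-variable defaults listed above concrete, here is a minimal sketch of a reader that applies them. `EnvSettings` and `read_mlops_env` are hypothetical names used only for illustration; they are not part of the SDK, and the SDK's own parsing (for example, how it treats values of MLOPS_DEBUG other than true or false) may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for illustration; not the SDK's ConnectionConfig."""
    api_key: str
    domain: Optional[str]
    api_path: str
    debug: bool


def read_mlops_env(environ: Optional[Mapping[str, str]] = None) -> EnvSettings:
    """Read the four documented variables and apply the documented defaults."""
    env = os.environ if environ is None else environ
    api_key = env.get("MLOPS_API_KEY", "").strip()
    if not api_key:
        # MLOPS_API_KEY is the only variable the README marks as required.
        raise ValueError("MLOPS_API_KEY is required")
    return EnvSettings(
        api_key=api_key,
        domain=env.get("MLOPS_DOMAIN") or None,
        api_path=env.get("MLOPS_API_PATH", "/api/v1"),
        # Assumption: only the literal "true" (any case) enables debug mode.
        debug=env.get("MLOPS_DEBUG", "false").strip().lower() == "true",
    )
```

A check like this can be run before constructing Task() to fail early with a clear message when the API key is missing.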

SDK Usage

Initialize client

from mlops import Task

task = Task()  # uses environment variables by default

Submit a GPU task

from mlops import Task

task = Task()
resp = task.submit(
    name="gpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="/mnt/minio/images/01ai-registry.cn-shanghai.cr.aliyuncs.com+public+llamafactory+0.9.3.sqsh",
    entry_command="llamafactory-cli train /workspace/config/test_lora.yaml",
    resources={
        "partition": "gpu",
        "nodes": 2,
        "ntasks": 2,
        "cpus_per_task": 2,
        "memory": "4G",
        "time": "01:00:00",
        "gres": "gpu:nvidia_a10:1",
        "qos": "qos_xcloud",
    },
    file_path="/path/to/xservice.zip",  # optional: .zip/.tar.gz/.tgz
)
print(resp.job_id)
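
The file_path comment above lists three accepted archive suffixes. A small pre-flight check, sketched below, can catch an unsupported file before the upload is attempted. `is_supported_archive` is a hypothetical helper, not an SDK function, and treating the suffix as case-insensitive is an assumption.

```python
from pathlib import Path

# Suffixes taken from the inline comment in the submit example.
ALLOWED_SUFFIXES = (".zip", ".tar.gz", ".tgz")


def is_supported_archive(file_path: str) -> bool:
    """Return True if the file name ends with one of the listed suffixes."""
    # Assumption: suffix matching is case-insensitive.
    name = Path(file_path).name.lower()
    return name.endswith(ALLOWED_SUFFIXES)
```

For example, is_supported_archive("/path/to/xservice.zip") passes, while a plain .tar file does not.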

Submit a CPU task

from mlops import Task

task = Task()
resp = task.submit(
    name="cpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="docker://01ai-registry.cn-shanghai.cr.aliyuncs.com/01-ai/xcs/v2/alpine:3.23.0",
    entry_command="echo hello",
    resources={
        "partition": "cpu",
        "nodes": 1,
        "ntasks": 1,
        "cpus_per_task": 1,
        "memory": "1G",
        "time": "01:00:00",
        "qos": "qos_xcloud",
    },
)
print(resp.job_id)
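
The resources keys in both examples resemble Slurm sbatch options. The README does not say how the service translates them, so the mapping below is only an assumed correspondence, offered as a way to reason about what each key requests. `SBATCH_FLAGS` and `to_sbatch_args` are hypothetical names.

```python
from typing import Any, Dict, List

# Assumed correspondence between resource keys and sbatch long options.
SBATCH_FLAGS = {
    "partition": "--partition",
    "nodes": "--nodes",
    "ntasks": "--ntasks",
    "cpus_per_task": "--cpus-per-task",
    "memory": "--mem",
    "time": "--time",
    "gres": "--gres",
    "qos": "--qos",
}


def to_sbatch_args(resources: Dict[str, Any]) -> List[str]:
    """Render a resources dict as sbatch-style arguments, rejecting unknown keys."""
    unknown = sorted(set(resources) - set(SBATCH_FLAGS))
    if unknown:
        raise KeyError(f"unrecognized resource keys: {unknown}")
    # Dict insertion order is preserved, so the output follows the input order.
    return [f"{SBATCH_FLAGS[key]}={value}" for key, value in resources.items()]
```

Rejecting unknown keys makes a typo such as "cpu_per_task" visible locally instead of after submission.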

List tasks

from mlops import Task
from mlops.api.client.models.task_status import TaskStatus

task = Task()
resp = task.list(status=TaskStatus.COMPLETED, cluster_name="slurm-cn", page=1, page_size=20)
print(len(resp.tasks or []))
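
Because list() takes page and page_size, collecting every matching task means walking the pages. The generator below is a minimal sketch of that loop. It takes a caller-supplied `fetch_page` callable rather than calling the SDK directly, and it assumes that pages are numbered from 1 and that a page shorter than page_size is the last one; neither is stated in this README.

```python
from typing import Any, Callable, Iterator, List


def iter_all_tasks(
    fetch_page: Callable[[int, int], List[Any]],
    page_size: int = 20,
) -> Iterator[Any]:
    """Yield items from successive pages until a short (or empty) page is seen."""
    page = 1  # Assumption: pages are 1-indexed, as in the example above.
    while True:
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:
            break
        page += 1
```

With the SDK, fetch_page could be written as lambda page, page_size: task.list(cluster_name="slurm-cn", page=page, page_size=page_size).tasks or [].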

Get task details

from mlops import Task

task = Task()
task_info = task.get(task_id=12345, cluster_name="slurm-cn")
print(task_info)

Cancel a task

from mlops import Task

task = Task()
task.cancel(task_id=12345, cluster_name="slurm-cn")

Delete a task

from mlops import Task

task = Task()
task.delete(task_id=12345, cluster_name="slurm-cn")

Task Management Methods:

  • submit() - Submit a new task with container image and entry command
  • get() - Get task details by task ID
  • list() - List tasks with optional filters (status, cluster_name, team_id, user_id) and pagination (page, page_size)
  • cancel() - Cancel a running task
  • delete() - Delete a task record

Task Status Values:

from mlops.api.client.models.task_status import TaskStatus

TaskStatus.PENDING      # Task is pending
TaskStatus.QUEUED       # Task is queued
TaskStatus.RUNNING      # Task is running
TaskStatus.COMPLETED    # Task completed successfully
TaskStatus.SUCCEEDED    # Task succeeded
TaskStatus.FAILED       # Task failed
TaskStatus.CANCELLED    # Task was cancelled
TaskStatus.CREATED      # Task was created

Error Handling:

from mlops.exceptions import (
    APIException,
    AuthenticationException,
    NotFoundException,
    RateLimitException,
    TimeoutException,
    InvalidArgumentException,
    NotEnoughSpaceException
)
from mlops import Task

task = Task()

try:
    result = task.submit(
        name="test",
        cluster_name="slurm-cn",
        image="docker://alpine:3.23.0",
        entry_command="echo hello",
    )
except AuthenticationException as e:
    print(f"Authentication failed: {e}")
except NotFoundException as e:
    print(f"Resource not found: {e}")
except APIException as e:
    print(f"API error: {e}")

Tip (Error Handling): SDKs automatically parse responses into typed models and raise structured exceptions.
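
RateLimitException and TimeoutException are imported above but not handled in the example. If a caller wants an additional retry layer for such transient errors, a generic wrapper with exponential backoff is one option, sketched below. `call_with_retry` is a hypothetical helper, not part of the SDK, and whether it is needed at all depends on how the SDK's built-in retries behave, which this README does not specify.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exception types with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error unchanged.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the SDK this might be used as call_with_retry(lambda: task.get(task_id=12345, cluster_name="slurm-cn"), retry_on=(RateLimitException, TimeoutException)). Exceptions outside retry_on, such as AuthenticationException, propagate immediately.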

Features

  • Type-safe API clients
  • Automatic authentication
  • Error handling
  • Typed response parsing (generated models)
  • Unexpected-status guard (optional)
