Python SDK for HPC-AI cloud GPU fine-tuning

HPC-AI Python SDK

Overview

The HPC-AI Python SDK provides a powerful interface for distributed GPU training and fine-tuning on HPC-AI's cloud infrastructure.

Installation

We recommend installing the SDK inside a dedicated conda environment.

conda create -n hpcai python=3.12 -y
conda activate hpcai
git clone https://github.com/hpcaitech/HPC-AI-SDK
cd HPC-AI-SDK
pip install .

Currently, only installation from source is supported. An official pip package will be released soon.

Quick Start

from hpcai import ServiceClient, TrainingClient

# Initialize the service client
client = ServiceClient(
    base_url="https://hpc-ai.com",
    api_key="your-api-key"
)

# Create a training client for LoRA fine-tuning
training_client = client.create_lora_training_client(
    base_model="Qwen/Qwen2.5-7B",
    rank=8,
    seed=42
)

Migration Guide

If you previously used another training SDK, first install this SDK from the root of the cloned repository:

pip install .

Update your imports:

from hpcai import ServiceClient

Path Protocol

The SDK uses the hpcai:// protocol for model and checkpoint paths:

model_path = "hpcai://run-123/weights/checkpoint-001"

Note: Legacy path protocols are supported during the migration period for backward compatibility.
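
As an illustration of how such a path can be taken apart, here is a minimal sketch using only the standard library. The helper name parse_hpcai_path and the HpcaiPath tuple are hypothetical and not part of the SDK, and reading the first component as a run identifier is an assumption based on the example path above.

```python
from typing import NamedTuple, Tuple
from urllib.parse import urlsplit


class HpcaiPath(NamedTuple):
    """Hypothetical container for the pieces of an hpcai:// path."""
    run_id: str              # assumed meaning of the first component
    segments: Tuple[str, ...]  # remaining path components, in order


def parse_hpcai_path(path: str) -> HpcaiPath:
    """Split an hpcai:// path into its first component and the rest.

    This is an illustrative helper, not an SDK function.
    """
    parts = urlsplit(path)
    if parts.scheme != "hpcai":
        raise ValueError(f"expected an hpcai:// path, got: {path!r}")
    if not parts.netloc:
        raise ValueError(f"path has no leading component: {path!r}")
    # Drop empty strings produced by leading or doubled slashes.
    segments = tuple(s for s in parts.path.split("/") if s)
    return HpcaiPath(run_id=parts.netloc, segments=segments)
```

Calling parse_hpcai_path on the example path yields the first component "run-123" and the remaining segments "weights" and "checkpoint-001". Paths with another scheme are rejected, so legacy protocols would need their own handling.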

Environment Variables

Configure the SDK using these environment variables:

  • HPCAI_API_KEY - Your API key
  • HPCAI_BASE_URL - API endpoint (default: https://hpc-ai.com)
  • HPCAI_TELEMETRY - Enable/disable telemetry (default: enabled)

Legacy environment variable names are supported for backward compatibility.
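
The sketch below shows one way an application might read these variables itself before constructing a client. The HpcaiSettings class and load_settings function are hypothetical names, not SDK APIs, and the set of values treated as "telemetry disabled" is an assumption, since the accepted values are not listed here.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://hpc-ai.com"  # default stated in the list above


@dataclass(frozen=True)
class HpcaiSettings:
    """Hypothetical settings holder; not part of the SDK."""
    api_key: str
    base_url: str
    telemetry_enabled: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> HpcaiSettings:
    """Read the three documented variables from the environment."""
    env = os.environ if env is None else env
    api_key = env.get("HPCAI_API_KEY")
    if not api_key:
        raise RuntimeError("HPCAI_API_KEY is not set")
    base_url = env.get("HPCAI_BASE_URL", DEFAULT_BASE_URL)
    # Assumption: these strings mean "disabled"; anything else means enabled.
    raw = env.get("HPCAI_TELEMETRY", "").strip().lower()
    telemetry_enabled = raw not in {"0", "false", "off", "no", "disabled"}
    return HpcaiSettings(api_key, base_url, telemetry_enabled)
```

The resulting api_key and base_url could then be passed to ServiceClient as in the Quick Start, which keeps the key out of source code.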

Features

  • Distributed Training: Leverage HPC-AI's GPU cloud for efficient model training
  • LoRA Fine-tuning: Memory-efficient fine-tuning with LoRA adapters
  • Async Support: Full async/await support for concurrent operations
  • Type Safety: Comprehensive type hints for better IDE support
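
To illustrate what concurrent operations with async/await look like in general, here is a minimal sketch built on the standard asyncio module. It does not call the SDK: fake_remote_call is a stand-in coroutine, and the SDK's actual async method names are not shown in this document.

```python
import asyncio
from typing import Iterable, List, Tuple


async def fake_remote_call(name: str, delay: float) -> str:
    """Stand-in for an awaitable SDK operation (hypothetical)."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"{name}: done"


async def run_concurrently(jobs: Iterable[Tuple[str, float]]) -> List[str]:
    """Start all jobs at once and collect results in submission order."""
    return await asyncio.gather(*(fake_remote_call(n, d) for n, d in jobs))


if __name__ == "__main__":
    results = asyncio.run(run_concurrently([("job-a", 0.2), ("job-b", 0.1)]))
    print(results)
```

Because asyncio.gather preserves argument order, the results line up with the submitted jobs even when a later job finishes first.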

Usage Example

A usage example for fine-tuning the "Qwen3-8B" model.

Documentation

API Reference

Third-Party Notice

This SDK provides interoperability with components based on the Tinker project (Apache License 2.0). Tinker is a trademark of its respective owner. This project is not affiliated with or endorsed by Thinking Machines Lab.

License

Licensed under the Apache License, Version 2.0. See LICENSE file for details.

Download files

Source Distribution

hpcai-0.1.6.tar.gz (286.6 kB)

Uploaded Source

Built Distribution

hpcai-0.1.6-py3-none-any.whl (183.5 kB)

Uploaded Python 3

File details

Details for the file hpcai-0.1.6.tar.gz.

File metadata

  • Download URL: hpcai-0.1.6.tar.gz
  • Upload date:
  • Size: 286.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for hpcai-0.1.6.tar.gz

Algorithm    Hash digest
SHA256       4ffa538dd7cad889700fa0b4d2e2866f3624edde40cae7bc642822f78f2667e4
MD5          4cbf538eaa53a51ff03c3998f18d6002
BLAKE2b-256  49e6ff8cf3852d3fe9ff242f0aead866690adf18877d055a40d9d7e78c2a50fe
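
A downloaded file can be checked against the SHA256 digest listed above with the standard hashlib module. The following is a minimal sketch; the function names are illustrative, and the local file location is up to the reader.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for hpcai-0.1.6.tar.gz
EXPECTED_SHA256 = "4ffa538dd7cad889700fa0b4d2e2866f3624edde40cae7bc642822f78f2667e4"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_digest("hpcai-0.1.6.tar.gz", EXPECTED_SHA256) returns True only if the local file is byte-for-byte identical to the published one.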

File details

Details for the file hpcai-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: hpcai-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 183.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for hpcai-0.1.6-py3-none-any.whl

Algorithm    Hash digest
SHA256       d7f11e73e1ffac63ed8ca15be484210d5e7d3bcabbb6fc32204fe225c0aa09a0
MD5          fe022993a7177dd5ae8a786cc10c8f0c
BLAKE2b-256  ff113bcd8f6b1a81bcca20b0558ec0bd85ded56027d5f1037383cbbf5c7c08d2
