Python SDK for HPC-AI cloud GPU fine-tuning
HPC-AI Python SDK
Overview
The HPC-AI Python SDK provides a powerful interface for distributed GPU training and fine-tuning on HPC-AI's cloud infrastructure.
Installation
We recommend installing the SDK in a conda environment.
conda create -n hpcai python=3.12 -y
conda activate hpcai
pip install hpcai
Quick Start
from hpcai import ServiceClient, TrainingClient

# Initialize the service client
client = ServiceClient(
    base_url="https://www.hpc-ai.com/finetunesdk",
    api_key="your-api-key"
)

# Create a training client for LoRA fine-tuning
training_client = client.create_lora_training_client(
    base_model="Qwen/Qwen2.5-7B",
    rank=8,
    seed=42
)
Path Protocol
The SDK uses the hpcai:// protocol for model and checkpoint paths:
model_path = "hpcai://run-123/training/checkpoint-001"
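As a rough illustration of how such a path can be taken apart, the sketch below splits an hpcai:// path with Python's standard urllib.parse module. The reading of the pieces (a run identifier followed by path segments) is an assumption inferred from the example path above, not documented SDK behavior, and parse_hpcai_path is a hypothetical helper that is not part of the SDK.

```python
from urllib.parse import urlsplit


def parse_hpcai_path(path: str) -> tuple[str, list[str]]:
    """Split an hpcai:// path into (run_id, remaining path segments).

    Hypothetical helper: the layout is inferred from the example path only.
    """
    parts = urlsplit(path)
    if parts.scheme != "hpcai" or not parts.netloc:
        raise ValueError(f"not an hpcai:// path: {path!r}")
    # Drop empty segments produced by the leading slash.
    segments = [s for s in parts.path.split("/") if s]
    return parts.netloc, segments


run_id, segments = parse_hpcai_path("hpcai://run-123/training/checkpoint-001")
print(run_id, segments)
```

Validating the scheme up front gives a clear error when a local file path or an https:// URL is passed where a checkpoint path is expected.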
Environment Variables
Configure the SDK using these environment variables:
- HPCAI_API_KEY - Your API key
- HPCAI_BASE_URL - API endpoint (default: https://www.hpc-ai.com/finetunesdk)
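The lookup these variables imply can be sketched as follows. This is only an illustration of reading the two variables with the documented default; load_hpcai_settings is a hypothetical helper, and how the SDK itself resolves these variables internally is not shown here.

```python
import os

# Default endpoint as stated in the list above.
DEFAULT_BASE_URL = "https://www.hpc-ai.com/finetunesdk"


def load_hpcai_settings(env=None) -> dict:
    """Read HPCAI_API_KEY and HPCAI_BASE_URL (hypothetical helper, not SDK API)."""
    env = os.environ if env is None else env
    api_key = env.get("HPCAI_API_KEY")
    if not api_key:
        raise RuntimeError("HPCAI_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": env.get("HPCAI_BASE_URL", DEFAULT_BASE_URL),
    }


# A plain dict stands in for the process environment in this example.
settings = load_hpcai_settings({"HPCAI_API_KEY": "your-api-key"})
print(settings["base_url"])
```

The resulting values could be passed to ServiceClient(base_url=..., api_key=...) as in the Quick Start, which keeps the key out of source files.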
Features
- Distributed Training: Leverage HPC-AI's GPU cloud for efficient model training
- LoRA Fine-tuning: Memory-efficient fine-tuning with LoRA adapters
- Async Support: Full async/await support for concurrent operations
- Type Safety: Comprehensive type hints for better IDE support
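To make the memory-efficiency point about LoRA concrete, the short calculation below compares the trainable parameters of a full weight matrix with those of a rank-8 LoRA adapter on the same matrix. The 4096 x 4096 shape is an illustrative assumption and is not taken from any particular model supported by the SDK; only the rank matches the Quick Start.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters added by one LoRA adapter.

    LoRA learns two low-rank factors, B (d_out x rank) and A (rank x d_in),
    while the original d_out x d_in weight stays frozen.
    """
    return rank * (d_out + d_in)


d_out = d_in = 4096  # illustrative layer shape (assumption)
full = d_out * d_in  # parameters updated by full fine-tuning of this layer
lora = lora_trainable_params(d_out, d_in, rank=8)

print(f"full: {full}, lora: {lora}, ratio: {lora / full:.2%}")
```

Because the adapter size grows linearly with the rank, raising rank trades memory for capacity in a predictable way.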
Usage Example
A usage example for fine-tuning the Qwen3-8B model.
Cookbook
We provide a cookbook with examples of using the SDK to train your models. The code is in the HPC-AI-SDK repository.
Clone the repository to explore the cookbook in more detail:
git clone https://github.com/hpcaitech/HPC-AI-SDK
cd HPC-AI-SDK/src/hpcai/cookbook
Documentation
API Reference
- ServiceClient API Reference - Main entry point for creating clients and querying server capabilities
- TrainingClient API Reference - Training operations including forward/backward passes and optimization
- RestClient API Reference - REST API operations for querying training runs and checkpoints
Development
This repository uses pre-commit for basic formatting and hygiene checks.
pip install -r requirements-dev.txt
pre-commit install
pre-commit run -a
Third-Party Notice
This SDK provides interoperability with components based on the Tinker project (Apache License 2.0). Tinker is a trademark of its respective owner. This project is not affiliated with or endorsed by Thinking Machines Lab.
License
Licensed under the Apache License, Version 2.0. See the LICENSE file for details.