Python SDK for HPC-AI cloud GPU fine-tuning
HPC-AI Python SDK
Overview
The HPC-AI Python SDK provides a powerful interface for distributed GPU training and fine-tuning on HPC-AI's cloud infrastructure.
Installation
We recommend using conda to create an isolated environment for the SDK.
conda create -n hpcai python=3.12 -y
conda activate hpcai
git clone https://github.com/hpcaitech/HPC-AI-SDK
cd HPC-AI-SDK
pip install .
Installation from source is currently the only supported method; an official pip package will be released soon.
Quick Start
from hpcai import ServiceClient, TrainingClient
# Initialize the service client
client = ServiceClient(
    base_url="https://hpc-ai.com",
    api_key="your-api-key",
)
# Create a training client for LoRA fine-tuning
training_client = client.create_lora_training_client(
    base_model="Qwen/Qwen2.5-7B",
    rank=8,
    seed=42,
)
Migration Guide
If you previously used another training SDK, first install this SDK from the repository root:
pip install .
Update your imports:
from hpcai import ServiceClient
Path Protocol
The SDK uses the hpcai:// protocol for model and checkpoint paths:
model_path = "hpcai://run-123/weights/checkpoint-001"
Note: Legacy path protocols are supported during the migration period for backward compatibility.
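As an illustration of the path format, the following sketch splits an hpcai:// path into its parts with the standard library. The helper name `split_hpcai_path` is hypothetical and not part of the SDK, and treating the first component as a run identifier is an assumption based only on the example path above. Legacy protocols are not handled here.

```python
from urllib.parse import urlsplit


def split_hpcai_path(path: str) -> tuple[str, list[str]]:
    """Split an hpcai:// path into (first component, remaining segments).

    Hypothetical helper for illustration; the SDK may parse paths differently.
    """
    parts = urlsplit(path)
    if parts.scheme != "hpcai":
        raise ValueError(f"expected an hpcai:// path, got: {path!r}")
    # Drop empty segments produced by leading or trailing slashes.
    segments = [segment for segment in parts.path.split("/") if segment]
    return parts.netloc, segments


# Assumption: the first component identifies the training run.
run_id, remainder = split_hpcai_path("hpcai://run-123/weights/checkpoint-001")
print(run_id, remainder)
```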
Environment Variables
Configure the SDK using these environment variables:
- HPCAI_API_KEY - Your API key
- HPCAI_BASE_URL - API endpoint (default: https://hpc-ai.com)
- HPCAI_TELEMETRY - Enable/disable telemetry (default: enabled)
Legacy environment variable names are supported for backward compatibility.
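The SDK presumably reads these variables itself; the sketch below only shows how a script might collect them explicitly, for example to fail early when the key is missing. The helper `load_hpcai_settings` is hypothetical and not part of the SDK. `HPCAI_TELEMETRY` is left out because its accepted values are not documented here.

```python
import os
from typing import Mapping, Optional

# Default endpoint as stated in the variable list.
DEFAULT_BASE_URL = "https://hpc-ai.com"


def load_hpcai_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the API key and base URL from environment variables.

    Hypothetical helper for illustration; not part of the SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("HPCAI_API_KEY")
    if not api_key:
        raise RuntimeError("HPCAI_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": env.get("HPCAI_BASE_URL", DEFAULT_BASE_URL),
    }
```

The returned keys match the `base_url` and `api_key` keyword arguments shown in Quick Start, so the result could be passed as `ServiceClient(**load_hpcai_settings())`.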
Features
- Distributed Training: Leverage HPC-AI's GPU cloud for efficient model training
- LoRA Fine-tuning: Memory-efficient fine-tuning with LoRA adapters
- Async Support: Full async/await support for concurrent operations
- Type Safety: Comprehensive type hints for better IDE support
Usage Example
A usage example for fine-tuning the Qwen3-8B model.
Documentation
API Reference
- ServiceClient API Reference - Main entry point for creating clients and querying server capabilities
- TrainingClient API Reference - Training operations including forward/backward passes and optimization
- RestClient API Reference - REST API operations for querying training runs and checkpoints
Third-Party Notice
This SDK provides interoperability with components based on the Tinker project (Apache License 2.0). Tinker is a trademark of its respective owner. This project is not affiliated with or endorsed by Thinking Machines Lab.
License
Licensed under the Apache License, Version 2.0. See LICENSE file for details.
Download files
File details
Details for the file hpcai-0.1.6.tar.gz.
File metadata
- Download URL: hpcai-0.1.6.tar.gz
- Upload date:
- Size: 286.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4ffa538dd7cad889700fa0b4d2e2866f3624edde40cae7bc642822f78f2667e4 |
| MD5 | 4cbf538eaa53a51ff03c3998f18d6002 |
| BLAKE2b-256 | 49e6ff8cf3852d3fe9ff242f0aead866690adf18877d055a40d9d7e78c2a50fe |
File details
Details for the file hpcai-0.1.6-py3-none-any.whl.
File metadata
- Download URL: hpcai-0.1.6-py3-none-any.whl
- Upload date:
- Size: 183.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d7f11e73e1ffac63ed8ca15be484210d5e7d3bcabbb6fc32204fe225c0aa09a0 |
| MD5 | fe022993a7177dd5ae8a786cc10c8f0c |
| BLAKE2b-256 | ff113bcd8f6b1a81bcca20b0558ec0bd85ded56027d5f1037383cbbf5c7c08d2 |
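The SHA256 digests listed above can be checked against a downloaded file. The following is a minimal sketch using only the standard library; the function name `sha256_matches` is chosen for illustration and is not part of the SDK or of any packaging tool.

```python
import hashlib
from pathlib import Path


def sha256_matches(file_path: str, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    digest = hashlib.sha256()
    with Path(file_path).open("rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For example, calling `sha256_matches("hpcai-0.1.6.tar.gz", expected)` with `expected` set to the SHA256 value listed for the source distribution should return `True` for an intact download.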