sageLLM backend provider abstraction (CPU/CUDA/Ascend)
sagellm-backend
Protocol Compliance (Mandatory)
- MUST follow Protocol v0.1: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.
Hardware abstraction layer: provides a unified hardware interface for sageLLM (CUDA/Ascend/Kunlun)
Architecture
┌─────────────────────────────────────────────────────────────┐
│ sagellm-core (engine coordination layer) │
│ • BaseEngine, EngineFactory │
│ • CPUEngine, HFCudaEngine │
├─────────────────────────────────────────────────────────────┤
│ sagellm-backend (hardware abstraction layer) ← this repository │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ BackendProvider Interface │ │
│ │ • Stream/Event asynchronous streams │ │
│ │ • KVBlock memory management │ │
│ │ • Collective operations (all_reduce/all_gather) │ │
│ └─────────────────────────────────────────────────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ CUDA │ │ Ascend │ │ Kunlun │ │
│ │ Provider │ │ Provider │ │ Provider │ │
│ └──────────┘ └──────────┘ └──────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Hardware SDK Layer │
│ CUDA/cuDNN/NCCL │ CANN/HCCL │ XPU SDK │ DCU SDK │
└─────────────────────────────────────────────────────────────┘
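Separation of responsibilities (v0.2.0 refactor):
- ✅ This repository handles: hardware abstraction, device management, memory primitives
- ❌ No longer included: BaseEngine, EngineFactory (moved to sagellm-core)
- 🔗 Used by: the engine implementations in sagellm-core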
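To make the layering more concrete, the BackendProvider interface from the diagram could be modelled as a structural `typing.Protocol`. This is only a sketch: `capability`, `kv_block_alloc`, `DType`, and `CapabilityDescriptor` are names that appear in this README, but the stand-in classes, the parameter names, and `ToyProvider` are assumptions made for illustration and do not reflect the actual sagellm-backend definitions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Protocol, runtime_checkable


class DType(Enum):
    """Stand-in for sagellm_backend.DType (only members named in this README)."""
    FP16 = "fp16"
    BF16 = "bf16"
    INT8 = "int8"


@dataclass
class CapabilityDescriptor:
    """Stand-in descriptor; the real class may carry more fields."""
    supported_dtypes: list = field(default_factory=list)


@runtime_checkable
class BackendProvider(Protocol):
    """Hypothetical shape of the provider interface (assumed, not the real one)."""

    def capability(self) -> CapabilityDescriptor: ...

    # The meaning of the first argument is an assumption; the README only
    # shows the call kv_block_alloc(128, DType.FP16).
    def kv_block_alloc(self, size: int, dtype: DType) -> object: ...


class ToyProvider:
    """Toy in-memory provider used only to exercise the Protocol."""

    def capability(self) -> CapabilityDescriptor:
        return CapabilityDescriptor(supported_dtypes=[DType.FP16])

    def kv_block_alloc(self, size: int, dtype: DType) -> object:
        # A plain bytearray stands in for a device-side KV block.
        return bytearray(size)


def is_provider(obj: object) -> bool:
    """Structural check: does obj expose the methods the Protocol names?"""
    return isinstance(obj, BackendProvider)
```

Because the check is structural, an engine in sagellm-core would not need to import any concrete provider class; it would depend only on the method names. Note that `runtime_checkable` verifies that the methods exist, not that their signatures match.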
Features
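- Unified hardware abstraction: a single API that supports multiple hardware backends
- CPU Backend: the default backend for environments without a GPU
- CUDA Support: native CUDA backend implementation
- CPU Support: CPU-only backend implementation
- Capability discovery: query and validate hardware capabilities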
Installation
pip install isagellm-backend
Quick Start
git clone git@github.com:intellistream/sagellm-backend.git
cd sagellm-backend
./quickstart.sh
# Run tests
pytest tests/ -v
Usage Examples
Basic Backend Usage
from sagellm_backend import CPUBackendProvider, DType
# Create backend
backend = CPUBackendProvider()
# Query capabilities
cap = backend.capability()
print(cap.supported_dtypes)
# Allocate KV block
block = backend.kv_block_alloc(128, DType.FP16)
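Before allocating KV blocks, it can be useful to confirm that the requested dtype is among those reported by `capability()`. The helper below is a minimal sketch of that check. `pick_dtype` is a hypothetical function, not part of sagellm-backend, and the `DType` enum here is a local stand-in so the snippet runs without the package installed.

```python
from enum import Enum
from typing import Iterable, Sequence


class DType(Enum):
    """Stand-in for sagellm_backend.DType (only members named in this README)."""
    FP16 = "fp16"
    BF16 = "bf16"
    INT8 = "int8"


def pick_dtype(supported: Iterable[DType], preferred: Sequence[DType]) -> DType:
    """Return the first preferred dtype that the backend reports as supported."""
    supported_set = set(supported)
    for dtype in preferred:
        if dtype in supported_set:
            return dtype
    raise ValueError(
        f"none of {[d.name for d in preferred]} is supported by this backend"
    )
```

With the real package, the first argument would presumably be `backend.capability().supported_dtypes`, and the result would be passed on to `backend.kv_block_alloc(...)`.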
Using with sagellm-core Engines
Note: engine implementations (such as HFCudaEngine) have moved to sagellm-core. Backend now focuses on hardware abstraction.
# Engines now live in sagellm-core
from sagellm_core import HFCudaEngine, EngineFactory
from sagellm_backend import CPUBackendProvider
# Create the backend
backend = CPUBackendProvider()
# Use EngineFactory (from core)
engine = EngineFactory.create(
    backend_type="cpu",
    config={
        "engine_id": "cpu-001",
        "model_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    },
    provider=backend,
)
Extending with New Backends
# Create provider in providers/ directory
# (the import location of CapabilityDescriptor is an assumption)
from sagellm_backend import CapabilityDescriptor, DType

class AscendBackendProvider:
    def capability(self) -> CapabilityDescriptor:
        return CapabilityDescriptor(
            supported_dtypes=[DType.FP16, DType.BF16, DType.INT8],
            # ...
        )

    # Implement other interface methods...
# Register via entry point in pyproject.toml
[project.entry-points."sagellm.backends"]
ascend_cann = "sagellm_backend.providers.ascend:create_ascend_backend"
Documentation
License
Proprietary