sageLLM backend provider abstraction (CPU/CUDA/Ascend)
sagellm-backend
Protocol Compliance (Mandatory)
- MUST follow Protocol v0.1: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.
Hardware abstraction layer: provides sageLLM with a unified hardware interface (CUDA/Ascend/Kunlun)
Architecture Positioning
┌─────────────────────────────────────────────────────────────┐
│ sagellm-core (engine coordination layer) │
│ • BaseEngine, EngineFactory │
│ • CPUEngine, HFCudaEngine │
├─────────────────────────────────────────────────────────────┤
│ sagellm-backend (hardware abstraction layer) ← this repository │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ BackendProvider Interface │ │
│ │ • Stream/Event async streams │ │
│ │ • KVBlock memory management │ │
│ │ • Collective operations (all_reduce/all_gather) │ │
│ └─────────────────────────────────────────────────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ CUDA │ │ Ascend │ │ Kunlun │ │
│ │ Provider │ │ Provider │ │ Provider │ │
│ └──────────┘ └──────────┘ └──────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Hardware SDK Layer │
│ CUDA/cuDNN/NCCL │ CANN/HCCL │ XPU SDK │ DCU SDK │
└─────────────────────────────────────────────────────────────┘
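Separation of responsibilities (v0.2.0 refactor):
- ✅ This repository owns: hardware abstraction, device management, memory primitives
- ❌ No longer included: BaseEngine, EngineFactory (moved to sagellm-core)
- 🔗 Used by: the engine implementations in sagellm-core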
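To make the split concrete, here is a minimal sketch of what a provider-shaped interface could look like. It is not the actual `BackendProvider` definition: `DType`, `CapabilityDescriptor`, `BackendProviderLike`, and `ToyProvider` are hypothetical stand-ins defined locally so the snippet runs without sagellm-backend installed, and only the two methods that appear in this README (`capability` and `kv_block_alloc`) are modelled. The meaning of the first `kv_block_alloc` argument is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol, runtime_checkable


class DType(Enum):
    """Stand-in for sagellm_backend.DType (member values are assumptions)."""
    FP16 = "fp16"
    BF16 = "bf16"
    INT8 = "int8"


@dataclass(frozen=True)
class CapabilityDescriptor:
    """Stand-in descriptor; only supported_dtypes is mentioned in this README."""
    supported_dtypes: tuple


@runtime_checkable
class BackendProviderLike(Protocol):
    """Hypothetical structural interface covering the methods shown in this README."""

    def capability(self) -> CapabilityDescriptor: ...

    def kv_block_alloc(self, size: int, dtype: DType) -> object: ...


class ToyProvider:
    """Toy provider used only to exercise the interface."""

    def capability(self) -> CapabilityDescriptor:
        return CapabilityDescriptor(supported_dtypes=(DType.FP16,))

    def kv_block_alloc(self, size: int, dtype: DType) -> object:
        # Assumption: size counts elements and every element takes 2 bytes.
        return bytearray(size * 2)


def describe(provider: BackendProviderLike) -> list[str]:
    """Return the sorted names of the dtypes a provider reports."""
    return sorted(d.name for d in provider.capability().supported_dtypes)
```

An engine living in sagellm-core would depend only on such an interface, so swapping CUDA for Ascend would not touch engine code.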
Features
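- Unified hardware abstraction: a single API that supports multiple hardware backends
- CPU Backend: the default backend for environments without a GPU
- CUDA Support: native CUDA backend implementation
- CPU Support: CPU-only backend implementation
- Capability discovery: query and validate hardware capabilities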
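Capability discovery is typically used to choose a configuration the hardware can run. The sketch below shows one possible pattern: walk a preference list and take the first dtype the backend reports as supported. `DType`, `Capability`, and `pick_dtype` are hypothetical names defined here for illustration; in real code the descriptor would come from `backend.capability()`.

```python
from dataclasses import dataclass
from enum import Enum


class DType(Enum):
    """Stand-in for sagellm_backend.DType (member values are assumptions)."""
    FP16 = "fp16"
    BF16 = "bf16"
    INT8 = "int8"


@dataclass(frozen=True)
class Capability:
    """Stand-in for the descriptor returned by backend.capability()."""
    supported_dtypes: frozenset


def pick_dtype(cap: Capability, preferred: list) -> DType:
    """Return the first preferred dtype that the backend supports."""
    for dtype in preferred:
        if dtype in cap.supported_dtypes:
            return dtype
    names = ", ".join(d.name for d in preferred)
    raise ValueError(f"none of the requested dtypes is supported: {names}")


cap = Capability(supported_dtypes=frozenset({DType.FP16, DType.INT8}))
chosen = pick_dtype(cap, [DType.BF16, DType.FP16])
```

Failing early with a clear error keeps an unsupported dtype from surfacing later as an allocation or kernel failure.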
Installation
pip install isagellm-backend
Quick Start
git clone git@github.com:intellistream/sagellm-backend.git
cd sagellm-backend
./quickstart.sh
# Run tests
pytest tests/ -v
Usage Examples
Basic Backend Usage
from sagellm_backend import CPUBackendProvider, DType
# Create backend
backend = CPUBackendProvider()
# Query capabilities
cap = backend.capability()
print(cap.supported_dtypes)
# Allocate KV block
block = backend.kv_block_alloc(128, DType.FP16)
Using with sagellm-core Engines
Note: engine implementations (such as HFCudaEngine) have moved to sagellm-core. The backend now focuses on hardware abstraction.
# Engines now live in sagellm-core
from sagellm_core import HFCudaEngine, EngineFactory
from sagellm_backend import CPUBackendProvider
# Create the backend
backend = CPUBackendProvider()
# Use EngineFactory (from core)
engine = EngineFactory.create(
    backend_type="cpu",
    config={
        "engine_id": "cpu-001",
        "model_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    },
    provider=backend,
)
Extending with New Backends
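# Create provider in providers/ directory
# Assumption: CapabilityDescriptor is importable from sagellm_backend alongside DType
from sagellm_backend import CapabilityDescriptor, DType

class AscendBackendProvider:
    def capability(self) -> CapabilityDescriptor:
        return CapabilityDescriptor(
            supported_dtypes=[DType.FP16, DType.BF16, DType.INT8],
            # ...
        )

    # Implement other interface methods...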
# Register via entry point in pyproject.toml
[project.entry-points."sagellm.backends"]
ascend_cann = "sagellm_backend.providers.ascend:create_ascend_backend"
Documentation
License
Proprietary