sageLLM backend provider abstraction (CPU/CUDA/Ascend)
sagellm-backend
Protocol Compliance (Mandatory)
- MUST follow Protocol v0.1: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.
Hardware abstraction layer: provides sageLLM with a unified hardware interface (CUDA/Ascend/Kunlun)
Architecture Positioning
┌─────────────────────────────────────────────────────────────┐
│ sagellm-core (engine coordination layer) │
│ • BaseEngine, EngineFactory │
│ • CPUEngine, HFCudaEngine │
├─────────────────────────────────────────────────────────────┤
│ sagellm-backend (hardware abstraction layer) ← this repository │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ BackendProvider Interface │ │
│ │ • Stream/Event asynchronous streams │ │
│ │ • KVBlock memory management │ │
│ │ • Collective operations (all_reduce/all_gather) │ │
│ └─────────────────────────────────────────────────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ CUDA │ │ Ascend │ │ Kunlun │ │
│ │ Provider │ │ Provider │ │ Provider │ │
│ └──────────┘ └──────────┘ └──────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Hardware SDK Layer │
│ CUDA/cuDNN/NCCL │ CANN/HCCL │ XPU SDK │ DCU SDK │
└─────────────────────────────────────────────────────────────┘
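Separation of responsibilities (v0.2.0 refactor):
- ✅ This repository is responsible for: hardware abstraction, device management, and memory primitives
- ❌ No longer included: BaseEngine, EngineFactory (moved to sagellm-core)
- 🔗 Used by: the engine implementations in sagellm-core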
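The split above can be pictured as an engine that depends only on a provider interface. The following is a minimal sketch, not the real `BackendProvider` definition: `BackendProviderLike`, `ToyProvider`, `KVBlock`, and `engine_allocate` are hypothetical names, `DType` is a stand-in for `sagellm_backend.DType`, and the meaning of the first `kv_block_alloc` argument (called `size` here) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol, runtime_checkable


class DType(Enum):
    # Stand-in for sagellm_backend.DType; the real enum may differ.
    FP16 = "fp16"
    BF16 = "bf16"
    INT8 = "int8"


@dataclass
class KVBlock:
    # Hypothetical block handle; the real type is not shown in this README.
    size: int
    dtype: DType


@runtime_checkable
class BackendProviderLike(Protocol):
    """Hypothetical provider shape, based only on the calls used in this README."""

    def capability(self) -> object: ...

    def kv_block_alloc(self, size: int, dtype: DType) -> KVBlock: ...


class ToyProvider:
    """In-memory provider used only to illustrate the layering."""

    def capability(self) -> object:
        return {"supported_dtypes": [DType.FP16]}

    def kv_block_alloc(self, size: int, dtype: DType) -> KVBlock:
        return KVBlock(size=size, dtype=dtype)


def engine_allocate(provider: BackendProviderLike, size: int) -> KVBlock:
    # Engine-side code (sagellm-core) talks to the interface, never to a hardware SDK.
    return provider.kv_block_alloc(size, DType.FP16)


block = engine_allocate(ToyProvider(), 128)
```

Because the engine function accepts anything that satisfies the protocol, swapping in a CUDA or Ascend provider would not require touching engine code, which appears to be the intent of the v0.2.0 refactor.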
Features
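- Unified hardware abstraction: a single API supports multiple hardware backends
- CPU Backend: the default backend for environments without a GPU
- CUDA Support: native CUDA backend implementation
- CPU Support: CPU-only backend implementation
- Capability discovery: query and validate hardware capabilities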
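Capability discovery could be used to validate a requested dtype before allocating memory. The sketch below uses stand-in `DType` and `CapabilityDescriptor` classes so that it runs without the package installed; only the `supported_dtypes` field comes from this README, and `require_dtype` and `pick_dtype` are hypothetical helpers, not part of the package API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Sequence


class DType(Enum):
    # Stand-in for sagellm_backend.DType.
    FP16 = "fp16"
    BF16 = "bf16"
    INT8 = "int8"


@dataclass
class CapabilityDescriptor:
    # Stand-in; only supported_dtypes appears in this README.
    supported_dtypes: List[DType] = field(default_factory=list)


def require_dtype(cap: CapabilityDescriptor, dtype: DType) -> DType:
    """Raise early if the backend cannot handle the requested dtype."""
    if dtype not in cap.supported_dtypes:
        raise ValueError(f"{dtype.name} is not supported by this backend")
    return dtype


def pick_dtype(cap: CapabilityDescriptor, preferred: Sequence[DType]) -> Optional[DType]:
    """Return the first preferred dtype the backend supports, or None."""
    for dtype in preferred:
        if dtype in cap.supported_dtypes:
            return dtype
    return None


cap = CapabilityDescriptor(supported_dtypes=[DType.FP16, DType.INT8])
chosen = pick_dtype(cap, [DType.BF16, DType.FP16])
```

With the real package, `cap` would come from `backend.capability()` instead of being constructed by hand.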
Installation
pip install isagellm-backend
Quick Start
git clone git@github.com:intellistream/sagellm-backend.git
cd sagellm-backend
./quickstart.sh
# Run tests
pytest tests/ -v
Usage Examples
Basic Backend Usage
from sagellm_backend import CPUBackendProvider, DType
# Create backend
backend = CPUBackendProvider()
# Query capabilities
cap = backend.capability()
print(cap.supported_dtypes)
# Allocate KV block
block = backend.kv_block_alloc(128, DType.FP16)
Using with sagellm-core Engines
Note: engine implementations (such as HFCudaEngine) have moved to sagellm-core. The backend now focuses on hardware abstraction.
# Engines now live in sagellm-core
from sagellm_core import HFCudaEngine, EngineFactory
from sagellm_backend import CPUBackendProvider
# Create the backend
backend = CPUBackendProvider()
# Use EngineFactory (from core)
engine = EngineFactory.create(
    backend_type="cpu",
    config={
        "engine_id": "cpu-001",
        "model_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    },
    provider=backend,
)
Extending with New Backends
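# Create provider in providers/ directory
# Assumption: CapabilityDescriptor is exported from sagellm_backend alongside DType
from sagellm_backend import CapabilityDescriptor, DType

class AscendBackendProvider:
    def capability(self) -> CapabilityDescriptor:
        return CapabilityDescriptor(
            supported_dtypes=[DType.FP16, DType.BF16, DType.INT8],
            # ...
        )
    # Implement other interface methods...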
# Register via entry point in pyproject.toml
[project.entry-points."sagellm.backends"]
ascend_cann = "sagellm_backend.providers.ascend:create_ascend_backend"
Documentation
License
Proprietary