sageLLM backend provider abstraction (CPU/CUDA/Ascend)
sagellm-backend
Protocol Compliance (Mandatory)
- MUST follow Protocol v0.1: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.
Hardware abstraction layer: provides a unified hardware interface for sageLLM (CUDA/Ascend/Kunlun)
Architecture Positioning
┌─────────────────────────────────────────────────────────────┐
│ sagellm-core (engine coordination layer) │
│ • LLMEngine (hardware-agnostic unified engine) │
│ • Automatically selects the best backend (cuda > ascend > cpu) │
├─────────────────────────────────────────────────────────────┤
│ sagellm-backend (hardware abstraction layer) ← this repository │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ BackendProvider Interface │ │
│ │ • Stream/Event asynchronous streams │ │
│ │ • KVBlock memory management │ │
│ │ • Collective operations (all_reduce/all_gather) │ │
│ └─────────────────────────────────────────────────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ CUDA │ │ Ascend │ │ Kunlun │ │
│ │ Provider │ │ Provider │ │ Provider │ │
│ └──────────┘ └──────────┘ └──────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Hardware SDK Layer │
│ CUDA/cuDNN/NCCL │ CANN/HCCL │ XPU SDK │ DCU SDK │
└─────────────────────────────────────────────────────────────┘
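Separation of responsibilities (v0.2.0 refactor):
- ✅ This repository is responsible for: hardware abstraction, device management, memory primitives
- ❌ No longer included: BaseEngine, EngineFactory (moved to sagellm-core)
- 🔗 Used by: the engine implementations in sagellm-core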
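Because the engine lives in sagellm-core and only the hardware primitives live here, engine code can be written against an interface rather than a concrete backend. The following is a minimal sketch of that idea using `typing.Protocol`. Only the method names `capability` and `kv_block_alloc` and the attribute `supported_dtypes` are taken from the usage examples in this README; `ToyDType`, `ToyCapability`, `ToyBlock`, `ProviderLike`, `ToyCPUProvider`, and the parameter name `size` are hypothetical stand-ins, not the actual sagellm-backend interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol, runtime_checkable


class ToyDType(Enum):
    """Hypothetical stand-in for sagellm_backend.DType."""
    FP16 = "fp16"
    BF16 = "bf16"


@dataclass(frozen=True)
class ToyCapability:
    """Hypothetical stand-in for a capability descriptor."""
    supported_dtypes: tuple


@dataclass
class ToyBlock:
    """Hypothetical KV block handle. What `size` measures is an assumption."""
    size: int
    dtype: ToyDType


@runtime_checkable
class ProviderLike(Protocol):
    """The two provider methods that this README's examples call."""

    def capability(self) -> ToyCapability: ...

    def kv_block_alloc(self, size: int, dtype: ToyDType) -> ToyBlock: ...


class ToyCPUProvider:
    """In-memory provider. It satisfies ProviderLike structurally, without inheritance."""

    def capability(self) -> ToyCapability:
        return ToyCapability(supported_dtypes=(ToyDType.FP16, ToyDType.BF16))

    def kv_block_alloc(self, size: int, dtype: ToyDType) -> ToyBlock:
        return ToyBlock(size=size, dtype=dtype)


def engine_side_alloc(provider: ProviderLike, size: int) -> ToyBlock:
    """Engine-side code depends only on the interface, not on a specific backend."""
    return provider.kv_block_alloc(size, ToyDType.FP16)
```

Any object exposing these two methods passes `isinstance(obj, ProviderLike)`, so a new hardware provider would not need to change engine-side code in this sketch.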
Features
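- Unified hardware abstraction: a single API across multiple hardware backends
- CPU backend: CPU-only implementation, the default backend in environments without a GPU
- CUDA support: native CUDA backend implementation
- Capability discovery: query and validate hardware capabilities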
Installation
pip install isagellm-backend
Quick Start
git clone git@github.com:intellistream/sagellm-backend.git
cd sagellm-backend
./quickstart.sh
# Run tests
pytest tests/ -v
Usage Examples
Basic Backend Usage
from sagellm_backend import CPUBackendProvider, DType
# Create backend
backend = CPUBackendProvider()
# Query capabilities
cap = backend.capability()
print(cap.supported_dtypes)
# Allocate KV block
block = backend.kv_block_alloc(128, DType.FP16)
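Capability discovery is most useful as a guard before allocation. The helper below is a sketch under assumptions: it relies only on `capability().supported_dtypes` and `kv_block_alloc(...)` as used above, the name `alloc_if_supported` is hypothetical, and the stand-in backend built from `SimpleNamespace` exists only so the snippet runs without sagellm-backend installed. Whether the real `kv_block_alloc` already rejects unsupported dtypes on its own is not stated in this README.

```python
from types import SimpleNamespace


def alloc_if_supported(backend, size, dtype):
    """Allocate a KV block only if the backend reports support for `dtype`."""
    supported = backend.capability().supported_dtypes
    if dtype not in supported:
        raise ValueError(
            f"dtype {dtype!r} is not supported; available: {list(supported)!r}"
        )
    return backend.kv_block_alloc(size, dtype)


# Stand-in backend for demonstration only (not a real provider).
fake_backend = SimpleNamespace(
    capability=lambda: SimpleNamespace(supported_dtypes=["fp16", "bf16"]),
    kv_block_alloc=lambda size, dtype: {"size": size, "dtype": dtype},
)

block = alloc_if_supported(fake_backend, 128, "fp16")
```

With a real provider, the same helper would be called as `alloc_if_supported(backend, 128, DType.FP16)`.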
Using with sagellm-core LLMEngine
The backend now focuses on hardware abstraction; for the engine, use LLMEngine from sagellm-core.
import asyncio

# LLMEngine lives in sagellm-core
from sagellm_core import LLMEngine, LLMEngineConfig

async def main():
    # LLMEngine automatically selects the best backend
    config = LLMEngineConfig(
        model_path="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        backend_type="auto",  # automatic selection: cuda > ascend > cpu
        max_new_tokens=100,
    )
    engine = LLMEngine(config)
    await engine.start()
    # Inference
    output = await engine.generate("Hello, world!")
    print(output)
    await engine.stop()

asyncio.run(main())
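The `backend_type="auto"` setting is described as choosing in the order cuda > ascend > cpu. How sagellm-core implements this is not shown here; the sketch below illustrates one plausible way to resolve such a priority list. The function `resolve_backend` and the `probes` mapping of availability checks are hypothetical names introduced for illustration.

```python
from typing import Callable, Mapping, Sequence

# Priority order as stated in this README.
PRIORITY: Sequence[str] = ("cuda", "ascend", "cpu")


def resolve_backend(
    requested: str,
    probes: Mapping[str, Callable[[], bool]],
    priority: Sequence[str] = PRIORITY,
) -> str:
    """Return `requested` unchanged, or the first available backend when it is "auto"."""
    if requested != "auto":
        return requested
    for name in priority:
        probe = probes.get(name)
        # A backend counts as available only if a probe exists and returns True.
        if probe is not None and probe():
            return name
    raise RuntimeError(f"no backend available among {list(priority)!r}")


# Example: a machine without CUDA but with an Ascend device.
chosen = resolve_backend(
    "auto",
    {"cuda": lambda: False, "ascend": lambda: True, "cpu": lambda: True},
)
```

Keeping the probes as plain callables means each hardware SDK is only touched when its turn in the priority list comes up.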
Extending with New Backends
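# Create the provider in the providers/ directory.
# DType is exported by sagellm_backend (see Basic Backend Usage); importing
# CapabilityDescriptor from the same package is an assumption.
from sagellm_backend import CapabilityDescriptor, DType

class AscendBackendProvider:
    def capability(self) -> CapabilityDescriptor:
        return CapabilityDescriptor(
            supported_dtypes=[DType.FP16, DType.BF16, DType.INT8],
            # ...
        )

    # Implement other interface methods...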
# Register via entry point in pyproject.toml
[project.entry-points."sagellm.backends"]
ascend_cann = "sagellm_backend.providers.ascend:create_ascend_backend"
Documentation
License
Proprietary