sageLLM Control Plane - Intelligent request routing, scheduling, and engine lifecycle management
sageLLM Control Plane
Protocol Compliance (Mandatory)
- MUST follow Protocol v0.1.
- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.
Intelligent request routing, scheduling, and engine lifecycle management for sageLLM.
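Responsibilities
The Control Plane sits between users (or the Gateway) and the execution engines. It is responsible for:
- Engine registration and health management
- Request routing and scheduling, including prefill/decode (PD) disaggregation scenarios
- Load balancing and basic scaling hooks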
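Dependencies
PyPI package name: isagellm-control-plane | Import namespace: sagellm_control
Dependencies (pyproject.toml is authoritative):
- isagellm-protocol>=0.4.0.0,<0.5.0
- pydantic>=2.0.0
- httpx>=0.24.0
Optional (for running engines directly in-process):
- isagellm-core (provides LLMEngine)
Used by (examples):
- sagellm (unified entry point/CLI)
- sagellm-gateway (API gateway)
Installation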
pip install isagellm-control-plane
Requirements: Python 3.11+
Quick Start (CPU-first, runnable)
import asyncio

from sagellm_control import ControlPlaneManager, EngineState, ExecutionInstanceType
from sagellm_protocol import Request


async def main() -> None:
    cp = ControlPlaneManager(
        scheduling_policy="fifo", routing_strategy="least_loaded", mode="local"
    )

    # Register an engine and mark it ready so it can receive requests.
    cp.register_engine(
        engine_id="engine-001",
        model_id="Qwen2-7B",
        host="localhost",
        port=8001,
        engine_kind="llm",
        metadata={"instance_type": ExecutionInstanceType.GENERAL.value},
    )
    cp.update_engine_state("engine-001", EngineState.READY)

    req = Request(
        request_id="req-001",
        trace_id="trace-001",
        model="Qwen2-7B",
        prompt="Hello",
        max_tokens=16,
        stream=False,
    )

    # Ask the control plane for a scheduling decision for this request.
    decision = await cp.schedule_request(
        request_id=req.request_id,
        trace_id=req.trace_id,
        model_id=req.model,
        prompt=req.prompt,
        max_tokens=req.max_tokens,
    )
    print(decision)

    cp.unregister_engine("engine-001")


if __name__ == "__main__":
    asyncio.run(main())
See examples/mvp_integration_demo.py for the full demo.
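Using the Scheduler IR Module
The Control Plane includes a Scheduler IR (Intermediate Representation) module that expresses request scheduling as a graph that can be optimized:
- IRBuilder: builds a SchedulerIR from a request (Task/Prefill/Decode nodes plus dependency edges)
- IROptimizer: runs pluggable optimization passes (such as KVReusePass and ComputeCommOverlapPass)
- DefaultIRExecutor: translates the IR into execution commands and runs them through the Control Plane/Engine Client
All three can be imported directly from the root package: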
from sagellm_control import IRBuilder, IROptimizer, DefaultIRExecutor
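To make the "nodes plus dependency edges" idea concrete, the sketch below models a request as a small dependency graph and derives an execution order from it. This is not the package's IRBuilder or DefaultIRExecutor: `Node`, `build_ir`, and `execution_order` are hypothetical names, and the real SchedulerIR types may look quite different. Only the standard library is used.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Node:
    # Hypothetical node type; the real SchedulerIR node classes may differ
    node_id: str
    kind: str  # "task", "prefill", or "decode"
    deps: list[str] = field(default_factory=list)  # ids this node waits on


def build_ir(request_id: str) -> dict[str, Node]:
    """Build a three-node graph for one request: task -> prefill -> decode."""
    task = Node(f"{request_id}/task", "task")
    prefill = Node(f"{request_id}/prefill", "prefill", deps=[task.node_id])
    decode = Node(f"{request_id}/decode", "decode", deps=[prefill.node_id])
    return {n.node_id: n for n in (task, prefill, decode)}


def execution_order(ir: dict[str, Node]) -> list[str]:
    """Order node ids so that every node comes after its dependencies."""
    sorter = TopologicalSorter({nid: node.deps for nid, node in ir.items()})
    return list(sorter.static_order())


ir = build_ir("req-001")
order = execution_order(ir)
print(order)
```

Because decode depends on prefill and prefill on the task node, the printed order is task, prefill, decode. An optimization pass in this picture would be a function that rewrites the graph (for example, dropping a prefill node whose KV cache can be reused) before the order is computed.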
Example programs (at least three):
- Basic construction: examples/ir_basic_example.py
- Optimization flow: examples/ir_optimization_example.py
- KV-aware scenario: examples/ir_kv_aware_example.py
Run the examples:
python examples/ir_basic_example.py
python examples/ir_optimization_example.py
python examples/ir_kv_aware_example.py
API Reference (Core Interfaces)
- ControlPlaneManager
  - register_engine() / unregister_engine() / list_engines()
  - schedule_request()
  - execute_request() / stream_request()
  - get_embeddings()
- EngineClient (calls execution engines over HTTP)
- LocalEngineClient (calls the engine directly in-process)
- Scheduler IR: SchedulerIR, IRBuilder, IROptimizer, DefaultIRExecutor
- SchedulingPolicy and the built-in policies (FIFOPolicy, PriorityPolicy, SLOAwarePolicy, AdaptivePolicy, KVAwareSchedulingPolicy)
- Key types: EngineInfo, EngineState, SchedulingDecision, RequestPriority, RequestType
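The difference between a FIFO policy and a priority policy can be sketched with two small queues. This is only an illustration of the general technique, not the package's FIFOPolicy or PriorityPolicy: the class names, the `push`/`pop` methods, and the convention that a lower number means higher priority are all assumptions made for this example.

```python
import heapq
import itertools
from collections import deque


class FifoQueue:
    """Hypothetical FIFO policy: requests leave in arrival order."""

    def __init__(self) -> None:
        self._items: deque[str] = deque()

    def push(self, request_id: str, priority: int = 0) -> None:
        self._items.append(request_id)  # priority is ignored

    def pop(self) -> str:
        return self._items.popleft()


class PriorityQueue:
    """Hypothetical priority policy: lower number first, ties in arrival order."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str]] = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def push(self, request_id: str, priority: int = 0) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), request_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]


def drain(queue, arrivals: list[tuple[str, int]]) -> list[str]:
    """Push every arrival, then pop them all and return the service order."""
    for request_id, priority in arrivals:
        queue.push(request_id, priority)
    return [queue.pop() for _ in arrivals]


arrivals = [("req-a", 2), ("req-b", 0), ("req-c", 2), ("req-d", 1)]
fifo_order = drain(FifoQueue(), arrivals)
priority_order = drain(PriorityQueue(), arrivals)
print(fifo_order)
print(priority_order)
```

With these arrivals, the FIFO queue serves `req-a, req-b, req-c, req-d`, while the priority queue serves `req-b, req-d, req-a, req-c`. The sequence counter keeps equal-priority requests in arrival order, which avoids starving earlier requests within the same priority class.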
Architecture Diagram
flowchart LR
A[Gateway/Client] --> B[ControlPlaneManager]
B --> C[SchedulingPolicy]
B --> D[RequestRouter/LoadBalancer]
B --> E[EngineLifecycleManager]
D --> F["EngineClient (HTTP)"]
B --> G[LocalEngineClient]
F --> H[Execution Engines]
G --> H
Code Structure
sagellm_control/
├── types.py # Core data types (EngineInfo, SchedulingDecision, etc.)
├── policies/ # Scheduling policies (FIFO, Priority, SLO-aware, Adaptive)
├── router.py # Request routing and load balancing
├── lifecycle.py # Engine lifecycle management
├── scaling.py # Scaling manager (MVP hooks)
├── engine_client.py # HTTP client to engines
├── local_engine_client.py # Local (in-process) engine client
├── ir/ # Scheduler IR (types/builder/optimizer/executor)
└── manager.py # ControlPlaneManager
Development Guide
git clone git@github.com:intellistream/sagellm-control-plane.git
cd sagellm-control-plane
./quickstart.sh --dev
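Quickstart modes:
- --standard: dependencies are installed from PyPI first, and the current repository is installed locally in editable mode (stable/release-oriented)
- --dev: builds on standard and automatically tries to switch adjacent local sub-repositories to editable installs (overriding with --no-deps)
- Before each install, previously installed isagellm-* packages are cleaned up dynamically so that the process is re-entrant
# Standard mode (stable dependencies by default)
./quickstart.sh --standard
# Development/integration mode (local editable overrides)
./quickstart.sh --dev
# Show help
./quickstart.sh --help
# Or install manually
pip install -e ".[dev]"
Run the tests:
pytest tests/ -v
Lint/format:
ruff format .
ruff check . --fix
Contribution workflow:
- Create an Issue
- Develop on a fix/#123-xxx branch
- Open a PR against main-dev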
Version Information
- Current version: 0.5.0.0
- Changelog: CHANGELOG.md
Related Repositories
- sagellm - umbrella package + CLI
- sagellm-protocol - protocol definitions
- sagellm-backend - backend abstraction
- sagellm-gateway - API gateway
License
Proprietary - IntelliStream