sageLLM core runtime with PD separation (MVP)

sagellm-core

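sageLLM Core is a hardware-agnostic LLM inference engine. It provides a unified inference interface (generate, stream, execute), automatic backend selection (CPU/CUDA/Ascend), a built-in decoding strategy system, and hybrid-mode execution with PD (prefill/decode) separation.

Version: 0.4.0.17 | Last updated: 2026-02-02 | Protocol compliance: Protocol v0.1

📍 Role and Responsibilities

Where sagellm-core sits in the overall sageLLM architecture, and what it is responsible for: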

┌─────────────────────────────────────────────────────────────┐
│                  Application Layer                          │
│        (sagellm-gateway, sagellm-control-plane)             │
└────────────────┬────────────────────────────────────────────┘
                 │
┌────────────────┴────────────────────────────────────────────┐
│  sagellm-core (this repository)                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  LLMEngine: hardware-agnostic unified entry point    │   │
│  │  • generate() / stream() / execute()                 │   │
│  │  • Automatic backend selection (cpu/cuda/ascend)     │   │
│  │  • Continuous batching scheduler                     │   │
│  │  • Decoding strategies (Greedy/Sampling/BeamSearch)  │   │
│  │  • Hybrid-mode execution with PD separation          │   │
│  └──────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────┤
│  Core dependencies (L1 layer)                               │
│  ├─ sagellm-backend: hardware abstraction, device mgmt      │
│  ├─ sagellm-comm: comm hardware, TP/PP communication        │
│  ├─ sagellm-kv-cache: KV cache mgmt, eviction policies      │
│  └─ sagellm-protocol: data structures, error definitions    │
└─────────────────────────────────────────────────────────────┘

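Responsibility boundaries:

✅ Core owns: LLMEngine, scheduling, inference orchestration, decoding strategies
✅ Backend owns: hardware abstraction, device management, operators/kernels
✅ Comm owns: communication hardware abstraction, collective operations, topology management
✅ Protocol owns: globally shared data structures, error codes, ID schemes

Performance Main-Path Boundaries

The official execution chain for performance optimization on domestic (Chinese-made) hardware is: scheduler -> executor -> worker -> model_runner -> backend attention/kernel.

The shared stream may serve only as the admission, telemetry, and event fan-out layer; it must not own independent decode semantics in the long term.
batch_type, block_tables, slot_mapping, and context_lens must be carried from the scheduler all the way through to the execution layer.
A vLLM or HuggingFace fallback is not the long-term performance direction for sagellm-core.

See the specification: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/performance_mainline_architecture.md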
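
The requirement that scheduling metadata survives every hop of the chain can be pictured with a small sketch. Everything in it is hypothetical: `BatchMetadata`, the stage functions, the field types, and the sample values are illustrative stand-ins, not sagellm-core's actual classes. Only the four field names and the stage order come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BatchMetadata:
    """Hypothetical container for the fields the scheduler must hand down."""
    batch_type: str                 # e.g. "prefill" or "decode" (assumed values)
    block_tables: List[List[int]]   # per-sequence KV block ids
    slot_mapping: List[int]         # flat KV slot index for each new token
    context_lens: List[int]         # tokens already in context per sequence


def backend_attention(meta: BatchMetadata) -> Dict[str, object]:
    # A real kernel would consume these fields; here we only report them.
    return {
        "batch_type": meta.batch_type,
        "num_seqs": len(meta.context_lens),
        "num_slots": len(meta.slot_mapping),
    }


def model_runner(meta: BatchMetadata) -> Dict[str, object]:
    return backend_attention(meta)


def worker(meta: BatchMetadata) -> Dict[str, object]:
    return model_runner(meta)


def executor(meta: BatchMetadata) -> Dict[str, object]:
    return worker(meta)


def scheduler_step() -> Dict[str, object]:
    # The scheduler is the only stage that builds the metadata;
    # every later stage receives the same immutable object.
    meta = BatchMetadata(
        batch_type="decode",
        block_tables=[[0, 1], [2]],
        slot_mapping=[17, 33],
        context_lens=[17, 1],
    )
    return executor(meta)


result = scheduler_step()
print(result)
```

Making the dataclass frozen is one way to keep an intermediate stage from silently rewriting the scheduler's decisions.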

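✨ Core Features

Feature	Description
Unified inference interface	`generate()` / `stream()` / `execute()`: non-streaming, streaming, and protocol-compatible entry points
Hardware-agnostic	CPU/CUDA/Ascend, with automatic detection and selection
Decoding strategy system	Greedy, Sampling, Beam Search, Contrastive Decoding
Continuous Batching	Dynamic batching to keep the hardware fully utilized
PD-separated execution	Prefill and Decode phases run separately; hybrid mode supported
Configuration-driven	YAML/JSON configuration, validated with Pydantic v2
HTTP Server	FastAPI implementation with SSE streaming
CPU-First	Fully supports GPU-free environments, which simplifies testing and development
Type safety	Complete Python type annotations; Mypy supported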
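
As a rough illustration of the PD-separated execution row, the sketch below splits generation into a prefill phase that processes the whole prompt once and a decode phase that extends it one token at a time. The `toy_next_token` model and the `KVState` class are hypothetical stand-ins; the real runtime and pd_executor modules are not shown in this README.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class KVState:
    """Hypothetical stand-in for a KV cache: it just remembers the tokens seen."""
    tokens: List[int] = field(default_factory=list)


def toy_next_token(state: KVState) -> int:
    # Toy "model": the next token is the sum of cached tokens modulo 100.
    return sum(state.tokens) % 100


def prefill(prompt_tokens: List[int]) -> Tuple[KVState, int]:
    """Process the full prompt in one pass and emit the first new token."""
    state = KVState(tokens=list(prompt_tokens))
    return state, toy_next_token(state)


def decode(state: KVState, first_token: int, max_new_tokens: int) -> List[int]:
    """Extend the sequence one token per step, reusing the cached state."""
    output = [first_token]
    state.tokens.append(first_token)
    while len(output) < max_new_tokens:
        token = toy_next_token(state)
        state.tokens.append(token)
        output.append(token)
    return output


# The two phases could run on different workers; only KVState crosses over.
kv_state, first = prefill([1, 2, 3])
generated = decode(kv_state, first, max_new_tokens=4)
print(generated)
```

With the toy model this prints `[6, 12, 24, 48]`. The point is the hand-off: decode never revisits the prompt, it only receives the state prefill produced.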

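📦 Dependencies

Core dependencies (installed automatically)

isagellm-protocol>=0.4.0.0,<0.5.0  # protocol definitions
isagellm-backend>=0.4.0.0,<0.5.0   # hardware abstraction
isagellm-comm>=0.4.0.0,<0.5.0      # communication backend
isagellm-kv-cache>=0.4.0.0,<0.5.0  # KV cache management

# Framework dependencies
pydantic>=2.0.0      # data validation
pyyaml>=6.0.0        # configuration parsing
torch>=2.0.0         # tensor computation
transformers>=4.35.0 # model loading
fastapi>=0.100.0     # HTTP service

Who depends on this package

🔵 sagellm-control-plane: uses Core for request scheduling and load balancing
🟡 sagellm-compression: built on top of Core's model execution layer
🟢 sagellm-gateway: provides the OpenAI-compatible API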

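🚀 Installation Guide

Install from PyPI (recommended)

# Install the version this README documents
pip install isagellm-core==0.4.0.17

# Install within a version range
pip install "isagellm-core>=0.4.0.0,<0.5.0"

Local development install

# Clone the repository
git clone https://github.com/intellistream/sagellm-core.git
cd sagellm-core

# Option 1: one-step install (recommended)
# Equivalent to --dev by default
./quickstart.sh

# Standard mode (stable/release-oriented): dependencies come from PyPI first
./quickstart.sh --standard

# Dev mode (for cross-repo integration): install from PyPI first, then override with editable installs of local sibling repositories where available
./quickstart.sh --dev

# Show help
./quickstart.sh --help

Current quickstart.sh semantics:

--standard: installs the PyPI baseline dependencies, plus an editable install of the current repository
--dev: on top of --standard, runs pip install -e <repo> --no-deps for each of sagellm-protocol / sagellm-backend / sagellm-comm / sagellm-kv-cache that exists locally, overriding the PyPI version
Before installing, already-installed isagellm-* packages are cleaned up dynamically (pass --skip-cleanup to skip this)
On failure, detailed diagnostic logs are printed to help locate the installation problem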

Linking local dependencies (for local multi-package development)

# If you are also developing backend/protocol/comm, use the local versions
pip install -e ../sagellm-protocol
pip install -e ../sagellm-backend
pip install -e ../sagellm-comm
pip install -e ../sagellm-kv-cache
pip install -e ".[dev]"

Verify the installation

# Check the package version
python -c "import sagellm_core; print(sagellm_core.__version__)"

# Run the quick smoke test
pytest tests/test_ci_smoke.py -v

🎯 Quick Start

1. Basic inference

import asyncio

from sagellm_core import LLMEngine, LLMEngineConfig

# Create the configuration
config = LLMEngineConfig(
    model_path="sshleifer/tiny-gpt2",  # HuggingFace model name or local path
    backend_type="cpu",  # pinned to CPU; use "auto" to pick cpu/cuda/ascend automatically
    max_new_tokens=20,
)

# Initialize the engine
engine = LLMEngine(config)


async def main():
    await engine.start()

    # Non-streaming generation (returns the complete output)
    response = await engine.generate("Hello, world!")
    print(response.output_text)

    # Streaming generation (returns token by token)
    async for event in engine.stream("Once upon a time"):
        if event.event == "delta":
            print(event.chunk, end="", flush=True)

    await engine.stop()


asyncio.run(main())
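
A common pattern on top of the streaming loop is to accumulate the delta chunks into the final text. The sketch below runs without the engine: `FakeStreamEvent` and `fake_stream` are hypothetical stand-ins that mimic only the `event` and `chunk` attributes used in the example above, and the `"end"` event name is an assumption.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class FakeStreamEvent:
    """Stand-in exposing the two attributes the quick start relies on."""
    event: str
    chunk: str = ""


async def fake_stream(prompt: str) -> AsyncIterator[FakeStreamEvent]:
    # Pretend the engine emits one delta per word, then a terminal event.
    for word in ["Once", " upon", " a", " time"]:
        yield FakeStreamEvent(event="delta", chunk=word)
    yield FakeStreamEvent(event="end")  # assumed terminal event name


async def collect_text(events: AsyncIterator[FakeStreamEvent]) -> str:
    """Join every delta chunk and ignore all other event types."""
    parts: List[str] = []
    async for event in events:
        if event.event == "delta":
            parts.append(event.chunk)
    return "".join(parts)


text = asyncio.run(collect_text(fake_stream("ignored")))
print(text)
```

In real use, `collect_text(engine.stream(prompt))` would presumably play the same role, provided the engine's events expose the same two attributes.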

2. Controlling generation with sampling parameters

import asyncio

from sagellm_core import LLMEngine, LLMEngineConfig
from sagellm_protocol.sampling import SamplingParams, DecodingStrategy

async def main():
    config = LLMEngineConfig(model_path="sshleifer/tiny-gpt2")
    engine = LLMEngine(config)
    await engine.start()

    prompt = "The future of AI is"

    # Deterministic output (greedy)
    response = await engine.generate(
        prompt,
        sampling_params=SamplingParams(
            strategy=DecodingStrategy.GREEDY,
            max_tokens=20
        )
    )
    print(f"Greedy: {response.output_text}")

    # Random sampling (controlled by temperature)
    response = await engine.generate(
        prompt,
        sampling_params=SamplingParams(
            strategy=DecodingStrategy.SAMPLING,
            temperature=0.7,
            top_p=0.9,
            max_tokens=20
        )
    )
    print(f"Sampling: {response.output_text}")

    await engine.stop()

asyncio.run(main())

3. Run the demo from a YAML configuration file

# Inspect the available configuration
cat examples/config_cpu.yaml

# Run the demo
python -m sagellm_core.demo --config examples/config_cpu.yaml --verbose

# Inspect the output metrics
cat metrics.json

4. Start the HTTP server

# Option 1: command line
sage-engine --host 0.0.0.0 --port 8000

# Option 2: Python API
from sagellm_core import engine_server_app
import uvicorn

uvicorn.run(engine_server_app, host="0.0.0.0", port=8000)

5. HTTP request examples

# Non-streaming inference
curl -X POST http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt2",
    "prompt": "Hello",
    "max_tokens": 20
  }'

# Streaming inference
curl -X POST http://localhost:8000/v1/completions/stream \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt2",
    "prompt": "Hello",
    "max_tokens": 20,
    "stream": true
  }'
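
The same requests can be issued from Python. This sketch uses only the standard library and assumes the server from step 4 is listening on localhost:8000 with the two endpoint paths shown above. The helper names `build_completion_request` and `complete` are made up for illustration, and since the response schema is not documented here, `complete` stops at returning the decoded JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed: server started as in step 4


def build_completion_request(
    prompt: str, max_tokens: int = 20, stream: bool = False
) -> urllib.request.Request:
    """Build a POST request mirroring the curl examples."""
    payload = {"model": "gpt2", "prompt": prompt, "max_tokens": max_tokens}
    path = "/v1/completions"
    if stream:
        payload["stream"] = True
        path = "/v1/completions/stream"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, max_tokens: int = 20) -> dict:
    """Send a non-streaming request; this needs a running server."""
    request = build_completion_request(prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect.
req = build_completion_request("Hello", stream=True)
print(req.get_method(), req.full_url)
```

Separating request construction from sending keeps the payload logic testable without a live server.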

📚 API Reference

LLMEngine - Main Entry Point

Initialization:

LLMEngineConfig(
    model_path: str,                # required: HuggingFace name or local path
    backend_type: str = "auto",     # compute backend: cpu/cuda/ascend/auto
    comm_type: str = "auto",        # communication backend: gloo/nccl/hccl/auto
    max_batch_size: int = 32,       # maximum batch size
    max_model_len: int = 4096,      # maximum sequence length
    max_new_tokens: int = 128,      # maximum tokens generated per request
    tensor_parallel_size: int = 1,  # tensor parallel degree
    pipeline_parallel_size: int = 1, # pipeline parallel degree
    dtype: str = "auto",             # data type: float32/float16/bfloat16/auto
)

Key methods:

async def start() -> None:
    """Start the engine and load the model."""

async def stop() -> None:
    """Stop the engine and release resources."""

async def generate(
    prompt: str | list[int],
    *,
    sampling_params: SamplingParams | None = None,
    max_tokens: int | None = None,
    request_id: str | None = None,
) -> Response:
    """Non-streaming inference; returns the complete output."""

async def stream(
    prompt_or_request: str | Request,
    *,
    max_tokens: int | None = None,
    request_id: str | None = None,
) -> AsyncIterator[StreamEvent]:
    """Streaming inference; yields events token by token."""

async def execute(request: Request) -> Response:
    """Execute a Protocol Request; kept for compatibility with the legacy interface."""

SamplingParams - Sampling Parameters

from sagellm_protocol.sampling import SamplingParams, DecodingStrategy

SamplingParams(
    strategy: DecodingStrategy = DecodingStrategy.GREEDY,
    temperature: float = 0.0,   # higher values are more random
    top_p: float = 1.0,         # nucleus sampling
    top_k: int = 0,             # top-k sampling
    repetition_penalty: float = 1.0,
    length_penalty: float = 1.0,
    num_beams: int = 1,         # beam search width
    max_tokens: int = 128,
    seed: int | None = None,    # for reproducibility
)
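
To show how temperature, top_k, and top_p typically interact, here is a generic, dependency-free sketch of the standard filtering pipeline. It is not sagellm-core's implementation. The function name `sample_token` is invented, and treating `temperature <= 0` as greedy and `top_k == 0` as "no top-k limit" are assumptions, chosen to be consistent with the defaults listed above.

```python
import math
import random
from typing import List, Optional


def sample_token(
    logits: List[float],
    temperature: float = 0.0,
    top_k: int = 0,
    top_p: float = 1.0,
    seed: Optional[int] = None,
) -> int:
    """Pick a token id from raw logits (illustrative, not the library's code)."""
    if temperature <= 0.0:
        # Greedy: the highest logit wins, no randomness involved.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank candidates by probability, then apply the top-k and top-p cut-offs.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    kept: List[int] = []
    cumulative = 0.0
    for idx in ranked:
        kept.append(idx)
        cumulative += probs[idx]
        if cumulative >= top_p:
            break

    # Sample among the survivors; random.choices renormalises the weights.
    rng = random.Random(seed)
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


logits = [2.0, 1.0, 0.1, -1.0]
print(sample_token(logits))                            # greedy pick
print(sample_token(logits, temperature=0.7, top_k=1))  # top_k=1 leaves one candidate
print(sample_token(logits, temperature=1.0, top_p=0.9, seed=0))
```

Passing a seed makes the sampled path repeatable, which is the role the `seed` field plays in the parameter list.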

Other Core APIs

# Configuration loading (legacy)
from sagellm_core import load_config
config = load_config("config.yaml")  # supports YAML/JSON

# Backend creation (legacy)
from sagellm_core import create_backend, BackendConfig
backend = create_backend(BackendConfig(kind="cpu"))

# Factory method (legacy)
from sagellm_core import EngineFactory
factory = EngineFactory()
engine = factory.create("cpu")  # supports auto-discovery

🏗️ Architecture

Layered architecture

┌──────────────────────────────────┐
│    LLMEngine (public API)        │  ← user-facing layer
│  • generate/stream/execute       │
└────────┬─────────────────────────┘
         │
┌────────▼──────────────────────────┐
│    EngineCore (engine core)       │  ← inference coordination layer
│  • Scheduler: Continuous Batching │
│  • Executor: worker management    │
│  • KVCacheManager: cache mgmt     │
└────────┬──────────────────────────┘
         │
┌────────▼──────────────────────────┐
│    Worker & ModelRunner           │  ← execution layer
│  • Forward pass                   │
│  • TP/PP communication            │
│  • Hardware resource management   │
└────────┬──────────────────────────┘
         │
    ┌────┴────┬───────────┬────────────┐
    ▼         ▼           ▼            ▼
  Backend   Comm      KV-Cache     Protocol

Module overview

Module	Path	Responsibility
llm_engine	`src/sagellm_core/llm_engine.py`	Unified inference entry point
engine_core	`src/sagellm_core/engine_core/`	Scheduling and execution coordination
scheduler	`src/sagellm_core/engine_core/scheduler.py`	Continuous Batching
executor	`src/sagellm_core/executor/`	Worker management
worker	`src/sagellm_core/worker/`	Single-device execution
decoding	`src/sagellm_core/decoding/`	5+ decoding strategies
runtime	`src/sagellm_core/runtime.py`	PD-separation runtime
pd_executor	`src/sagellm_core/pd_executor.py`	Prefill/Decode separation
engine_server	`src/sagellm_core/engine_server.py`	HTTP service
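
Continuous batching, the scheduler's job in the table above, means a finished request leaves the batch immediately and a waiting request takes its slot on the next step, instead of the whole batch draining first. The toy scheduler below illustrates only that admission policy. `ToyRequest`, `ToyScheduler`, and the one-token-per-step model are hypothetical and unrelated to the real scheduler.py.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class ToyRequest:
    request_id: str
    tokens_needed: int   # how many decode steps this request needs in total
    tokens_done: int = 0


class ToyScheduler:
    def __init__(self, max_batch_size: int) -> None:
        self.max_batch_size = max_batch_size
        self.waiting: Deque[ToyRequest] = deque()
        self.running: List[ToyRequest] = []

    def add(self, request: ToyRequest) -> None:
        self.waiting.append(request)

    def step(self) -> List[str]:
        """Run one iteration and return the ids of requests that finished in it."""
        # Refill free slots before every step (the "continuous" part).
        while self.waiting and len(self.running) < self.max_batch_size:
            self.running.append(self.waiting.popleft())
        # Pretend each running request generates exactly one token.
        for request in self.running:
            request.tokens_done += 1
        finished = [r for r in self.running if r.tokens_done >= r.tokens_needed]
        self.running = [r for r in self.running if r.tokens_done < r.tokens_needed]
        return [r.request_id for r in finished]


scheduler = ToyScheduler(max_batch_size=2)
for rid, need in [("a", 1), ("b", 3), ("c", 2)]:
    scheduler.add(ToyRequest(rid, need))

timeline = []
while scheduler.waiting or scheduler.running:
    timeline.append(scheduler.step())
print(timeline)
```

The printed timeline is `[['a'], [], ['b', 'c']]`: request c joins in step 2, as soon as a has freed its slot, while b is still mid-generation.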

🔧 Development Guide

Project structure

sagellm-core/
├── src/sagellm_core/           # source code
│   ├── llm_engine.py           # unified inference engine
│   ├── engine_core/            # engine core (scheduling + execution)
│   ├── executor/               # worker executor
│   ├── worker/                 # Worker and ModelRunner
│   ├── decoding/               # decoding strategies (Greedy/Sampling/...)
│   ├── engine_server.py        # HTTP server (FastAPI)
│   ├── config.py               # configuration classes (legacy)
│   ├── factory.py              # factory methods (legacy)
│   ├── runtime.py              # PD-separation runtime
│   ├── pd_executor.py          # PD-separation executor
│   └── ...
├── tests/                      # test cases
│   ├── unit/                   # unit tests
│   ├── integration/            # integration tests
│   ├── e2e/                    # end-to-end tests
│   └── conftest.py             # pytest configuration
├── examples/                   # example code
│   ├── config_cpu.yaml         # CPU configuration example
│   ├── config_cuda.yaml        # CUDA configuration example
│   ├── decoding_strategies_demo.py  # decoding strategies demo
│   ├── pd_separation_demo.py   # PD separation demo
│   └── ...
├── docs/                       # documentation
│   ├── ARCHITECTURE.md         # detailed architecture
│   ├── DECODING_STRATEGIES.md  # decoding strategies guide
│   └── ...
├── pyproject.toml              # project configuration (setuptools)
├── pytest.ini                  # pytest configuration
├── .pre-commit-config.yaml     # pre-commit hooks
└── quickstart.sh               # quick install script

Environment setup

# Clone the repository and enter the project
git clone https://github.com/intellistream/sagellm-core.git
cd sagellm-core

# Install development dependencies
pip install -e ".[dev]"

# Install git hooks (checks run automatically before each commit)
pre-commit install

# Verify the installation
python -m pytest tests/test_ci_smoke.py -v

Running tests

# Run all tests
pytest tests/ -v

# Run a specific test module
pytest tests/unit/test_config.py -v

# Run with a coverage report
pytest tests/ --cov=sagellm_core --cov-report=html

# Run tests marked slow (including LLM tests)
pytest tests/ -v -m slow

# Run a single test case
pytest tests/test_llm_engine.py::test_engine_generate -v

Code quality checks

# Ruff linting and formatting
ruff check . --fix       # auto-fix fixable issues
ruff format .            # format the code

# Mypy static type checking
mypy src/

# Run all pre-commit hooks manually
pre-commit run --all-files

# Run a specific hook
pre-commit run ruff --all-files
pre-commit run mypy --all-files

Git commit workflow

Create a feature branch

git checkout -b feature/your-feature-name

Commit your code (hooks run the checks automatically)
```
git add .
git commit -m "feat: add your feature description"
```
- If a hook fails, fix the problem and commit again

Push and open a PR

git push origin feature/your-feature-name

Common development tasks

Adding a new decoding strategy:

Create a new file under src/sagellm_core/decoding/
Subclass BaseDecodingStrategy
Implement the __call__() method
Export it from __init__.py
Add unit tests
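
Put together, a new strategy might look like the sketch below. This README does not show the real signature of `BaseDecodingStrategy`, so the base class here is a local stand-in and the `__call__(logits) -> int` contract is an assumption; check the actual base class in src/sagellm_core/decoding/ before copying the pattern. `MinLengthGreedy` and its `eos_token_id` and `min_length` parameters are likewise invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseDecodingStrategy(ABC):
    """Local stand-in; the real base class may expose a different interface."""

    @abstractmethod
    def __call__(self, logits: List[float]) -> int:
        """Return the id of the next token given one row of logits."""


class MinLengthGreedy(BaseDecodingStrategy):
    """Greedy decoding that refuses to emit EOS before `min_length` tokens."""

    def __init__(self, eos_token_id: int, min_length: int) -> None:
        self.eos_token_id = eos_token_id
        self.min_length = min_length
        self.generated = 0

    def __call__(self, logits: List[float]) -> int:
        candidates = list(range(len(logits)))
        if self.generated < self.min_length:
            # Mask EOS until enough tokens have been produced.
            candidates = [i for i in candidates if i != self.eos_token_id]
        token = max(candidates, key=lambda i: logits[i])
        self.generated += 1
        return token


strategy = MinLengthGreedy(eos_token_id=0, min_length=1)
first = strategy([5.0, 1.0, 2.0])   # EOS has the top logit but is masked
second = strategy([5.0, 1.0, 2.0])  # minimum reached, EOS is allowed
print(first, second)
```

A unit test for such a strategy can assert exactly this pair of calls: the first returns token 2, the second returns the EOS id 0.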

Adding support for a new backend:

Implement a BackendProvider in sagellm-backend
Let Core discover it automatically through get_provider()
Add integration tests
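
The discovery step can be pictured as a small registry. This is only a sketch: the README names `BackendProvider` and `get_provider()` but not their signatures, so `register_provider`, `select_provider`, the availability probes, and the cuda > ascend > cpu preference order below are all assumptions rather than the real sagellm-backend API.

```python
from typing import Callable, Dict, List

# Hypothetical registry: backend name -> availability probe.
_PROVIDERS: Dict[str, Callable[[], bool]] = {}
_PRIORITY: List[str] = ["cuda", "ascend", "cpu"]  # assumed preference order


def register_provider(name: str, is_available: Callable[[], bool]) -> None:
    _PROVIDERS[name] = is_available


def select_provider(kind: str = "auto") -> str:
    """Return the backend name to use; "auto" picks the first available one."""
    if kind != "auto":
        if kind not in _PROVIDERS:
            raise ValueError(f"unknown backend: {kind}")
        if not _PROVIDERS[kind]():
            raise RuntimeError(f"backend not available: {kind}")
        return kind
    for name in _PRIORITY:
        probe = _PROVIDERS.get(name)
        if probe is not None and probe():
            return name
    raise RuntimeError("no backend available")


# Simulate a machine with no accelerator: only the CPU probe succeeds.
register_provider("cuda", lambda: False)
register_provider("ascend", lambda: False)
register_provider("cpu", lambda: True)

auto_choice = select_provider()
explicit_choice = select_provider("cpu")
print(auto_choice, explicit_choice)
```

An integration test for a new backend would then register its probe and check both the explicit and the "auto" path.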

Adding a configuration option:

Modify the Pydantic model in src/sagellm_core/config.py
Update the example configuration files
Update the documentation and tests
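
The effect of adding a validated option can be sketched without the project code. The real models in config.py use Pydantic v2; to stay dependency-free, this sketch swaps in a standard-library dataclass with manual checks, so it shows the idea rather than the actual mechanism. The `prefill_chunk_size` option and the `ToyEngineConfig` and `load_toy_config` names are invented purely as examples.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ToyEngineConfig:
    """Dataclass stand-in for a Pydantic config model (illustrative only)."""
    model_path: str
    max_batch_size: int = 32
    prefill_chunk_size: int = 512   # hypothetical new option

    def __post_init__(self) -> None:
        # Manual validation in place of Pydantic field constraints.
        if not self.model_path:
            raise ValueError("model_path must not be empty")
        for name in ("max_batch_size", "prefill_chunk_size"):
            value = getattr(self, name)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer")


def load_toy_config(text: str) -> ToyEngineConfig:
    """Parse a JSON document and reject keys the model does not declare."""
    data = json.loads(text)
    known = {f.name for f in fields(ToyEngineConfig)}
    unknown = set(data) - known
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return ToyEngineConfig(**data)


config = load_toy_config(
    '{"model_path": "sshleifer/tiny-gpt2", "prefill_chunk_size": 256}'
)
print(config)
```

Giving the new option a default keeps existing configuration files valid, which is why the example files only need updating, not migrating.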

📖 Examples

Complete demo applications

# Run the full decoding strategies demo (covers 6 scenarios)
python examples/decoding_strategies_demo.py

# Run the PD separation demo
python examples/pd_separation_demo.py

CPU-First testing

All tests run on CPU by default (no GPU required):

# Test LLMEngine
pytest tests/test_engine.py -v

# Test the configuration system
pytest tests/test_config.py -v

# Test the decoding strategies
pytest tests/test_decoding_strategies.py -v

# Test the E2E flow
pytest tests/test_llm_engine_contract.py -v

Model download

Use the provided helper script to download the test model:

# Download tiny-gpt2 (used for testing)
python examples/model_download_helper.py

# Or download manually
python -c "from transformers import AutoModel; AutoModel.from_pretrained('sshleifer/tiny-gpt2')"

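🔄 Continuous Integration

This project uses GitHub Actions for CI/CD:

Unit tests: pytest tests/unit/ runs on every push
Integration tests: pytest tests/integration/ runs on every push
Lint checks: Ruff, Mypy, YAML validation
Coverage: code coverage is kept above 80%

See the CI configuration: .github/workflows/ci.yml

📋 Version and Changes

Current version: 0.4.0.17 (Alpha)

Supported Python versions: 3.10, 3.11, 3.12

Full changelog: see CHANGELOG.md

Recent updates (v0.4.0.17):

✅ Standardized sampling parameters (issue #22), with a parameter priority system
✅ Strengthened decoding strategy tests
✅ Completed integration tests between LLMEngine and the decoding strategies
✅ Decoding strategy usage demo and documentation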

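🤝 Contributing

Community contributions are welcome! Please follow these steps:

Fork the repository
Create a feature branch (git checkout -b feature/your-feature)
Commit your changes (git commit -m "feat: description")
Push the branch (git push origin feature/your-feature)
Open a Pull Request

Commit conventions

Use Conventional Commits:

feat: new feature
fix: bug fix
docs: documentation update
test: test-related changes
refactor: code refactoring
perf: performance optimization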
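
A commit-msg hook could enforce these prefixes. The regular expression below is a simplified sketch covering only the six types listed above plus an optional scope and breaking-change marker; it is not the project's actual hook configuration, and `is_conventional` is a made-up helper name.

```python
import re

# type, optional "(scope)", optional "!" for breaking changes, then ": subject"
COMMIT_PATTERN = re.compile(
    r"^(feat|fix|docs|test|refactor|perf)(\([\w\-]+\))?!?: \S.*$"
)


def is_conventional(message: str) -> bool:
    """Check only the first line of the commit message."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None


print(is_conventional("feat: add your feature description"))
print(is_conventional("update stuff"))
```

The first call prints `True` and the second `False`, since the second message has no recognised type prefix.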

📄 License

Proprietary - IntelliStream

📞 Feedback and Support

📍 GitHub Issues: submit an issue
💬 Discussions: start a discussion
📧 Email: team@intellistream.ai

Dependencies

pydantic>=2.0.0: configuration validation
pyyaml>=6.0.0: YAML configuration support
isagellm-protocol>=0.4.0.0,<0.5.0: protocol definitions
isagellm-backend>=0.4.0.0,<0.5.0: backend abstraction
isagellm-comm>=0.4.0.0,<0.5.0: communication backend
isagellm-kv-cache>=0.4.0.0,<0.5.0: KV cache

Related Packages

isagellm-protocol - Protocol definitions (L0)
isagellm-backend - Backend abstraction layer (L1)
isagellm-comm - Communication abstraction (L1)
isagellm-kv-cache - KV Cache management (L1.5)
sagellm-control-plane - Cross-engine orchestration (L3)
sagellm-gateway - OpenAI-compatible API (L4)

For the complete ecosystem, see sageLLM organization

Last Updated: 2026-02-02 | Status: Alpha (v0.4.0.17) | Protocol: v0.1
