sageLLM: Modular LLM inference engine with prefill/decode (PD) separation for domestic computing power

sageLLM

Protocol Compliance (Mandatory)

MUST follow Protocol v0.1: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.

🚀 Modular LLM Inference Engine for Domestic Computing Power

Ollama-like experience across domestic Chinese hardware (Huawei Ascend) and NVIDIA GPUs

✨ Features

🎯 One-Click Install - pip install isagellm gets you started immediately
🧠 CPU-First - Default CPU engine, no GPU required
🇨🇳 Domestic Hardware - First-class support for Huawei Ascend NPU
📊 Observable - Built-in metrics (TTFT, TBT, throughput, KV usage)
🧩 Plugin System - Extend with custom backends and engines
🔄 Mixed Inference - Unified LLM + Embedding client (MixedInferenceClient)
🦙 Ollama Backend - Use a local Ollama server as an inference backend
📈 Performance Profiling - Load profiling data and interpolate TTFT/throughput

📦 Quick Install

# Install sageLLM (recommended, includes gateway/control-plane/kv/comm/compression)
pip install isagellm

# Optional extra: embedding toolkit
pip install 'isagellm[embedding]'

# Reproduce exactly-tested sub-package versions (recommended for production)
pip install isagellm -c https://raw.githubusercontent.com/intellistream/sagellm/main-dev/constraints.txt

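🚀 Faster PyTorch Installation in Mainland China (Recommended)

The CUDA build of PyTorch is large (~800 MB) and downloads slowly from the official index, so we provide pre-downloaded wheels on GitHub Releases: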

# Option 1: use the sagellm CLI (recommended, simplest)
pip install isagellm
sage-llm install cuda --github     # download from GitHub (faster)
sage-llm install cuda              # download from the official index (default)

# Option 2: use pip --find-links directly
pip install torch==2.5.1+cu121 torchvision torchaudio \
  --find-links https://github.com/intellistream/sagellm-pytorch-wheels/releases/download/v2.5.1-cu121/ \
  --trusted-host github.com

Other supported backends:

sage-llm install ascend - Huawei Ascend NPU
sage-llm install kunlun - Baidu Kunlun XPU
sage-llm install haiguang - Hygon DCU
sage-llm install cpu - CPU-only (smallest download)

💡 Why use the GitHub mirror?

✅ Fast access from mainland China (GitHub CDN)
✅ No mirror configuration required
✅ Official wheels, 100% trustworthy

📦 Wheels repository: https://github.com/intellistream/sagellm-pytorch-wheels

🚀 Quick Start

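Unified CLI Command

Primary command: sagellm
Compatibility alias: sage-llm (kept for backward compatibility; migrating to sagellm is recommended)

CLI (as simple as vLLM/Ollama)

# One-command start (full stack: Gateway + Engine)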
pip install isagellm
sage-llm serve --model Qwen2-7B

# ✅ The OpenAI-compatible API is available automatically
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen2-7B",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Show system information
sage-llm info

# Single inference run (without starting a server)
sage-llm run -p "What is LLM inference?"

# Recommended production start (via Gateway + Control Plane)
sage-llm serve --backend cpu --model sshleifer/tiny-gpt2 --port 8888

# One-command start: Gateway + LLM + Embedding
sage-llm serve \
  --backend cpu \
  --model sshleifer/tiny-gpt2 \
  --port 8888 \
  --with-embedding \
  --embedding-model sentence-transformers/all-MiniLM-L6-v2

# Recommended for production: model health guards before go-live (preflight check + startup check + periodic checks + fallback model)
export SAGELLM_PREFLIGHT_CANARY=1
export SAGELLM_STARTUP_CANARY=1
export SAGELLM_PERIODIC_CANARY=1
export SAGELLM_FALLBACK_MODEL="Qwen/Qwen2.5-0.5B-Instruct"
export SAGELLM_CANARY_INTERVAL_SEC=300
export SAGELLM_CANARY_FAIL_THRESHOLD=3

sage-llm serve \
  --backend pytorch-cuda \
  --model Qwen/Qwen2.5-1.5B-Instruct \
  --port 8000
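
The curl request shown earlier can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the gateway listens on `http://localhost:8000` as in the curl example and returns the usual OpenAI chat-completions response shape. `build_chat_request` is a hypothetical helper, not part of sageLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-style /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("http://localhost:8000", "Qwen2-7B", "Hello!")
print(req.full_url)

# Sending the request needs a running server, so it is left commented out.
# The response layout below is the OpenAI convention and is an assumption here.
# with urllib.request.urlopen(req, timeout=30) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```

Keeping request construction separate from sending makes the payload easy to inspect before a server is available.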

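🛡️ Production Health Guards (Canary + Fallback)

sage-llm serve supports go-live safety guards that prevent a corrupted model from continuing to serve garbage output:

Preflight Canary (enabled by default): before the service starts, the primary model is loaded and tested locally; on failure, the fallback models are tried automatically.
Startup Canary (enabled by default): once the engine is healthy, a fixed test request is sent immediately; on failure, the process exits (fail-fast).
Periodic Canary (enabled by default): output quality is checked periodically in the background; once consecutive failures reach the threshold, the circuit breaker trips and the process exits (leaving the restart to the supervisor).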
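
The consecutive-failure rule of the periodic canary can be illustrated with a small counter. This is only a sketch of the general circuit-breaker idea, not sageLLM's implementation; `CanaryBreaker` and its fields are hypothetical names, and the threshold of 3 mirrors the documented default.

```python
from dataclasses import dataclass


@dataclass
class CanaryBreaker:
    """Trips after `fail_threshold` consecutive canary failures (hypothetical helper)."""

    fail_threshold: int = 3
    consecutive_failures: int = 0

    def record(self, passed: bool) -> bool:
        """Record one canary result and return True once the breaker has tripped."""
        if passed:
            # A success resets the streak, so only consecutive failures count.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.fail_threshold


breaker = CanaryBreaker(fail_threshold=3)
results = [False, False, True, False, False, False]  # synthetic probe outcomes
tripped_at = next((i for i, ok in enumerate(results) if breaker.record(ok)), None)
print(tripped_at)  # index of the probe that trips the breaker, or None
```

In this run the success in the middle resets the streak, so the breaker trips only on the last of three failures in a row.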

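Available environment variables:

SAGELLM_PREFLIGHT_CANARY: enable the pre-start preflight check (default 1)
SAGELLM_STARTUP_CANARY: enable the one-off post-start check (default 1)
SAGELLM_PERIODIC_CANARY: enable periodic checks (default 1)
SAGELLM_FALLBACK_MODEL: fallback model list (comma-separated, tried in order)
SAGELLM_CANARY_INTERVAL_SEC: interval between periodic checks, in seconds (default 300)
SAGELLM_CANARY_FAIL_THRESHOLD: number of consecutive failures that trips the circuit breaker (default 3)

Example (switch automatically to the 0.5B model when the primary model fails):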

export SAGELLM_FALLBACK_MODEL="Qwen/Qwen2.5-0.5B-Instruct"
sage-llm serve --model Qwen/Qwen2.5-1.5B-Instruct
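
One way to picture the ordered fallback is as a candidate list built from the primary model plus the comma-separated `SAGELLM_FALLBACK_MODEL` value, probed in order. The sketch below assumes exactly that; `candidate_models`, `first_healthy`, and the `probe` callback are hypothetical and stand in for whatever check sageLLM actually performs.

```python
import os
from typing import Callable, Mapping, Optional


def candidate_models(primary: str, env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the primary model followed by the fallbacks, in the order given."""
    env = os.environ if env is None else env
    raw = env.get("SAGELLM_FALLBACK_MODEL", "")
    fallbacks = [name.strip() for name in raw.split(",") if name.strip()]
    return [primary, *fallbacks]


def first_healthy(models: list[str], probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first model that passes the probe, or None if all fail."""
    for name in models:
        if probe(name):
            return name
    return None


env = {"SAGELLM_FALLBACK_MODEL": "Qwen/Qwen2.5-0.5B-Instruct"}
models = candidate_models("Qwen/Qwen2.5-1.5B-Instruct", env)
# Pretend the primary model is broken and only the 0.5B model passes the check.
chosen = first_healthy(models, probe=lambda name: "0.5B" in name)
print(chosen)
```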

Python API (Control Plane - Recommended)

import asyncio

from sagellm import ControlPlaneManager, BackendConfig, EngineConfig

# Install with: pip install isagellm
async def main() -> None:
    manager = ControlPlaneManager(
        backend_config=BackendConfig(kind="cpu", device="cpu"),
        engine_configs=[
            EngineConfig(
                kind="cpu",
                model="sshleifer/tiny-gpt2",
                model_path="sshleifer/tiny-gpt2"
            )
        ]
    )

    await manager.start()
    try:
        # Requests are automatically routed to available engines
        response = await manager.execute_request(
            prompt="Hello, world!",
            max_tokens=128
        )
        print(response.output_text)
        print(f"TTFT: {response.metrics.ttft_ms:.2f} ms")
        print(f"Throughput: {response.metrics.throughput_tps:.2f} tokens/s")
    finally:
        await manager.stop()


asyncio.run(main())

⚠️ Important: Direct engine creation (create_engine()) is not exported from the umbrella package. All production code must use ControlPlaneManager for proper request routing, scheduling, and lifecycle management.

Mixed Inference (LLM + Embedding)

from sagellm import MixedInferenceClient, MixedRequest, RequestKind

# Unified client for both LLM and embedding
client = MixedInferenceClient(
    llm_url="http://localhost:8000",
    embedding_url="http://localhost:8001",
)

# LLM completion
resp = client.complete("What is 2+2?")
print(resp["text"])

# Embedding
vecs = client.embed("Hello world")

# Mixed batch dispatch
results = client.dispatch([
    MixedRequest(kind=RequestKind.LLM, content="Tell me a joke"),
    MixedRequest(kind=RequestKind.EMBEDDING, content="The quick brown fox"),
])

Ollama Backend

# Use a local Ollama server as inference backend
sage-llm ollama status                       # check health
sage-llm ollama list                         # list models
sage-llm ollama run -m llama3 -p "Hello!"    # single completion
sage-llm ollama chat -p "Explain Python"     # chat

from sagellm import OllamaClient

client = OllamaClient(model="llama3")
resp = client.complete("What is 2+2?")
print(resp["text"])
models = client.list_models()

Performance Profiling & Interpolation

from sagellm.profiling import PerformanceInterpolator

# Load CSV: columns isl, ttft, itl, throughput
interp = PerformanceInterpolator.from_csv("profiles/qwen2_7b_a100.csv")

# Predict metrics for a given input sequence length
ttft = interp.predict_ttft(512)          # → seconds
itl  = interp.predict_itl(512)           # → seconds/token
tput = interp.predict_throughput(512)    # → tokens/second

# Reverse: find max ISL that satisfies a TTFT budget
max_isl = interp.reverse_ttft(target_ttft=0.3)
print(f"Max ISL for 300ms TTFT: {max_isl} tokens")
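
To make the idea of interpolation and reverse lookup concrete, here is a self-contained sketch using piecewise-linear interpolation. It is an assumption that a linear scheme is used; the internals of `PerformanceInterpolator` are not shown in this README. The `interpolate` helper and the profile points are made up for illustration.

```python
from bisect import bisect_left


def interpolate(xs: list[float], ys: list[float], x: float) -> float:
    """Piecewise-linear interpolation over ascending xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)  # xs[i - 1] < x <= xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Made-up profile: input sequence length (tokens) against TTFT (seconds).
isl = [128.0, 256.0, 512.0, 1024.0]
ttft = [0.05, 0.10, 0.20, 0.40]

ttft_384 = interpolate(isl, ttft, 384)   # forward lookup, about 0.15 s
max_isl = interpolate(ttft, isl, 0.3)    # reverse lookup, about 768 tokens
print(ttft_384, max_isl)
```

The reverse lookup simply swaps the axes, which is valid here only because TTFT increases monotonically with input length in the sample data.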

Configuration

# ~/.sagellm/config.yaml
backend:
  kind: cpu  # Options: cpu, pytorch-cuda, pytorch-ascend
  device: cpu

engine:
  kind: cpu
  model: sshleifer/tiny-gpt2

control_plane:
  endpoint: "localhost:8080"
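
A configuration of this shape can be sanity-checked before use. The sketch below works on a plain dict with the same keys as the YAML above; the allowed backend kinds come from the comment in that file, while the rules that `engine.model` is required and that the endpoint has the form host:port are assumptions. `validate_config` is a hypothetical helper.

```python
ALLOWED_BACKENDS = {"cpu", "pytorch-cuda", "pytorch-ascend"}


def validate_config(cfg: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config looks sane."""
    errors = []
    kind = cfg.get("backend", {}).get("kind")
    if kind not in ALLOWED_BACKENDS:
        errors.append(f"backend.kind must be one of {sorted(ALLOWED_BACKENDS)}, got {kind!r}")
    if not cfg.get("engine", {}).get("model"):
        errors.append("engine.model is required")  # assumed rule
    endpoint = cfg.get("control_plane", {}).get("endpoint", "")
    host, sep, port = endpoint.rpartition(":")
    if not (sep and host and port.isdigit()):
        errors.append(f"control_plane.endpoint must look like host:port, got {endpoint!r}")
    return errors


good = {
    "backend": {"kind": "cpu", "device": "cpu"},
    "engine": {"kind": "cpu", "model": "sshleifer/tiny-gpt2"},
    "control_plane": {"endpoint": "localhost:8080"},
}
bad = {"backend": {"kind": "tpu"}, "engine": {}, "control_plane": {"endpoint": "localhost"}}
print(validate_config(good))
print(len(validate_config(bad)))
```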

📊 Metrics & Validation

sageLLM reports performance metrics in the following form (sample values):

{
  "ttft_ms": 45.2,
  "tbt_ms": 12.5,
  "throughput_tps": 80.0,
  "peak_mem_mb": 24576,
  "kv_used_tokens": 4096,
  "prefix_hit_rate": 0.85
}

Run benchmarks:

sage-llm demo --workload year1 --output metrics.json
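
As a rough illustration of how the latency fields relate, the sketch below derives TTFT, TBT, and throughput from token arrival timestamps. The definitions are assumptions: TTFT is the delay to the first token, TBT is the mean gap between consecutive tokens, and throughput is taken as decode throughput (1 / TBT). sageLLM's exact definitions are specified in Protocol v0.1, not here. `summarize` is a hypothetical helper.

```python
from statistics import mean


def summarize(request_start: float, token_times: list[float]) -> dict[str, float]:
    """Derive latency metrics from a request start time and token arrival times (seconds)."""
    if len(token_times) < 2:
        raise ValueError("need at least two tokens to measure TBT")
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    tbt_s = mean(gaps)
    return {
        "ttft_ms": (token_times[0] - request_start) * 1000.0,
        "tbt_ms": tbt_s * 1000.0,
        "throughput_tps": 1.0 / tbt_s,  # assumed: decode throughput only
    }


# Synthetic timestamps: first token after 45.2 ms, then one token every 12.5 ms.
start = 0.0
times = [0.0452 + 0.0125 * i for i in range(5)]
metrics = summarize(start, times)
print({name: round(value, 2) for name, value in metrics.items()})
```

With these synthetic timestamps the result lines up with the sample values above: about 45.2 ms TTFT, 12.5 ms TBT, and 80 tokens/s.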

🏗️ Architecture

isagellm (umbrella package)
├── isagellm-protocol       # Protocol v0.1 types
│   └── Request, Response, Metrics, Error, StreamEvent
├── isagellm-backend        # Hardware abstraction (L1 - Foundation)
│   └── BackendProvider, CPUBackend, (CUDABackend, AscendBackend)
├── isagellm-comm           # Communication primitives (L2 - Infrastructure)
│   └── Topology, CollectiveOps (all_reduce/gather), P2P (send/recv), Overlap
├── isagellm-kv-cache       # KV cache management (L2 - Optional)
│   └── PrefixCache, MemoryPool, EvictionPolicies, Predictor, KV Transfer
├── isagellm-compression    # Inference acceleration (quantization, sparsity, etc.) (L2 - Optional)
│   └── Quantization, Sparsity, SpeculativeDecoding, Fusion
├── isagellm-core           # Engine core & runtime (L3)
│   └── Config, Engine, Factory, DemoRunner, Adapters (vLLM/LMDeploy)
├── isagellm-control-plane  # Request routing & scheduling (L4 - Optional)
│   └── ControlPlaneManager, Router, Policies, Lifecycle
└── isagellm-gateway        # OpenAI-compatible REST API (L5 - Optional)
    └── FastAPI server, /v1/chat/completions, Session management
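
The layer labels in the tree suggest a rule that a package should not depend on a package in a higher layer. The sketch below checks that rule on a hand-written dependency map. It is not the project's `scripts/verify_dependencies.py`; the rule itself, the layer 0 assigned to `isagellm-protocol`, and the example dependencies are all assumptions for illustration.

```python
LAYER = {
    "isagellm-protocol": 0,  # assumption: the tree gives no layer for the protocol package
    "isagellm-backend": 1,
    "isagellm-comm": 2,
    "isagellm-kv-cache": 2,
    "isagellm-compression": 2,
    "isagellm-core": 3,
    "isagellm-control-plane": 4,
    "isagellm-gateway": 5,
}


def layering_violations(deps: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (package, dependency) pairs where the dependency sits in a higher layer."""
    violations = []
    for package, requires in deps.items():
        for dependency in requires:
            if LAYER[dependency] > LAYER[package]:
                violations.append((package, dependency))
    return violations


# Hypothetical dependency map; the second entry is deliberately wrong.
deps = {
    "isagellm-core": ["isagellm-protocol", "isagellm-backend"],
    "isagellm-backend": ["isagellm-core"],
}
print(layering_violations(deps))
```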

🔧 Development

Quick Setup (Development Mode)

# Clone all repositories
./scripts/clone-all-repos.sh

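# Default dev mode: install sagellm plus the sub-repositories (editable)
./quickstart.sh

# standard mode: sub-repositories come from PyPI; sagellm itself stays a local editable install
./quickstart.sh --standard

# Non-interactive mode (CI/scripts)
./quickstart.sh --yes

# Optional: skip the cleanup of old isagellm* packages
./quickstart.sh --skip-cleanup

# Optional: mirror control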
./quickstart.sh --use-mirror auto
./quickstart.sh --no-mirror

# Open all repos in VS Code Multi-root Workspace
code sagellm.code-workspace

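Environment notes:

Running quickstart inside a venv/.venv is not allowed.
Using an existing Conda environment is recommended.
With the system Python, the script asks for confirmation (--yes confirms automatically).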

📖 See WORKSPACE_GUIDE.md for Multi-root Workspace usage.

Testing

# Clone and setup
git clone https://github.com/IntelliStream/sagellm.git
cd sagellm
pip install -e ".[dev]"

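### ⚠️ Local CI fallback when GitHub Actions is blocked by billing

If Actions cannot start because of billing or quota limits, run the following from the repository root:

```bash
bash scripts/local_ci_fallback.sh
```

The script follows the core order of ci.yml (pre-commit, version-check, CPU unit/integration tests, CLI smoke, build + twine check) and produces reproducible results that can be attached to an issue or PR.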

# Run tests
pytest -v

# Format & lint
ruff format .
ruff check . --fix

# Type check
mypy src/sagellm/

# Verify dependency hierarchy
python scripts/verify_dependencies.py


### 📖 Development Resources

- **[DEPLOYMENT_GUIDE.md](docs/DEPLOYMENT_GUIDE.md)** - Complete deployment and configuration guide
- **[TROUBLESHOOTING.md](docs/TROUBLESHOOTING.md)** - Troubleshooting quick reference
- **[ENVIRONMENT_VARIABLES.md](docs/ENVIRONMENT_VARIABLES.md)** - Complete environment variable reference
- **[DEVELOPER_GUIDE.md](docs/DEVELOPER_GUIDE.md)** - Developer guide
- **[WORKSPACE_GUIDE.md](docs/WORKSPACE_GUIDE.md)** - Multi-root Workspace usage
- **[INFERENCE_FLOW.md](docs/INFERENCE_FLOW.md)** - Inference flow in detail
- **[PR_CHECKLIST.md](docs/PR_CHECKLIST.md)** - Pull Request checklist

______________________________________________________________________

## 📚 Documentation Index

### User Documentation

- [Quick Start](README.md#-quick-start) - Get started in 5 minutes
- [Deployment Guide](docs/DEPLOYMENT_GUIDE.md) - Production deployment
- [Configuration Reference](docs/DEPLOYMENT_GUIDE.md#%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E8%AF%B4%E6%98%8E) - Full configuration options
- [Environment Variables](docs/ENVIRONMENT_VARIABLES.md) - Environment variable reference
- [Troubleshooting](docs/TROUBLESHOOTING.md) - Solutions to common problems

### Developer Documentation

- [Developer Guide](docs/DEVELOPER_GUIDE.md) - Contributing code
- [Architecture](README.md#-architecture) - System architecture
- [Workspace Usage](docs/WORKSPACE_GUIDE.md) - Multi-root workspace
- [PR Checklist](docs/PR_CHECKLIST.md) - Pre-submission checks

### API Documentation

- OpenAI-compatible API - see [sagellm-gateway](https://github.com/intellistream/sagellm-gateway)
- Python API - see [API_REFERENCE.md](docs/API_REFERENCE.md) (to be added)

### Sub-package Documentation

- [sagellm-protocol](https://github.com/intellistream/sagellm-protocol) - Protocol definitions
- [sagellm-backend](https://github.com/intellistream/sagellm-backend) - Backend abstraction
- [sagellm-core](https://github.com/intellistream/sagellm-core) - Engine core
- [sagellm-control-plane](https://github.com/intellistream/sagellm-control-plane) - Control plane
- [sagellm-gateway](https://github.com/intellistream/sagellm-gateway) - API gateway
- [sagellm-benchmark](https://github.com/intellistream/sagellm-benchmark) - Benchmarks
- [**DEVELOPER_GUIDE.md**](DEVELOPER_GUIDE.md) - Architecture rules and development guide
- [**PR_CHECKLIST.md**](PR_CHECKLIST.md) - Pull Request review checklist
- [**scripts/verify_dependencies.py**](scripts/verify_dependencies.py) - Dependency hierarchy verification

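## Contributing Guide

### Workflow (mandatory)

Before submitting code, you **must** follow these steps:

#### 1️⃣ Create an Issue

Describe the problem to solve, or the feature or improvement to implement:

```bash
gh issue create \
  --title "[Category] Short description" \
  --label "bug,sagellm-core" \
  --body "Detailed description..."
```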

Issue types:

[Bug] - Bug fix
[Feature] - New feature
[Performance] - Performance optimization
[Integration] - Integration with other modules
[Docs] - Documentation improvement

2️⃣ Develop on a Local Branch

Create a development branch and resolve the issue:

# Branch from main-dev (not main!)
git fetch origin main-dev
git checkout -b bugfix/#123-short-description origin/main-dev

# Do the work
# ...

# Make sure all checks pass
ruff format .
ruff check . --fix
pytest -v

Branch naming conventions:

Bug fix: bugfix/#123-xxx
New feature: feature/#456-xxx
Docs: docs/#789-xxx
Performance: perf/#101-xxx

3️⃣ Open a Pull Request

Submit the code for review:

git push origin bugfix/#123-short-description
gh pr create \
  --base main-dev \
  --head bugfix/#123-short-description \
  --title "Fix: [short description]" \
  --body "Closes #123

## Changes
- Change 1
- Change 2

## Testing
- Added unit tests
- All tests pass ✓"

A PR must include:

A clear title (Fix/Feature/Docs/Perf)
The linked issue number: Closes #123
A list of changes and a description of the testing
Passing results for all CI checks
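
The title and issue-link requirements can be checked mechanically before a PR is opened. The sketch below is one possible local check, not part of the project's tooling; it assumes the title format `Type: description` used in the example command, and `pr_problems` is a hypothetical helper.

```python
import re

# Assumed title format: one of the four documented types, a colon, then a description.
TITLE_RE = re.compile(r"^(Fix|Feature|Docs|Perf): \S.*")
CLOSES_RE = re.compile(r"\bCloses #\d+\b")


def pr_problems(title: str, body: str) -> list[str]:
    """Return a list of problems found in a PR title and body; empty means both look fine."""
    problems = []
    if not TITLE_RE.match(title):
        problems.append("title should start with Fix:, Feature:, Docs: or Perf:")
    if not CLOSES_RE.search(body):
        problems.append("body should link the issue, for example 'Closes #123'")
    return problems


ok = pr_problems("Fix: handle empty prompt", "Closes #123\n\n## Changes\n- Change 1")
not_ok = pr_problems("update stuff", "no issue mentioned")
print(ok, not_ok)
```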

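4️⃣ Code Review and Merge

After approval, merge into main-dev:

# Click the "Merge" button in the GitHub UI
# Merge into main-dev (not main!)

Conditions before merging:

✅ Approval from at least one maintainer
✅ All CI checks pass (pytest, ruff)
✅ The merge target is the main-dev branch

Quick Checklist

Before opening a PR, check that you have:

Created the development branch from main-dev
Updated CHANGELOG.md
Formatted the code with ruff format .
Passed lint with ruff check . --fix
Passed all tests with pytest -v
Linked the related issue: Closes #123

Anti-patterns ❌

❌ Committing directly to the main branch
❌ No linked issue in the PR
❌ Changing code without updating the CHANGELOG
❌ Code that fails the lint check
❌ Not running the tests before submitting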

📚 Package Details

| Package | PyPI Name | Import Name | Description |
|---|---|---|---|
| sagellm | `isagellm` | `sagellm` | Umbrella package (install this) |
| sagellm-protocol | `isagellm-protocol` | `sagellm_protocol` | Protocol v0.1 types |
| sagellm-core | `isagellm-core` | `sagellm_core` | Runtime & config |
| sagellm-backend | `isagellm-backend` | `sagellm_backend` | Hardware abstraction |

📄 License

Proprietary - IntelliStream. Internal use only.

Built with ❤️ by IntelliStream Team for domestic AI infrastructure
