anyServe - Capability-Oriented Serving Runtime for LLM inference

anyserve

A serving runtime for large-scale LLM inference.

Project status

POC stage: the core skeleton is implemented, and MVP features are under development.

Core features

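Capability-driven: requests are routed on arbitrary key-value pairs rather than a fixed model name
Dynamic Worker start/stop: Workers are started and stopped according to load, so resources can be reused flexibly
Control flow / data flow separation: control flow uses the KServe protocol, data flow goes through the Object System
C++ Dispatcher + Python Worker: a high-performance control plane plus a flexible execution plane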
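
To make the capability-driven idea concrete, here is a minimal sketch of key-value matching. It is not anyserve's implementation: the `matches` function and the example capability sets are hypothetical, and it assumes a Worker qualifies when it offers every key-value pair the request asks for.

```python
from typing import Mapping


def matches(required: Mapping[str, str], offered: Mapping[str, str]) -> bool:
    """Return True when every required key-value pair is present in `offered`."""
    return all(offered.get(key) == value for key, value in required.items())


# Hypothetical capability sets advertised by two Workers.
echo_worker = {"type": "echo"}
chat_worker = {"type": "chat", "model": "demo-7b", "precision": "fp16"}

print(matches({"type": "echo"}, echo_worker))                      # True
print(matches({"type": "chat", "model": "demo-7b"}, chat_worker))  # True
print(matches({"type": "chat", "model": "other"}, chat_worker))    # False
```

Because the match is on arbitrary keys, a request can be as loose (`type` only) or as specific (`type`, `model`, `precision`) as the caller needs.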

Architecture overview

┌──────────────────────────────────────────┐
│      API Server (separate project)       │
│      Routes requests by Capability       │
└──────────────────────┬───────────────────┘
                       │
          ┌────────────┼────────────┐
          ↓            ↓            ↓
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│  Replica A   │ │  Replica B   │ │  Replica C   │
│  (anyserve)  │ │  (anyserve)  │ │  (anyserve)  │
│              │ │              │ │              │
│ Dispatcher   │ │ Dispatcher   │ │ Dispatcher   │
│     ↓        │ │     ↓        │ │     ↓        │
│  Workers     │ │  Workers     │ │  Workers     │
└──────────────┘ └──────────────┘ └──────────────┘
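
The fan-out in the diagram can be sketched as a toy routing table. Everything here is an assumption made for illustration: `ToyRouter`, the replica names, and the round-robin choice are hypothetical, and the real API Server is a separate project whose selection policy is not described here.

```python
from itertools import count


class ToyRouter:
    """Hypothetical stand-in for the API Server's routing table."""

    def __init__(self) -> None:
        self._replicas: dict[str, dict[str, str]] = {}
        self._counter = count()

    def register(self, name: str, capability: dict[str, str]) -> None:
        """Record the capability a replica advertises."""
        self._replicas[name] = dict(capability)

    def route(self, required: dict[str, str]) -> str:
        """Pick one replica that offers every required key-value pair."""
        candidates = [
            name
            for name, offered in self._replicas.items()
            if all(offered.get(k) == v for k, v in required.items())
        ]
        if not candidates:
            raise LookupError(f"no replica offers {required!r}")
        # Crude round-robin driven by a single shared counter.
        return candidates[next(self._counter) % len(candidates)]


router = ToyRouter()
router.register("replica-a", {"type": "echo"})
router.register("replica-b", {"type": "echo"})
router.register("replica-c", {"type": "tokenize"})

print([router.route({"type": "echo"}) for _ in range(3)])
# ['replica-a', 'replica-b', 'replica-a']
print(router.route({"type": "tokenize"}))  # replica-c
```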

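For the detailed design, see:

Architecture design - concepts, principles, layering
Runtime implementation - code structure, protocols, flows
MVP plan - development goals and task list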

Quick start

Requirements

Python 3.11+
A C++ compiler with C++17 support
CMake 3.20+
Conan 2.0+
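
Before building, a quick check like the following can show which prerequisites are visible on the current machine. This is a sketch, not part of anyserve: `check_environment` is a hypothetical helper, it only tests that the `cmake`, `conan`, and `just` executables are on `PATH`, and it does not verify their versions or the compiler's C++17 support.

```python
import shutil
import sys


def check_environment() -> dict[str, bool]:
    """Report whether each prerequisite appears to be available."""
    return {
        "python>=3.11": sys.version_info >= (3, 11),
        "cmake": shutil.which("cmake") is not None,
        "conan": shutil.which("conan") is not None,
        "just": shutil.which("just") is not None,
    }


for name, ok in check_environment().items():
    print(f"{name:14} {'ok' if ok else 'missing'}")
```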

Installation

# Install dependencies and build
just setup
just build

# Install the Python package
pip install -e python/

Run the example

# Start the server
anyserve examples.basic.app:app --port 8000 --workers 1

# Test
python examples/basic/run_example.py

Define a Capability Handler

from anyserve import AnyServe, ModelInferRequest, ModelInferResponse

app = AnyServe()

@app.capability(type="echo")
def echo_handler(request: ModelInferRequest) -> ModelInferResponse:
    response = ModelInferResponse(
        model_name=request.model_name,
        id=request.id
    )
    for inp in request.inputs:
        out = response.add_output(
            name=f"output_{inp.name}",
            datatype=inp.datatype,
            shape=inp.shape
        )
        out.contents = inp.contents
    return response
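
For readers curious how a decorator such as `@app.capability(type="echo")` can bind a function to a capability, here is a minimal sketch. `MiniApp` and its `dispatch` method are hypothetical stand-ins, not the anyserve API, and the sketch assumes an exact match on the declared key-value pairs.

```python
from typing import Callable


class MiniApp:
    """Hypothetical stand-in showing one way a capability decorator could work."""

    def __init__(self) -> None:
        self._handlers: dict[frozenset, Callable] = {}

    def capability(self, **capability: str) -> Callable[[Callable], Callable]:
        """Return a decorator that registers a handler under the given key-value pairs."""
        key = frozenset(capability.items())

        def register(func: Callable) -> Callable:
            self._handlers[key] = func
            return func  # the function itself is left unchanged

        return register

    def dispatch(self, capability: dict[str, str], request):
        """Look up the handler registered for exactly this capability and call it."""
        handler = self._handlers.get(frozenset(capability.items()))
        if handler is None:
            raise LookupError(f"no handler for {capability!r}")
        return handler(request)


mini = MiniApp()


@mini.capability(type="echo")
def echo(request: dict) -> dict:
    return {"outputs": request["inputs"]}


print(mini.dispatch({"type": "echo"}, {"inputs": ["hello"]}))
# {'outputs': ['hello']}
```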

Client connection modes

The Client supports two connection modes:

from anyserve.worker.client import Client

# Direct mode - connect directly to a specific Worker
client = Client(endpoint="localhost:50051")

# Discovery mode - discover Workers automatically through the API Server
client = Client(
    api_server="http://localhost:8080",
    capability={"type": "echo"}
)

result = client.infer("echo", {"text": ["hello"]})
client.close()

See the examples/multi_server/ example for details.

Calling between Workers (context.call)

A Worker can call other services through context.call() to build a processing pipeline:

@app.capability(type="tokenize")
def handler(request: ModelInferRequest, context: Context) -> ModelInferResponse:
    # Process the input (tokenize() is a helper not defined in this snippet)
    text = request.get_input("text").bytes_contents[0].decode()
    tokens = tokenize(text)

    # Call another service
    result = context.call(
        model_name="analyze",
        capability={"type": "analyze"},  # routed through the API Server
        inputs={"tokens": [",".join(tokens)]}
    )

    # build_response() is likewise a helper not defined in this snippet
    return build_response(result)

See the examples/pipeline/ example for details.
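
The shape of such a pipeline can be tried out in-process without a running server. The sketch below is an assumption-heavy illustration: `FakeContext`, `analyze`, and `tokenize_handler` are hypothetical, plain dictionaries stand in for the KServe request and response types, and `call` is a dictionary lookup rather than a routed network call.

```python
from typing import Callable


class FakeContext:
    """In-process stand-in for the context object; not the anyserve API."""

    def __init__(self) -> None:
        self._services: dict[str, Callable[[dict], dict]] = {}

    def register(self, type_name: str, func: Callable[[dict], dict]) -> None:
        self._services[type_name] = func

    def call(self, model_name: str, capability: dict, inputs: dict) -> dict:
        # A real runtime would route this call; here it is a dictionary lookup.
        return self._services[capability["type"]](inputs)


def analyze(inputs: dict) -> dict:
    """Second stage: count the comma-separated tokens it receives."""
    tokens = inputs["tokens"][0].split(",")
    return {"count": [len(tokens)]}


def tokenize_handler(text: str, context: FakeContext) -> dict:
    """First stage: split on whitespace, then hand the tokens to the next stage."""
    tokens = text.split()
    return context.call(
        model_name="analyze",
        capability={"type": "analyze"},
        inputs={"tokens": [",".join(tokens)]},
    )


ctx = FakeContext()
ctx.register("analyze", analyze)
print(tokenize_handler("hello capability routing", ctx))  # {'count': [3]}
```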

Project structure

anyserve/
├── cpp/                    # C++ Dispatcher
│   └── server/             # Core components
├── python/anyserve/        # Python Worker
│   ├── cli.py              # CLI entry point
│   ├── kserve.py           # KServe protocol
│   └── worker/             # Worker implementation
├── proto/                  # Protocol definitions
├── examples/               # Examples
└── docs/                   # Documentation
    ├── architecture.md     # Architecture design
    ├── runtime.md          # Runtime implementation
    └── mvp.md              # MVP plan

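Development

just setup    # Install dependencies
just build    # Build
just clean    # Clean

Documentation

Document	Contents
architecture.md	Architecture design, core concepts, design principles
runtime.md	Implementation details, code structure, protocols
mvp.md	MVP goals, current status, development plan
agents.md	Collaboration guide for AI assistants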

License

[TBD]

Release history

0.1.2 - Jan 17, 2026
0.1.1 - Jan 17, 2026 (this version)
0.1.0 - Jan 10, 2026
