A lightweight LLM inference framework
light-llm-hp - Lightweight LLM inference framework
A simplified inference framework that runs on CPU and supports serving over a REST API.
Quick start
from hllm import HLLM
# Initialize the model
model = HLLM(model_path="microsoft/Phi-3-mini-4k-instruct", device="cpu")
# Generate text
result = model.generate("Write a short story about a robot.")
print(result)
REST API service (OpenAI-compatible)
Install the API dependencies
pip install "light-llm-hp[api]"
Start the server
python -m hllm.server --model ./TinyLlama-1.1B-Chat-v1.0 --port 8000
Use the official OpenAI client
import httpx
from openai import OpenAI
# Ignore proxy settings from the environment to avoid 502 errors
http_client = httpx.Client(trust_env=False)
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
    http_client=http_client,
)
# Chat
response = client.chat.completions.create(
    model="hllm-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Full example: examples/test_openai_client.py
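OpenAI-compatible endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v1/models | GET | List models |
| /v1/chat/completions | POST | Chat completion (supports streaming) |
| /v1/completions | POST | Text completion (supports streaming) |
See docs/api.md for detailed API documentation.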
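The endpoints can also be reached without the OpenAI client. The following is a minimal sketch using only the Python standard library, assuming the server runs on port 8000 as in the launch command and accepts the usual OpenAI request fields. The `max_tokens` field and the helper names `build_models_request`, `build_completion_request`, and `send` are illustrative assumptions; docs/api.md is the authority on the parameters actually supported.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumes the --port 8000 launch command


def build_models_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """GET /v1/models: list the models the server exposes."""
    return urllib.request.Request(f"{base_url}/models", method="GET")


def build_completion_request(
    prompt: str, max_tokens: int = 64, base_url: str = BASE_URL
) -> urllib.request.Request:
    """POST /v1/completions with an OpenAI-style JSON body."""
    payload = {
        "model": "hllm-model",     # model name used in the client example
        "prompt": prompt,
        "max_tokens": max_tokens,  # assumption: supported as in the OpenAI API
    }
    return urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send a request and decode the JSON response.

    An empty ProxyHandler bypasses system proxies, which mirrors the
    trust_env=False setting in the httpx example.
    """
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_completion_request("Write a short story about a robot.", max_tokens=32)
print(req.get_method(), req.full_url)
```

With the server running, `send(build_models_request())` would return the model list and `send(req)` the completion. In the OpenAI response format the generated text sits under `choices[0]["text"]`, which is assumed rather than verified for this server.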
Directory structure
hllm/
├── hllm/ # Core modules
│ ├── __init__.py
│ ├── model.py # Model loading and inference
│ ├── tokenizer.py # Tokenizer wrapper
│ ├── generate.py # Generation logic
│ ├── server.py # REST API server
│ └── client.py # REST API client
├── tests/ # Tests
├── examples/ # Examples
└── docs/ # Documentation