A lightweight LLM inference tool focused on inference latency, framework usability, and extensibility.
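Introduction
osc-llm is a lightweight model inference framework focused on the latency and throughput of multimodal inference.
Features
- ✅ Low latency: torch.compile, CUDA Graphs
- ✅ High throughput: PagedAttention
- ✅ Multimodal inference: LLM, TTS, and more
- ✅ Model quantization: WeightOnlyInt8, WeightOnlyInt4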
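To make the weight-only quantization bullet concrete, here is a minimal pure-Python sketch of symmetric per-row int8 weight quantization. It is not taken from osc-llm: the function names (`quantize_row_int8`, `dequantize_row`, `int8_weight_only_matvec`) are hypothetical, and the framework's WeightOnlyInt8 mode may use a different scheme (for example per-channel or per-group scales).

```python
from typing import List, Tuple


def quantize_row_int8(row: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-row quantization: int8 values plus one float scale."""
    max_abs = max((abs(w) for w in row), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero row.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from int8 values and the row scale."""
    return [v * scale for v in q]


def int8_weight_only_matvec(
    qweight: List[List[int]], scales: List[float], x: List[float]
) -> List[float]:
    """Compute y = W @ x with W stored as int8 rows; activations stay in float."""
    return [
        scale * sum(qv * xv for qv, xv in zip(q_row, x))
        for q_row, scale in zip(qweight, scales)
    ]
```

Only the weights are compressed here; the input vector remains in floating point, which is what "weight-only" refers to. The rounding error per weight is bounded by half of that row's scale.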
Documentation:
Installation
- Install the latest version of PyTorch
- Install flash-attention
- Install osc-llm:
pip install osc-llm
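After the three install steps, a quick sanity check can confirm that each package is importable. The sketch below uses only the standard library; `check_installed` is a hypothetical helper, and the module names assume that PyTorch imports as `torch`, flash-attention as `flash_attn`, and osc-llm as `osc_llm` (the last matches the import used in the quick start).

```python
import importlib.util
from typing import Dict, Iterable


def check_installed(modules: Iterable[str]) -> Dict[str, bool]:
    """Report, for each top-level module name, whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Assumed import names for the three install steps above.
    for name, ok in check_installed(["torch", "flash_attn", "osc_llm"]).items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

This only checks that the modules can be located; it does not verify versions or that a CUDA device is available.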
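Quick start
from osc_llm import LLM
llm = LLM(model="checkpoints/Qwen/Qwen3-0.6B")
# Batch generation is supported (the prompt means "Tell me about yourself")
outputs = llm.generate(prompts=["介绍一下你自己"])
# Streaming generation is supported
for token in llm.stream(prompt="介绍一下你自己"):
    print(token)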
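When streaming, it is often useful to both display pieces as they arrive and keep the full text. The sketch below shows one way to do that. It assumes that the iterator returned by `llm.stream(...)` yields string pieces, which the quick start does not state explicitly; `fake_stream` is a hypothetical stand-in so the example runs without a model, and `collect_stream` is likewise an illustrative helper, not part of osc-llm.

```python
from typing import Iterable, Iterator


def fake_stream(text: str, chunk: int = 2) -> Iterator[str]:
    """Stand-in for a streaming generator: yields the text in small pieces."""
    for i in range(0, len(text), chunk):
        yield text[i:i + chunk]


def collect_stream(tokens: Iterable[str], echo: bool = False) -> str:
    """Consume a stream of string pieces and return the concatenated text."""
    pieces = []
    for tok in tokens:
        if echo:
            # Print without a newline so the output reads as continuous text.
            print(tok, end="", flush=True)
        pieces.append(tok)
    if echo:
        print()
    return "".join(pieces)
```

With a real model, and if the string assumption holds, the call would look like `collect_stream(llm.stream(prompt="..."), echo=True)`.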
Supported models
LLM models:
- Qwen2ForCausalLM: Qwen1.5, Qwen2, etc.
- Qwen3ForCausalLM: Qwen3, etc.
TTS models:
- SparkTTS: TODO
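The names in the LLM list match the `architectures` field that Hugging Face style checkpoints store in `config.json`, so a checkpoint can be checked against the list before loading. This is a sketch under that assumption: `supported_architecture` and `SUPPORTED_ARCHITECTURES` are hypothetical, and how osc-llm itself selects a model class is not described here.

```python
import json
from pathlib import Path
from typing import Optional, Union

# Architecture names copied from the supported LLM list above.
SUPPORTED_ARCHITECTURES = {"Qwen2ForCausalLM", "Qwen3ForCausalLM"}


def supported_architecture(checkpoint_dir: Union[str, Path]) -> Optional[str]:
    """Return the first listed architecture in config.json that is supported."""
    config_path = Path(checkpoint_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    for arch in config.get("architectures", []):
        if arch in SUPPORTED_ARCHITECTURES:
            return arch
    return None
```

For example, `supported_architecture("checkpoints/Qwen/Qwen3-0.6B")` would return a name only if that directory's `config.json` lists one of the architectures above, and `None` otherwise.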
Acknowledgements
This project draws on many open-source projects, in particular the following: