A lightweight LLM inference tool focused on inference latency, ease of use, and extensibility.
Introduction
osc-llm is a lightweight model inference framework focused on the latency and throughput of multimodal inference.
Features
- ✅ Low latency: torch.compile, CUDA Graphs
- ✅ High throughput: PagedAttention
- ✅ Multimodal inference: LLM, TTS, and more
- ✅ Model quantization: WeightOnlyInt8, WeightOnlyInt4
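To make the weight-only quantization idea concrete, here is a minimal pure-Python sketch of symmetric per-row int8 quantization. It is not taken from osc-llm: the function names `quantize_row_int8` and `int8_weight_only_matvec` are hypothetical, and the per-row scale is an assumption chosen for illustration. The project's own WeightOnlyInt8 scheme may differ.

```python
from typing import List, Tuple


def quantize_row_int8(row: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantization of one weight row to integers in [-127, 127].

    Returns the integer row and the float scale needed to recover it.
    """
    max_abs = max((abs(w) for w in row), default=0.0)
    # An all-zero row gets scale 1.0 so that we never divide by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def int8_weight_only_matvec(
    q_weight: List[List[int]], scales: List[float], x: List[float]
) -> List[float]:
    """Compute y = W @ x with W stored as int8 rows plus one scale per row.

    Only the weights are quantized; the activations x stay in floating point,
    which is what "weight-only" refers to.
    """
    return [
        scale * sum(qv * xv for qv, xv in zip(q_row, x))
        for q_row, scale in zip(q_weight, scales)
    ]


# Illustrative weights, not real model parameters.
weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -2.0]]
quantized = [quantize_row_int8(row) for row in weights]
q_weight = [q for q, _ in quantized]
scales = [s for _, s in quantized]
y = int8_weight_only_matvec(q_weight, scales, [1.0, 2.0, 3.0])
```

Storing one scale per row keeps the rounding error of each output element proportional to that row's largest weight, at the cost of one extra float per row.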
Documentation:
Installation
- Install the latest version of PyTorch
- Install flash-attention
- Install osc-llm:
pip install osc-llm
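After the three installation steps, a quick check that the packages are importable can save a confusing failure later. The sketch below uses only the standard library. The module names `torch` and `flash_attn` are assumed to be the import names of PyTorch and flash-attention; `osc_llm` is the import name used in the quick start.

```python
from importlib.util import find_spec
from typing import Dict, Iterable

# Assumed import names for the three packages listed in the steps above.
REQUIRED_MODULES = ("torch", "flash_attn", "osc_llm")


def check_modules(names: Iterable[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Return {module_name: is_importable} without importing the modules."""
    return {name: find_spec(name) is not None for name in names}
```

Calling `check_modules()` returns a dictionary such as `{"torch": True, ...}`; any entry that is `False` points at an installation step that still needs to be completed.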
Quick start
from osc_llm import LLM
llm = LLM(model="checkpoints/Qwen/Qwen3-0.6B")
# Batch generation
outputs = llm.generate(prompts=["Introduce yourself"])
# Streaming generation
for token in llm.stream(prompt="Introduce yourself"):
    print(token)
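A streaming loop like the one above is often wrapped in a small helper that echoes text as it arrives and also returns the full response. The sketch below is hypothetical: `collect_stream` and `fake_stream` are illustrative names, and it assumes each streamed item is a string fragment, which the quick start does not confirm. `fake_stream` stands in for a model so the example runs without a checkpoint.

```python
from typing import Iterable, Iterator


def collect_stream(pieces: Iterable[str], echo: bool = False) -> str:
    """Consume an iterator of text fragments and return the joined text.

    With echo=True, each fragment is printed as soon as it arrives.
    """
    parts = []
    for piece in pieces:
        if echo:
            print(piece, end="", flush=True)
        parts.append(piece)
    if echo:
        print()  # finish the echoed line
    return "".join(parts)


def fake_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a token stream: yields one character at a time."""
    yield from text


full_text = collect_stream(fake_stream("hello"))
```

If `llm.stream` does yield strings, the same helper could be called as `collect_stream(llm.stream(prompt="..."), echo=True)`.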
Supported models
Supported LLM models:
- Qwen2ForCausalLM: Qwen1.5, Qwen2, etc.
- Qwen3ForCausalLM: Qwen3, etc.
Supported TTS models:
- SparkTTS: TODO
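The architecture names listed above match the `architectures` field that Hugging Face style checkpoints store in `config.json`. As a sketch under that assumption, a checkpoint directory can be checked against the supported list before loading. The helper names are hypothetical, and how osc-llm itself selects a model class is not described here.

```python
import json
from pathlib import Path
from typing import List

# LLM architectures listed as supported in this README.
SUPPORTED_ARCHITECTURES = {"Qwen2ForCausalLM", "Qwen3ForCausalLM"}


def read_architectures(checkpoint_dir: str) -> List[str]:
    """Read the "architectures" list from <checkpoint_dir>/config.json.

    Assumes a Hugging Face style checkpoint layout; returns [] if the field
    is absent.
    """
    config_path = Path(checkpoint_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return list(config.get("architectures") or [])


def is_supported(checkpoint_dir: str) -> bool:
    """True if any declared architecture is in the supported set."""
    return any(
        arch in SUPPORTED_ARCHITECTURES
        for arch in read_architectures(checkpoint_dir)
    )
```

For example, `is_supported("checkpoints/Qwen/Qwen3-0.6B")` would return `True` if that directory's `config.json` declares `Qwen3ForCausalLM`.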
Acknowledgements
This project draws on many open-source projects, especially the following: