A lightweight LLM inference tool focused on inference latency, with an emphasis on ease of use and extensibility.

OSC-LLM

Introduction

osc-llm is a lightweight model inference framework focused on the latency and throughput of multimodal inference.

Features

  • ✅ Low latency: torch.compile, CUDA graphs
  • ✅ High throughput: PagedAttention
  • ✅ Multimodal inference: LLM, TTS, and more
  • ✅ Model quantization: WeightOnlyInt8, WeightOnlyInt4
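
The weight-only quantization named in the list above generally means storing weights as low-bit integers with a floating-point scale while activations stay in floating point. The following is a minimal pure-Python sketch of that general idea, assuming symmetric scaling with one scale per output row. It is not osc-llm's implementation, and the names `quantize_rows_int8` and `int8_linear` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_rows_int8(weight: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Symmetric int8 quantization with one scale per output row: w ~= q * scale."""
    q_rows: List[List[int]] = []
    scales: List[float] = []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # Guard against an all-zero row, which would otherwise give a zero scale.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the symmetric int8 range.
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def int8_linear(x: List[float], q_rows: List[List[int]], scales: List[float]) -> List[float]:
    """Compute y = W x with W stored as int8 rows plus one float scale per row.

    Only the weights are quantized; the activations x remain floats.
    """
    return [s * sum(q * xi for q, xi in zip(row, x)) for row, s in zip(q_rows, scales)]


# Illustrative values only.
weight = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
q_rows, scales = quantize_rows_int8(weight)
print(int8_linear([1.0, 2.0, 3.0], q_rows, scales))
```

The result is close to, but not exactly, the full-precision product, because each weight is rounded to one of 255 levels. Per-row scales keep a single large row from reducing the precision of the others.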

Documentation:

Installation

Quick start

from osc_llm import LLM

llm = LLM(model="checkpoints/Qwen/Qwen3-0.6B")
# Batch generation is supported
outputs = llm.generate(prompts=["Introduce yourself"])
# Streaming generation is supported
for token in llm.stream(prompt="Introduce yourself"):
    print(token)
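
The streaming loop above prints each token on its own line. The sketch below shows one way to display the fragments as continuous text and keep the assembled reply. It assumes that `llm.stream` yields plain strings, which the snippet above suggests but does not confirm. `fake_stream` is a hypothetical stand-in so the example runs without a model, and `consume_stream` is likewise an illustrative helper, not part of osc-llm.

```python
import sys
from typing import Iterable, Iterator, TextIO


def fake_stream(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for llm.stream(prompt=...); yields text fragments."""
    for piece in ["I ", "am ", "a ", "language ", "model."]:
        yield piece


def consume_stream(tokens: Iterable[str], out: TextIO = sys.stdout) -> str:
    """Write each fragment as it arrives and return the assembled reply."""
    parts = []
    for token in tokens:
        out.write(token)
        out.flush()  # Show partial output immediately instead of waiting for a newline.
        parts.append(token)
    out.write("\n")
    return "".join(parts)


reply = consume_stream(fake_stream("Introduce yourself"))
```

With a real model, `fake_stream(...)` would be replaced by the `llm.stream(prompt=...)` call from the quick start, provided the string assumption holds.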

Supported models

LLM models:

  • Qwen2ForCausalLM: qwen1.5, qwen2, etc.
  • Qwen3ForCausalLM: qwen3, etc.
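
The names in the list above match the architecture class names that Hugging Face-style checkpoints record under the `architectures` key of `config.json`. As a rough sketch, assuming the checkpoint directory follows that layout, a checkpoint can be checked against the list before loading. How osc-llm itself selects a model class is not shown here, and `read_architectures` and `is_supported` are hypothetical helpers.

```python
import json
from pathlib import Path
from typing import List

# Architecture names taken from the supported LLM list above.
SUPPORTED_LLM_ARCHITECTURES = {"Qwen2ForCausalLM", "Qwen3ForCausalLM"}


def read_architectures(checkpoint_dir: str) -> List[str]:
    """Return the `architectures` list from a checkpoint's config.json."""
    config_path = Path(checkpoint_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # A config without the key is treated as declaring no architecture.
    return list(config.get("architectures", []))


def is_supported(checkpoint_dir: str) -> bool:
    """True if any architecture declared by the checkpoint is in the supported set."""
    return any(
        name in SUPPORTED_LLM_ARCHITECTURES
        for name in read_architectures(checkpoint_dir)
    )
```

For example, `is_supported("checkpoints/Qwen/Qwen3-0.6B")` would read that directory's `config.json` and compare its declared architecture with the set.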

TTS models:

  • SparkTTS: TODO

Acknowledgements

This project draws on many open-source projects, in particular the following:

Download files

Source Distribution

osc_llm-0.1.8.tar.gz (11.4 kB)

Uploaded Source

Built Distribution

osc_llm-0.1.8-py3-none-any.whl (14.8 kB)

Uploaded Python 3

File details

Details for the file osc_llm-0.1.8.tar.gz.

File metadata

  • Download URL: osc_llm-0.1.8.tar.gz
  • Upload date:
  • Size: 11.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for osc_llm-0.1.8.tar.gz
Algorithm Hash digest
SHA256 fcb6f29ad48cea09d4d3741707338e9de0e9954212a90a10630ebde8feca9e74
MD5 a5366d5acedcf33a06d39f59086c80be
BLAKE2b-256 d4c2de4eec23405c6f5c63e2ba1093531e2769db8c4f7c36b8723d9de0f826db

File details

Details for the file osc_llm-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: osc_llm-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 14.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for osc_llm-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 7fbe7b90d0e01cd7df7ff953edbe73dbdfb8e264e1b5c87719c90267bd3c7a68
MD5 7afdc121aef8e00ae365e4ca993a1f09
BLAKE2b-256 a4f77dd47dfbd7920fa21ebb46b68100029b66be29f5f64fe9f05244339ff17b
