
llmexport: A toolkit to export LLMs to ONNX or MNN models.

Project description

llm-export


llm-export is a tool for exporting LLM models; it can export an LLM to ONNX and MNN formats.

  • 🚀 Optimized the original model code to support dynamic shapes
  • 🚀 Optimized the original model code to reduce the constant parts
  • 🚀 Uses OnnxSlim to optimize the ONNX model, improving performance by about 5%; by @inisis
  • 🚀 Supports exporting LoRA weights to ONNX and MNN
  • 🚀 ONNX inference code: OnnxLLM

Installation

# pip install
pip install llmexport

# git install
pip install git+https://github.com/wangzhaode/llm-export@master

# local install
git clone https://github.com/wangzhaode/llm-export && cd llm-export/
pip install .
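
To verify the installation, the CLI entry point can be invoked with -h; it should print the usage text reproduced in the Arguments section below.

# quick check that the install worked: print the built-in help
llmexport -h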

Usage

  1. Download the model
git clone https://huggingface.co/Qwen/Qwen2-1.5B-Instruct
# if downloading from Hugging Face is slow, use ModelScope instead
git clone https://modelscope.cn/qwen/Qwen2-1.5B-Instruct.git
  2. Test the model
# test with a text prompt
llmexport --path Qwen2-1.5B-Instruct --test "你好"
# test with an image-plus-text prompt
llmexport --path Qwen2-VL-2B-Instruct --test "<img>https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg</img>介绍一下图片里的内容"
  3. Export the model
# export Qwen2-1.5B-Instruct to an ONNX model
llmexport --path Qwen2-1.5B-Instruct --export onnx
# export Qwen2-1.5B-Instruct to an MNN model with 4-bit, block-wise (block size 128) quantization
llmexport --path Qwen2-1.5B-Instruct --export mnn --quant_bit 4 --quant_block 128
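
By default the exported files are written to ./model; a different output directory can be chosen with --dst_path (documented in the Arguments section below). A minimal sketch, with a placeholder directory name:

# export to a custom output directory instead of the default ./model (directory name is only an example)
llmexport --path Qwen2-1.5B-Instruct --export onnx --dst_path ./qwen2-onnx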

Features

  • Export the model to ONNX or MNN with `--export onnx` or `--export mnn`
  • Run a chat test against the model with `--test $query`; the LLM's reply is returned
  • By default onnx-slim is used to optimize the ONNX model; skip this step with `--skip_slim`
  • Merge LoRA weights before exporting by pointing `--lora_path` at the LoRA weight directory (a combined example follows this list)
  • Set the quantization bit width with `--quant_bit` and the quantization block size with `--quant_block`
  • Use `--lm_quant_bit` to set the quantization bit width of the lm_head layer weights; if not specified, the `--quant_bit` value is used
  • A locally built MNNConvert can be used via `--mnnconvert`
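
A sketch combining several of the options above: merging a LoRA adapter and quantizing the lm_head layer at a higher bit width than the rest of the model. The LoRA directory name here is only a placeholder.

# merge LoRA weights, export to MNN with 4-bit block-wise quantization, keep lm_head at 8 bits
# ./my-lora-adapter is a placeholder path to your LoRA weights
llmexport --path Qwen2-1.5B-Instruct --lora_path ./my-lora-adapter --export mnn --quant_bit 4 --quant_block 128 --lm_quant_bit 8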

Arguments

usage: llmexport.py [-h] --path PATH [--type TYPE] [--lora_path LORA_PATH] [--dst_path DST_PATH] [--test TEST] [--export EXPORT]
                    [--skip_slim] [--quant_bit QUANT_BIT] [--quant_block QUANT_BLOCK] [--lm_quant_bit LM_QUANT_BIT]
                    [--mnnconvert MNNCONVERT]

llm_exporter

options:
  -h, --help            show this help message and exit
  --path PATH           path(`str` or `os.PathLike`):
                        Can be either:
                        	- A string, the *model id* of a pretrained model like `THUDM/chatglm-6b`. [TODO]
                        	- A path to a *directory* clone from repo like `../chatglm-6b`.
  --type TYPE           type(`str`, *optional*):
                        	The pretrain llm model type.
  --lora_path LORA_PATH
                        lora path, default is `None`, meaning LoRA is not applied.
  --dst_path DST_PATH   export onnx/mnn model to path, default is `./model`.
  --test TEST           test model inference with query `TEST`.
  --export EXPORT       export model to an onnx/mnn model.
  --skip_slim           Whether or not to skip onnx-slim.
  --quant_bit QUANT_BIT
                        mnn quant bit, 4 or 8, default is 4.
  --quant_block QUANT_BLOCK
                        mnn quant block, default is 0, meaning channel-wise.
  --lm_quant_bit LM_QUANT_BIT
                        mnn lm_head quant bit, 4 or 8, default is `quant_bit`.
  --mnnconvert MNNCONVERT
                        local mnnconvert path, if invalid, using pymnn.

Supported Models

  • llama/llama2/llama3/llama3.2/tinyllama
  • qwen/qwen1.5/qwen2/qwen-vl/qwen2-vl/qwen2.5
  • baichuan2/phi-2/internlm/yi/deepseek
  • chatglm/codegeex/chatglm2/chatglm3
  • gemma-2

Download files


Source Distribution

llmexport-0.0.2.tar.gz (26.3 kB)

Uploaded Source

Built Distribution

llmexport-0.0.2-py3-none-any.whl (24.8 kB)

Uploaded Python 3

File details

Details for the file llmexport-0.0.2.tar.gz.

File metadata

  • Download URL: llmexport-0.0.2.tar.gz
  • Upload date:
  • Size: 26.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for llmexport-0.0.2.tar.gz
  • SHA256: b8713b667d8bb9a55e6a984bef027b11eb34ad5aecb87208ba3c783db6d858ee
  • MD5: a02f0e5da103c4ddb55ec63c221009ad
  • BLAKE2b-256: e0dbbd0220f208a68a1892c64312e9648764f15ca54dddee5a60955789af7782


File details

Details for the file llmexport-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: llmexport-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 24.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for llmexport-0.0.2-py3-none-any.whl
  • SHA256: e6fb8814d19b118077d2af5929d2a5536463c4576e82dcdcf93a859db37da607
  • MD5: 26d1d484e82e309882a898b78bc4f37f
  • BLAKE2b-256: 8d970e8847262f1a4e674e901a8f6616fcafb9d276c4d87fc39be04a5ab7eced

