CLI tool for downloading and quantizing LLMs
quantkit
A tool for downloading and converting HuggingFace models without drama.
Install
pip3 install llm-quantkit
Usage
Usage: quantkit [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  download    Download model from huggingface.
  safetensor  Download and/or convert a pytorch model to safetensor format.
  awq         Download and/or convert a model to AWQ format.
  exl2        Download and/or convert a model to EXL2 format.
  gptq        Download and/or convert a model to GPTQ format.
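List the options for a specific subcommand (the help output above is Click-style, so per-command help should be available):
quantkit gptq --help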
Download a model from HF without using the HF cache:
quantkit download teknium/Hermes-Trismegistus-Mistral-7B --no-cache
Download only the safetensors version of a model (useful for repos that ship both PyTorch and safetensors weights):
quantkit download mistralai/Mistral-7B-v0.1 --no-cache --safetensors-only -out mistral7b
Download and convert a model to safetensors format, deleting the original PyTorch bins:
quantkit safetensor migtissera/Tess-10.7B-v1.5b --delete-original
Download and convert a model to AWQ:
quantkit awq mistralai/Mistral-7B-v0.1 -out Mistral-7B-v0.1-AWQ
Convert a model to GPTQ (4 bits / group-size 32):
quantkit gptq mistral7b -out Mistral-7B-v0.1-GPTQ -b 4 --group-size 32
Convert a model to EXL2 (exllamav2) format:
quantkit exl2 mistralai/Mistral-7B-v0.1 -out Mistral-7B-v0.1-exl2-b8-h8 -b 8 -hb 8
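The converted checkpoints are ordinary Hugging Face model directories, so they should be loadable outside of quantkit. A minimal sketch, assuming the transformers and accelerate libraries (plus autoawq for AWQ checkpoints) are installed and that the -out directory from the AWQ example above is used; this is not part of quantkit itself:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Directory produced by the `quantkit awq` example above (assumes you run from the same folder).
model_dir = "Mistral-7B-v0.1-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto")

inputs = tokenizer("Quantized models load like any other:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))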
Still in beta.