Skip to main content

Generate a llama-quantize command to copy the quantization parameters of any GGUF

Project description

quant_clone

This is a simple little script to help you generate a llama-quantize (from llama.cpp) command which will allow you to quantize your own GGUF the same way your target GGUF has been quantized.

Installation

git clone https://github.com/electroglyph/quant_clone.git

cd quant_clone

pip install .

Usage

quant_clone input.gguf output.txt

input.gguf is the GGUF file whose quantization parameters you would like to copy

output.txt parameter is optional, if it's omitted the output will be saved to cmd.txt

Example

if I take one of unsloth's dynamic 2.0 quants and run:

quant_clone gemma-3-1b-it-UD-IQ1_S.gguf

I get this output:

llama-quantize --imatrix <imatrix_unsloth.dat> --tensor-type token_embd.weight=Q5_1 --tensor-type blk.0.attn_k.weight=IQ4_NL --tensor-type blk.0.attn_output.weight=IQ2_XXS --tensor-type blk.0.attn_q.weight=IQ4_NL --tensor-type blk.0.attn_v.weight=Q5_0 --tensor-type blk.0.ffn_down.weight=IQ3_S --tensor-type blk.0.ffn_gate.weight=IQ4_NL --tensor-type blk.0.ffn_up.weight=IQ4_NL --tensor-type blk.1.attn_k.weight=IQ4_NL --tensor-type blk.1.attn_output.weight=IQ2_XXS --tensor-type blk.1.attn_q.weight=IQ4_NL --tensor-type blk.1.attn_v.weight=Q5_0 --tensor-type blk.1.ffn_down.weight=Q2_K --tensor-type blk.1.ffn_gate.weight=IQ4_NL --tensor-type blk.1.ffn_up.weight=IQ4_NL --tensor-type blk.2.attn_k.weight=IQ4_NL --tensor-type blk.2.attn_output.weight=IQ2_XXS --tensor-type blk.2.attn_q.weight=IQ4_NL --tensor-type blk.2.attn_v.weight=Q5_0 --tensor-type blk.2.ffn_down.weight=IQ3_S --tensor-type blk.2.ffn_gate.weight=IQ4_NL --tensor-type blk.2.ffn_up.weight=IQ4_NL --tensor-type blk.3.attn_k.weight=IQ4_NL --tensor-type blk.3.attn_output.weight=IQ2_XXS --tensor-type blk.3.attn_q.weight=IQ4_NL --tensor-type blk.3.attn_v.weight=Q5_0 --tensor-type blk.3.ffn_down.weight=IQ3_S --tensor-type blk.3.ffn_gate.weight=IQ4_NL --tensor-type blk.3.ffn_up.weight=IQ4_NL --tensor-type blk.4.attn_k.weight=IQ4_NL --tensor-type blk.4.attn_output.weight=IQ2_XXS --tensor-type blk.4.attn_q.weight=IQ4_NL --tensor-type blk.4.attn_v.weight=Q5_0 --tensor-type blk.4.ffn_down.weight=IQ3_S --tensor-type blk.4.ffn_gate.weight=IQ4_NL --tensor-type blk.4.ffn_up.weight=IQ4_NL --tensor-type blk.5.attn_k.weight=IQ4_NL --tensor-type blk.5.attn_output.weight=IQ2_XXS --tensor-type blk.5.attn_q.weight=IQ4_NL --tensor-type blk.5.attn_v.weight=Q5_0 --tensor-type blk.5.ffn_down.weight=IQ1_S --tensor-type blk.5.ffn_gate.weight=IQ4_NL --tensor-type blk.5.ffn_up.weight=IQ4_NL --tensor-type blk.6.attn_k.weight=IQ4_NL --tensor-type blk.6.attn_output.weight=IQ2_XXS --tensor-type blk.6.attn_q.weight=IQ4_NL --tensor-type blk.6.attn_v.weight=Q5_0 --tensor-type blk.6.ffn_down.weight=IQ1_S --tensor-type blk.6.ffn_gate.weight=IQ4_NL --tensor-type blk.6.ffn_up.weight=IQ4_NL --tensor-type blk.7.attn_k.weight=IQ4_NL --tensor-type blk.7.attn_output.weight=IQ2_XXS --tensor-type blk.7.attn_q.weight=IQ4_NL --tensor-type blk.7.attn_v.weight=Q5_0 --tensor-type blk.7.ffn_down.weight=IQ1_S --tensor-type blk.7.ffn_gate.weight=IQ4_NL --tensor-type blk.7.ffn_up.weight=IQ4_NL --tensor-type blk.8.attn_k.weight=IQ4_NL --tensor-type blk.8.attn_output.weight=IQ2_XXS --tensor-type blk.8.attn_q.weight=IQ4_NL --tensor-type blk.8.attn_v.weight=Q5_0 --tensor-type blk.8.ffn_down.weight=IQ1_S --tensor-type blk.8.ffn_gate.weight=IQ4_NL --tensor-type blk.8.ffn_up.weight=IQ4_NL --tensor-type blk.9.attn_k.weight=IQ4_NL --tensor-type blk.9.attn_output.weight=IQ2_XXS --tensor-type blk.9.attn_q.weight=IQ4_NL --tensor-type blk.9.attn_v.weight=Q5_0 --tensor-type blk.9.ffn_down.weight=IQ1_S --tensor-type blk.9.ffn_gate.weight=IQ4_NL --tensor-type blk.9.ffn_up.weight=IQ4_NL --tensor-type blk.10.attn_k.weight=IQ4_NL --tensor-type blk.10.attn_output.weight=IQ2_XXS --tensor-type blk.10.attn_q.weight=IQ4_NL --tensor-type blk.10.attn_v.weight=Q5_0 --tensor-type blk.10.ffn_down.weight=IQ1_S --tensor-type blk.10.ffn_gate.weight=IQ4_NL --tensor-type blk.10.ffn_up.weight=IQ4_NL --tensor-type blk.11.attn_k.weight=IQ4_NL --tensor-type blk.11.attn_output.weight=IQ2_XXS --tensor-type blk.11.attn_q.weight=IQ4_NL --tensor-type blk.11.attn_v.weight=Q5_0 --tensor-type blk.11.ffn_down.weight=IQ2_S --tensor-type blk.11.ffn_gate.weight=IQ4_NL --tensor-type blk.11.ffn_up.weight=IQ4_NL --tensor-type blk.12.attn_k.weight=IQ4_NL --tensor-type blk.12.attn_output.weight=IQ2_XXS --tensor-type blk.12.attn_q.weight=IQ4_NL --tensor-type blk.12.attn_v.weight=Q5_0 --tensor-type blk.12.ffn_down.weight=IQ2_S --tensor-type blk.12.ffn_gate.weight=IQ4_NL --tensor-type blk.12.ffn_up.weight=IQ4_NL --tensor-type blk.13.attn_k.weight=IQ4_NL --tensor-type blk.13.attn_output.weight=IQ2_XXS --tensor-type blk.13.attn_q.weight=IQ4_NL --tensor-type blk.13.attn_v.weight=Q5_0 --tensor-type blk.13.ffn_down.weight=IQ2_S --tensor-type blk.13.ffn_gate.weight=IQ4_NL --tensor-type blk.13.ffn_up.weight=IQ4_NL --tensor-type blk.14.attn_k.weight=IQ4_NL --tensor-type blk.14.attn_output.weight=IQ2_XXS --tensor-type blk.14.attn_q.weight=IQ4_NL --tensor-type blk.14.attn_v.weight=Q5_0 --tensor-type blk.14.ffn_down.weight=IQ2_S --tensor-type blk.14.ffn_gate.weight=IQ4_NL --tensor-type blk.14.ffn_up.weight=IQ4_NL --tensor-type blk.15.attn_k.weight=IQ4_NL --tensor-type blk.15.attn_output.weight=IQ2_XXS --tensor-type blk.15.attn_q.weight=IQ4_NL --tensor-type blk.15.attn_v.weight=Q5_0 --tensor-type blk.15.ffn_down.weight=IQ2_S --tensor-type blk.15.ffn_gate.weight=IQ4_NL --tensor-type blk.15.ffn_up.weight=IQ4_NL --tensor-type blk.16.attn_k.weight=IQ4_NL --tensor-type blk.16.attn_output.weight=IQ2_XXS --tensor-type blk.16.attn_q.weight=IQ4_NL --tensor-type blk.16.attn_v.weight=Q5_0 --tensor-type blk.16.ffn_down.weight=IQ1_S --tensor-type blk.16.ffn_gate.weight=IQ4_NL --tensor-type blk.16.ffn_up.weight=IQ4_NL --tensor-type blk.17.attn_k.weight=IQ4_NL --tensor-type blk.17.attn_output.weight=IQ2_XXS --tensor-type blk.17.attn_q.weight=IQ4_NL --tensor-type blk.17.attn_v.weight=Q5_0 --tensor-type blk.17.ffn_down.weight=IQ1_S --tensor-type blk.17.ffn_gate.weight=IQ4_NL --tensor-type blk.17.ffn_up.weight=IQ4_NL --tensor-type blk.18.attn_k.weight=IQ4_NL --tensor-type blk.18.attn_output.weight=IQ2_XXS --tensor-type blk.18.attn_q.weight=IQ4_NL --tensor-type blk.18.attn_v.weight=Q5_0 --tensor-type blk.18.ffn_down.weight=IQ1_S --tensor-type blk.18.ffn_gate.weight=IQ4_NL --tensor-type blk.18.ffn_up.weight=IQ4_NL --tensor-type blk.19.attn_k.weight=IQ4_NL --tensor-type blk.19.attn_output.weight=IQ2_XXS --tensor-type blk.19.attn_q.weight=IQ4_NL --tensor-type blk.19.attn_v.weight=Q5_0 --tensor-type blk.19.ffn_down.weight=IQ1_S --tensor-type blk.19.ffn_gate.weight=IQ4_NL --tensor-type blk.19.ffn_up.weight=IQ4_NL --tensor-type blk.20.attn_k.weight=IQ4_NL --tensor-type blk.20.attn_output.weight=IQ2_XXS --tensor-type blk.20.attn_q.weight=IQ4_NL --tensor-type blk.20.attn_v.weight=Q5_0 --tensor-type blk.20.ffn_down.weight=IQ1_S --tensor-type blk.20.ffn_gate.weight=IQ4_NL --tensor-type blk.20.ffn_up.weight=IQ4_NL --tensor-type blk.21.attn_k.weight=IQ4_NL --tensor-type blk.21.attn_output.weight=IQ2_XXS --tensor-type blk.21.attn_q.weight=IQ4_NL --tensor-type blk.21.attn_v.weight=Q5_0 --tensor-type blk.21.ffn_down.weight=IQ1_S --tensor-type blk.21.ffn_gate.weight=IQ4_NL --tensor-type blk.21.ffn_up.weight=IQ4_NL --tensor-type blk.22.attn_k.weight=IQ4_NL --tensor-type blk.22.attn_output.weight=IQ2_XXS --tensor-type blk.22.attn_q.weight=IQ4_NL --tensor-type blk.22.attn_v.weight=Q5_0 --tensor-type blk.22.ffn_down.weight=IQ1_S --tensor-type blk.22.ffn_gate.weight=IQ4_NL --tensor-type blk.22.ffn_up.weight=IQ4_NL --tensor-type blk.23.attn_k.weight=IQ4_NL --tensor-type blk.23.attn_output.weight=IQ2_XXS --tensor-type blk.23.attn_q.weight=IQ4_NL --tensor-type blk.23.attn_v.weight=Q5_0 --tensor-type blk.23.ffn_down.weight=IQ1_S --tensor-type blk.23.ffn_gate.weight=IQ4_NL --tensor-type blk.23.ffn_up.weight=IQ4_NL --tensor-type blk.24.attn_k.weight=IQ4_NL --tensor-type blk.24.attn_output.weight=IQ2_XXS --tensor-type blk.24.attn_q.weight=IQ4_NL --tensor-type blk.24.attn_v.weight=Q5_0 --tensor-type blk.24.ffn_down.weight=IQ1_S --tensor-type blk.24.ffn_gate.weight=IQ4_NL --tensor-type blk.24.ffn_up.weight=IQ4_NL --tensor-type blk.25.attn_k.weight=IQ4_NL --tensor-type blk.25.attn_output.weight=IQ2_XXS --tensor-type blk.25.attn_q.weight=IQ4_NL --tensor-type blk.25.attn_v.weight=Q5_0 --tensor-type blk.25.ffn_down.weight=IQ3_S --tensor-type blk.25.ffn_gate.weight=IQ4_NL --tensor-type blk.25.ffn_up.weight=IQ4_NL <input.gguf> <output.gguf> Q8_0

That's the command to run to replicate the quantization. Make sure to edit imatrix path, input gguf path, and output gguf path.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

quant_clone-0.1.0.tar.gz (4.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

quant_clone-0.1.0-py3-none-any.whl (4.6 kB view details)

Uploaded Python 3

File details

Details for the file quant_clone-0.1.0.tar.gz.

File metadata

  • Download URL: quant_clone-0.1.0.tar.gz
  • Upload date:
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for quant_clone-0.1.0.tar.gz
Algorithm Hash digest
SHA256 4fec50d0610625538a62617ab3b0be3a9f3d4b8a5c9b009e2f0d4c25a3f907b6
MD5 310f2aa5a20bcf3505563c7e3b7cd73a
BLAKE2b-256 72bb14d2b9419661d401d8c1fde682162c039c1d66a45a6d186851b1f7d58198

See more details on using hashes here.

File details

Details for the file quant_clone-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: quant_clone-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 4.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for quant_clone-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a445861063f39331bd0300060ab73553473de66354c4f52bceefda3a7af66c1f
MD5 926c1c21de7de55cbfa6f42add2eb7f5
BLAKE2b-256 f29595e90aad3f71686dd0768e0285a8d2f6d4a2c85b25dff5d2564758ebf742

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page