SDNQ: SD.Next Quantization Engine
For more info, please check out SD.Next SDNQ wiki page: https://github.com/vladmandic/sdnext/wiki/SDNQ-Quantization
Install command:
pip install sdnq
Example code to load pre-quantized models:
Pre-quantized models can be found here: https://huggingface.co/collections/Disty0/sdnq
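from diffusers import AutoModel # or: from transformers import AutoModel
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers
# model_path: local path or Hugging Face repo id of a pre-quantized model
model = AutoModel.from_pretrained(model_path)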
Example code for enabling or disabling quantized matmul with a pre-quantized model:
from sdnq.loader import apply_sdnq_options_to_model
quantized_model = apply_sdnq_options_to_model(quantized_model, use_quantized_matmul=True)
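For intuition about what `use_quantized_matmul` switches on: a quantized matmul generally multiplies the integer tensors directly and rescales the result, instead of dequantizing the weights first. The pure-Python sketch below illustrates that general idea with per-row int8 scales. The helper names (`quantize_rows_int8`, `int8_matmul`) are hypothetical, and this is an assumption about the technique in general, not SDNQ's actual kernel.

```python
def quantize_rows_int8(matrix):
    """Symmetric per-row int8 quantization: returns (integer rows, per-row scales)."""
    q_rows, scales = [], []
    for row in matrix:
        max_abs = max(abs(x) for x in row)
        scale = max_abs / 127 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(x / scale))) for x in row])
        scales.append(scale)
    return q_rows, scales


def int8_matmul(a, b_t):
    """Approximate a @ b_t.T using integer accumulation.

    `a` has shape (m, k) and `b_t` has shape (n, k): the second operand is
    passed transposed so both operands are quantized along the shared k axis.
    """
    qa, sa = quantize_rows_int8(a)
    qb, sb = quantize_rows_int8(b_t)
    out = []
    for i, row_a in enumerate(qa):
        out_row = []
        for j, row_b in enumerate(qb):
            acc = sum(x * y for x, y in zip(row_a, row_b))  # exact integer dot product
            out_row.append(acc * sa[i] * sb[j])  # rescale the accumulator to float
        out.append(out_row)
    return out


activations = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
weights_t = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]  # two output features
approx = int8_matmul(activations, weights_t)
exact = [[sum(x * y for x, y in zip(r, w)) for w in weights_t] for r in activations]
print("approx:", approx)
print("exact: ", exact)
```

The integer path trades a small rounding error for arithmetic that hardware can run on int8 units, which is presumably why the README gates it on `triton_is_available`.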
Example quantization config code for Diffusers and Transformers libraries:
from diffusers import AutoModel # or: from transformers import AutoModel
from sdnq import SDNQConfig
from sdnq.common import use_torch_compile as triton_is_available
sdnq_config = SDNQConfig(
    weights_dtype="int8",
    group_size=0,
    svd_rank=32,
    svd_steps=8,
    dynamic_loss_threshold=1e-2,
    use_svd=False,
    quant_conv=False,
    use_quantized_matmul=triton_is_available,
    use_quantized_matmul_conv=False,
    use_dynamic_quantization=False,
    dequantize_fp32=False,
    non_blocking=False,
    add_skip_keys=True,
    quantization_device="cuda",
    return_device="cuda",
    modules_to_not_convert=["correction_coefs", "prediction_coefs", "lm_head", "embedding_projection"],
    modules_dtype_dict={"int8": ["lm_head"]},
)
quantized_model = AutoModel.from_pretrained(model_path, quantization_config=sdnq_config)
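To make `weights_dtype="int8"` and `group_size` more concrete, here is a minimal pure-Python sketch of symmetric int8 quantization with one scale per group of consecutive values. It takes an explicit positive group size (it does not model SDNQ's `0` = auto or `-1` = disabled settings). The function names are hypothetical and the code illustrates the general technique only, not SDNQ's implementation.

```python
def quantize_int8_groups(values, group_size):
    """Quantize a flat list of floats to int8, storing one scale per group."""
    quantized, scales = [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(x) for x in group)
        scale = max_abs / 127 if max_abs > 0 else 1.0
        scales.append(scale)
        quantized.extend(max(-127, min(127, round(x / scale))) for x in group)
    return quantized, scales


def dequantize_int8_groups(quantized, scales, group_size):
    """Map int8 codes back to floats using the scale of each code's group."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]


# Small values followed by large ones: a single shared scale hurts the small ones.
weights = [0.01, -0.02, 0.03, 0.015, 4.0, -3.5, 2.25, 1.0]
for group_size in (8, 4):
    q, s = quantize_int8_groups(weights, group_size)
    restored = dequantize_int8_groups(q, s, group_size)
    small_err = max(abs(a - b) for a, b in zip(weights[:4], restored[:4]))
    print(f"group_size={group_size}: scales={len(s)}, error on small values={small_err:.5f}")
```

Smaller groups reduce the error on values that share a group with outliers, at the cost of storing more scales.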
Example code for saving a quantized model:
from sdnq.loader import save_sdnq_model
# set is_pipeline to True if you want to save the entire diffusers pipeline instead of a single model.
save_sdnq_model(pipe_or_quantized_model, "path_to_save_the_quantized_model", is_pipeline=False)
Example code for quantized training:
Note: Safetensors serialization is not supported with SDNQ training. Either avoid Safetensors serialization or convert the quantized model to a standard SDNQ model before saving.
from sdnq.training import sdnq_training_post_load_quant
from sdnq.common import use_torch_compile as triton_is_available
quantized_model = sdnq_training_post_load_quant(
    model,
    weights_dtype="uint8",
    quantized_matmul_dtype="int8",
    group_size=32, # 0 means auto, -1 means disabled
    svd_rank=32,
    svd_steps=8,
    use_svd=False,
    use_grad_ckpt=True, # disable this if you are not using gradient checkpointing
    use_quantized_matmul=triton_is_available,
    use_static_quantization=True, # quantize the model weights
    use_stochastic_rounding=True,
    dequantize_fp32=True,
    non_blocking=False,
    add_skip_keys=True,
    quantization_device="cuda",
    return_device="cuda",
    modules_to_not_convert=["correction_coefs", "prediction_coefs", "lm_head", "embedding_projection"],
    modules_dtype_dict={"int8": ["lm_head"]},
)
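The `use_stochastic_rounding` flag matters most during training, where weight updates are often smaller than one quantization step and round-to-nearest would discard them every time. Stochastic rounding, as a general technique, rounds up with probability equal to the fractional part, so the result is correct on average. The sketch below demonstrates that property in plain Python; `stochastic_round` is a hypothetical helper, not an SDNQ function.

```python
import math
import random


def stochastic_round(x, rng):
    """Round x down or up at random, with P(up) equal to the fractional part of x."""
    lower = math.floor(x)
    return lower + (1 if rng.random() < (x - lower) else 0)


rng = random.Random(0)  # fixed seed so the run is repeatable
update = 0.3  # round-to-nearest always turns this into 0
samples = [stochastic_round(update, rng) for _ in range(10_000)]
mean = sum(samples) / len(samples)
print(f"nearest: {round(update)}, stochastic mean: {mean:.3f}")
```

Each individual sample is still an integer (0 or 1 here); only the average over many steps tracks the true value.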
Example code for converting a standard SDNQ model to a training SDNQ model:
from sdnq.training import convert_sdnq_model_to_training
from sdnq.common import use_torch_compile as triton_is_available
quantized_model = convert_sdnq_model_to_training(
    quantized_model,
    quantized_matmul_dtype="int8",
    use_grad_ckpt=True,
    use_quantized_matmul=triton_is_available,
    use_stochastic_rounding=True,
    dequantize_fp32=True,
)
Example code for converting a training SDNQ model to a standard SDNQ model:
from sdnq.training import convert_training_model_to_sdnq
quantized_model = convert_training_model_to_sdnq(quantized_model)
Example code for quantized optimizer states:
from sdnq.optim import Adafactor, AdamW, CAME, Lion, Muon
optimizer = AdamW(
    parameters,
    use_stochastic_rounding=True,
    use_stochastic_buffers=True,
    use_quantized_buffers=True,
    use_svd_quantization=False,
    quantized_buffers_dtype="uint8",
    quantized_buffers_group_size=32,
    quantized_buffers_svd_rank=32,
)
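This README does not explain the SVD-related options (`use_svd`, `svd_rank`, `svd_steps`, `use_svd_quantization`, `quantized_buffers_svd_rank`). A common scheme with this shape keeps a low-rank component in full precision and quantizes only the residual, so large structured values do not dominate the quantization scale. The sketch below assumes that scheme, uses rank 1 for brevity, and uses power iteration with a `steps` argument as a stand-in for an iterative SVD. All names are hypothetical and none of this is taken from SDNQ's source.

```python
import math


def rank1_factor(matrix, steps=8):
    """Approximate the dominant rank-1 factor of `matrix` by power iteration.

    Returns (u, v) such that the outer product of u and v approximates `matrix`.
    """
    rows, cols = len(matrix), len(matrix[0])
    u = [0.0] * rows
    v = [1.0] * cols
    for _ in range(steps):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / norm for x in u]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
    return u, v


def quantize_int8(matrix):
    """Symmetric int8 quantization with a single scale for the whole matrix."""
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [[max(-127, min(127, round(x / scale))) for x in row] for row in matrix]
    return codes, scale


def svd_quantize(matrix, steps=8):
    """Keep a rank-1 factor in float and quantize only the residual."""
    u, v = rank1_factor(matrix, steps)
    residual = [[matrix[i][j] - u[i] * v[j] for j in range(len(v))] for i in range(len(u))]
    q_residual, scale = quantize_int8(residual)
    return u, v, q_residual, scale


def svd_dequantize(u, v, q_residual, scale):
    """Rebuild the matrix as low-rank part plus dequantized residual."""
    return [[u[i] * v[j] + q_residual[i][j] * scale for j in range(len(v))]
            for i in range(len(u))]


# A nearly rank-1 matrix whose first row dwarfs the others.
weights = [[8.0, 16.0, 24.0], [1.0, 2.1, 3.0], [0.5, 1.0, 1.4]]
restored = svd_dequantize(*svd_quantize(weights))
direct_q, direct_scale = quantize_int8(weights)
direct = [[q * direct_scale for q in row] for row in direct_q]
```

On this input the residual is much smaller than the original values, so its int8 scale is finer and `restored` ends up closer to `weights` than `direct` does. The price is the extra float storage for `u` and `v`.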
Example code for quantized optimizer states for custom optimizers:
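import torch
from sdnq.training import SDNQTensor
# inside a custom optimizer's step(): p is the parameter, state is its per-parameter state dict
state["exp_avg"] = SDNQTensor.from_float(torch.zeros_like(p), weights_dtype="uint8", group_size=32, use_stochastic_rounding=True)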
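As a rough picture of what a quantized optimizer buffer involves, the sketch below keeps a toy `exp_avg` buffer as uint8 codes between steps: each step dequantizes the buffer, applies the moving-average update, and quantizes it again. It uses asymmetric uint8 quantization over a single group and plain round-to-nearest. The helper names are hypothetical, and this is a sketch of the general pattern under those assumptions, not of `SDNQTensor` internals.

```python
def quantize_uint8(values):
    """Asymmetric uint8 quantization of one group: returns (codes, scale, minimum)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = [max(0, min(255, round((x - lo) / scale))) for x in values]
    return codes, scale, lo


def dequantize_uint8(codes, scale, lo):
    """Map uint8 codes back to floats."""
    return [c * scale + lo for c in codes]


def ema_step(state, grads, beta=0.9):
    """One exp_avg update on a quantized buffer: dequantize, update, requantize."""
    exp_avg = dequantize_uint8(*state)
    exp_avg = [beta * m + (1 - beta) * g for m, g in zip(exp_avg, grads)]
    return quantize_uint8(exp_avg)


grads = [0.5, -0.25, 0.125, 1.0]
state = quantize_uint8([0.0] * len(grads))  # zero-initialized buffer stored as uint8 codes
reference = [0.0] * len(grads)  # the same buffer kept in full precision
for _ in range(20):
    state = ema_step(state, grads)
    reference = [0.9 * m + 0.1 * g for m, g in zip(reference, grads)]
approx = dequantize_uint8(*state)
print("quantized buffer:", [round(x, 4) for x in approx])
print("float reference: ", [round(x, 4) for x in reference])
```

The requantization error is re-introduced at every step, which is one reason rounding behavior (such as the `use_stochastic_rounding=True` argument above) matters for buffers that are updated many times.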
File details
Details for the file sdnq-0.1.4.tar.gz.
File metadata
- Download URL: sdnq-0.1.4.tar.gz
- Upload date:
- Size: 63.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 01f42970d0322453f36bb5b59066a62704195d53850ad904a038cbe38b5a5ba5 |
| MD5 | 3f1ec913946c809a9922509569470179 |
| BLAKE2b-256 | 1da439bf066dd412dd04905ae20064443fe8ba4ac2c56645561c42bfcb18c1de |
Provenance
The following attestation bundles were made for sdnq-0.1.4.tar.gz:
Publisher: python-publish.yml on Disty0/sdnq
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sdnq-0.1.4.tar.gz
- Subject digest: 01f42970d0322453f36bb5b59066a62704195d53850ad904a038cbe38b5a5ba5
- Sigstore transparency entry: 834884431
- Sigstore integration time:
- Permalink: Disty0/sdnq@dca817906380230984a95a445e5e49c90dacb222
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/Disty0
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@dca817906380230984a95a445e5e49c90dacb222
- Trigger Event: release
File details
Details for the file sdnq-0.1.4-py3-none-any.whl.
File metadata
- Download URL: sdnq-0.1.4-py3-none-any.whl
- Upload date:
- Size: 96.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b6ea49a1e56877525fcbe27bf0146750bf87173de1482f58b49328348ee0f647 |
| MD5 | c3daab38a01e400426a570e5dd8eb88a |
| BLAKE2b-256 | 9cc8ae842f46986a62467e949a0731ea84fa6ba9bde91f4744e4bdedef2aa64b |
Provenance
The following attestation bundles were made for sdnq-0.1.4-py3-none-any.whl:
Publisher: python-publish.yml on Disty0/sdnq
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sdnq-0.1.4-py3-none-any.whl
- Subject digest: b6ea49a1e56877525fcbe27bf0146750bf87173de1482f58b49328348ee0f647
- Sigstore transparency entry: 834884434
- Sigstore integration time:
- Permalink: Disty0/sdnq@dca817906380230984a95a445e5e49c90dacb222
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/Disty0
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@dca817906380230984a95a445e5e49c90dacb222
- Trigger Event: release