Convert safetensors weights to quantized formats (FP8, INT8) with learned rounding optimization
convert_to_quant
Convert safetensors weights to quantized formats (FP8, INT8, NVFP4, MXFP8) with learned rounding optimization for ComfyUI inference.
Installation
pip install convert_to_quant
Or install from source:
git clone https://github.com/silveroxides/convert_to_quant.git
cd convert_to_quant
pip install -e .
Requirements Summary
| Feature | Requirement |
|---|---|
| Minimum (FP8/INT8) | Python 3.10+, PyTorch 2.8+, CUDA 12.8+ |
| Full (NVFP4/MXFP8) | Python 3.12+, PyTorch 2.10+, CUDA 13.0+, comfy-kitchen |
| INT8 Kernels | Triton (Linux native, Windows via triton-windows) |
Important: PyTorch must be installed manually with the correct CUDA version for your GPU. This package does not install PyTorch automatically, to prevent environment conflicts.
Detailed Installation (GPU-Specific)
1. Install PyTorch
Visit pytorch.org to get the correct install command.
Examples:
# CUDA 13.0 (Required for Blackwell NVFP4/MXFP8)
pip install torch --index-url https://download.pytorch.org/whl/cu130
# CUDA 12.8 (Stable)
pip install torch --index-url https://download.pytorch.org/whl/cu128
# CPU only
pip install torch --index-url https://download.pytorch.org/whl/cpu
2. Optional: Triton (needed for blockwise INT8)
# Linux
pip install -U triton
# Windows (Example for torch>=2.9)
pip install -U "triton-windows<3.6"
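After installing, it can help to confirm that the environment matches the Requirements Summary table. The following is a minimal sketch, not part of convert_to_quant: the helper names (parse_version, missing_for, detect_environment) are made up for illustration, and the minimum versions are copied from the table above. It does not check for comfy-kitchen or Triton.

```python
import re
import sys

# Minimum versions copied from the Requirements Summary table.
REQUIREMENTS = {
    "minimum": {"python": (3, 10), "torch": (2, 8), "cuda": (12, 8)},  # FP8/INT8
    "full": {"python": (3, 12), "torch": (2, 10), "cuda": (13, 0)},    # NVFP4/MXFP8
}


def parse_version(text):
    """Return the leading (major, minor) of a version string such as '2.8.0+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def missing_for(tier, python, torch, cuda):
    """List the components that are absent or below the tier's minimum version."""
    found = {"python": python, "torch": torch, "cuda": cuda}
    return [
        name
        for name, minimum in REQUIREMENTS[tier].items()
        if found[name] is None or found[name] < minimum
    ]


def detect_environment():
    """Collect (python, torch, cuda) version tuples; torch/cuda are None if unavailable."""
    python = tuple(sys.version_info[:2])
    try:
        import torch
    except ImportError:
        return python, None, None
    # torch.version.cuda is None on CPU-only builds.
    cuda = parse_version(torch.version.cuda) if torch.version.cuda else None
    return python, parse_version(torch.__version__), cuda


if __name__ == "__main__":
    env = detect_environment()
    for tier in REQUIREMENTS:
        problems = missing_for(tier, *env)
        print(tier, "OK" if not problems else f"missing or too old: {problems}")
```

Running the script prints one line per tier, so a machine that satisfies only the FP8/INT8 row is easy to tell apart from one that is ready for NVFP4/MXFP8.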
Quick Start
# Basic FP8 quantization with ComfyUI metadata (recommended)
convert_to_quant -i model.safetensors --comfy_quant
# INT8 Block-wise with SVD optimization
convert_to_quant -i model.safetensors --int8 --block_size 128 --comfy_quant
# Blackwell NVFP4 (4-bit)
convert_to_quant -i model.safetensors --nvfp4 --comfy_quant
Load the output .safetensors file in ComfyUI like any other model.
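To sanity-check a converted file before loading it, you can read the safetensors header directly. The sketch below relies only on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header) and the Python standard library. The function names and the file name in the usage note are illustrative assumptions, not part of this package.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as handle:
        # The file begins with an unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", handle.read(8))
        return json.loads(handle.read(header_len).decode("utf-8"))


def dtype_counts(header):
    """Count tensors per dtype, skipping the optional '__metadata__' entry."""
    counts = {}
    for name, entry in header.items():
        if name == "__metadata__":
            continue
        counts[entry["dtype"]] = counts.get(entry["dtype"], 0) + 1
    return counts
```

For example, `dtype_counts(read_safetensors_header("model_quantized.safetensors"))` (a hypothetical output file name) returns a dict mapping each dtype string to the number of tensors stored in it, which shows at a glance whether the weights were actually written in a quantized dtype.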
Supported Quantization Formats
| Format | CLI Flag | Hardware | Optimization |
|---|---|---|---|
| FP8 (E4M3) | (default) | Ada/Hopper+ | Learned Rounding (SVD) |
| INT8 Block-wise | --int8 | Any GPU | Learned Rounding (SVD) |
| INT8 Tensor-wise | --int8 --scaling_mode tensor | Any GPU | High-perf _scaled_mm |
| NVFP4 (4-bit) | --nvfp4 | Blackwell | Dual-scale optimization |
| MXFP8 | --mxfp8 | Blackwell | Microscaling (E8M0) |
For a deep dive into how these formats work, see FORMATS.md.
Model-Specific Presets
| Model | Flag | Notes |
|---|---|---|
| Flux.2 | --flux2 | Keep modulation/guidance/time/final high-precision |
| T5-XXL | --t5xxl | Decoder removed |
| Hunyuan Video | --hunyuan | Attention norms excluded |
| WAN Video | --wan | Time embeddings excluded |
(See --help-filters for a full list of presets)
Documentation
- 📖 MANUAL.md - Complete usage guide with examples and troubleshooting
- 📚 FORMATS.md - Technical reference for quantization formats
- 🧪 DEVELOPMENT.md - Changelog and research notes
- 📋 AGENTS.md - Developer guide & registry architecture
Key Features
- Learned Rounding: SVD-based optimization minimizes quantization error.
- Bias Correction: Automatic bias adjustment using synthetic calibration data.
- Model-Specific Support: Exclusion lists for sensitive layers (norms, embeddings).
- Three-Tier Quantization: Mix different formats per layer using --custom-layers.
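To illustrate the idea behind bias correction, here is a minimal pure-Python sketch of the standard technique: for a linear layer y = Wx + b, quantizing W to W_q shifts the mean output by (W - W_q) · mean(x), and adding that shift to the bias cancels it. This is an assumption about the general approach, not the package's actual implementation. The function names, the toy weights, and the crude rounding used as a stand-in quantizer are all illustrative.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def corrected_bias(weights, quantized, bias, calibration):
    """Return b' = b + (W - W_q) @ mean(x), cancelling the mean output shift."""
    n_in = len(weights[0])
    mean_x = [
        sum(sample[j] for sample in calibration) / len(calibration)
        for j in range(n_in)
    ]
    delta = [
        [w - q for w, q in zip(w_row, q_row)]
        for w_row, q_row in zip(weights, quantized)
    ]
    return [b + shift for b, shift in zip(bias, matvec(delta, mean_x))]


random.seed(0)
weights = [[0.31, -0.12, 0.07], [0.05, 0.44, -0.27]]
# Crude stand-in for a quantizer: round each weight to one decimal place.
quantized = [[round(w * 10) / 10 for w in row] for row in weights]
bias = [0.0, 0.1]
# Synthetic calibration inputs (random Gaussian samples with non-zero mean).
calibration = [[random.gauss(1.0, 0.5) for _ in range(3)] for _ in range(256)]

new_bias = corrected_bias(weights, quantized, bias, calibration)
print(new_bias)
```

With the corrected bias, the quantized layer's average output over the calibration set matches the original layer's average output, even though individual outputs still differ.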
Advanced Usage
Layer Config JSON
Define per-layer settings with regex patterns:
convert_to_quant -i model.safetensors --layer-config layers.json --comfy_quant
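The real schema for layers.json is described in MANUAL.md. As a rough illustration of how regex-based per-layer settings can be resolved, the sketch below maps patterns to settings and returns the first match for a given layer name. Everything in it is hypothetical: the setting keys ("format", "block_size", "skip"), the first-match rule, and the use of re.search are assumptions, not the package's documented behavior.

```python
import re

# Hypothetical rules: regex pattern -> settings. Key names are illustrative only.
EXAMPLE_RULES = {
    r"\.attn\.(q|k|v)$": {"format": "int8", "block_size": 128},
    r"norm": {"skip": True},
}


def resolve_layer(name, rules, default=None):
    """Return the settings of the first pattern that matches the layer name."""
    for pattern, settings in rules.items():
        if re.search(pattern, name):
            return settings
    return default


for layer in ("blocks.0.attn.q", "blocks.0.norm1", "blocks.0.mlp.fc1"):
    print(layer, "->", resolve_layer(layer, EXAMPLE_RULES))
```

Because dicts preserve insertion order, earlier patterns take priority in this sketch. A layer that matches no pattern falls through to the default.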
Scaling Modes
# Block-wise scaling for better accuracy
convert_to_quant -i model.safetensors --scaling_mode block --block_size 64 --comfy_quant
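To see why block-wise scaling tends to be more accurate than a single per-tensor scale, consider the sketch below. It uses plain symmetric absmax INT8 quantization with round-to-nearest on a flat list of values. This is a simplified stand-in, not the tool's learned-rounding optimizer, and the 1-D block layout and function names are assumptions made for illustration.

```python
def quantize_int8(values, block_size=None):
    """Symmetric absmax INT8 quantization of a flat list.

    block_size=None uses one scale for the whole list (tensor-wise);
    otherwise each run of block_size values gets its own scale (block-wise).
    Returns (int8 codes, scales).
    """
    size = block_size or max(len(values), 1)
    codes, scales = [], []
    for start in range(0, len(values), size):
        block = values[start:start + size]
        # Fall back to a scale of 1.0 for all-zero blocks to avoid dividing by zero.
        scale = max(abs(v) for v in block) / 127.0 or 1.0
        scales.append(scale)
        codes.extend(max(-127, min(127, round(v / scale))) for v in block)
    return codes, scales


def dequantize_int8(codes, scales, block_size=None):
    """Map INT8 codes back to floats using the scale of each value's block."""
    size = block_size or max(len(codes), 1)
    return [code * scales[i // size] for i, code in enumerate(codes)]


def mean_abs_error(values, block_size=None):
    """Mean absolute round-trip error for the chosen scaling mode."""
    codes, scales = quantize_int8(values, block_size)
    restored = dequantize_int8(codes, scales, block_size)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


# Small weights next to large ones: a single scale wipes out the small values.
weights = [0.01, -0.02, 0.015, 0.005, 8.0, -6.5, 7.25, 3.0]
tensor_err = mean_abs_error(weights)
block_err = mean_abs_error(weights, block_size=4)
print(f"tensor-wise error: {tensor_err:.5f}, block-wise error: {block_err:.5f}")
```

In this toy example the tensor-wise scale is dominated by the largest weight, so the four small weights all round to zero. Giving each block of four its own scale preserves them, at the cost of storing one extra scale per block.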
Acknowledgements
Special thanks to:
- Clybius – For Learned-Rounding inspiration.
- lyogavin – For ComfyUI int8_blockwise support.
License
MIT License