XSlim is an offline quantization tool based on PPQ.
XSlim
XSlim is a Post-Training Quantization (PTQ) tool developed by SpacemiT. It integrates chip-optimized quantization strategies and provides a unified interface for ONNX model quantization via JSON configuration files.
Features
- INT8 / FP16 / Dynamic Quantization – multiple precision levels for different deployment scenarios
- JSON-driven configuration – simple, declarative quantization setup
- Python API & CLI – use as a library or from the command line
- Custom preprocessing – plug in your own preprocessing functions
- ONNX-based workflow – built on the ONNX ecosystem
Installation
```bash
pip install xslim
```
Or install from source:
```bash
git clone https://github.com/spacemit-com/xslim.git
cd xslim
pip install -r requirements.txt
```
Quick Start
Python API
```python
import xslim

# Using a JSON config file
xslim.quantize_onnx_model("config.json")

# Using a dict
config = {
    "model_parameters": {
        "onnx_model": "model.onnx",
        "working_dir": "./output"
    },
    "calibration_parameters": {
        "input_parametres": [{
            "mean_value": [123.675, 116.28, 103.53],
            "std_value": [58.395, 57.12, 57.375],
            "color_format": "rgb",
            "preprocess_file": "PT_IMAGENET",
            "data_list_path": "./calib_img_list.txt"
        }]
    }
}
xslim.quantize_onnx_model(config)

# You can also pass the model path and output path directly
xslim.quantize_onnx_model("config.json", "input.onnx", "output.onnx")
```
Command Line
```bash
# INT8 quantization with a JSON config
python -m xslim --config config.json

# Specify input and output model paths
python -m xslim -c config.json -i input.onnx -o output.onnx

# Dynamic quantization (no config file needed)
python -m xslim -i input.onnx -o output.onnx --dynq

# FP16 conversion (no config file needed)
python -m xslim -i input.onnx -o output.onnx --fp16

# ONNX simplification only (no config file needed)
python -m xslim -i input.onnx -o output.onnx
```
Configuration Reference
Quantization is configured through a JSON file with three main sections: model_parameters, calibration_parameters, and quantization_parameters. All fields below are optional unless noted otherwise.
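For orientation, a minimal INT8 configuration combining the three sections might look like the sketch below. All values are illustrative; see the tables that follow for defaults and options.

```json
{
  "model_parameters": {
    "onnx_model": "model.onnx",
    "working_dir": "./output"
  },
  "calibration_parameters": {
    "calibration_step": 200,
    "calibration_type": "percentile",
    "input_parametres": [{
      "mean_value": [123.675, 116.28, 103.53],
      "std_value": [58.395, 57.12, 57.375],
      "color_format": "rgb",
      "data_list_path": "./calib_img_list.txt"
    }]
  },
  "quantization_parameters": {
    "precision_level": 0,
    "finetune_level": 1
  }
}
```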
model_parameters
| Field | Default | Description |
|---|---|---|
| `onnx_model` | — | Path to the input ONNX model |
| `output_prefix` | Model filename (output ends with `.q.onnx`) | Output file prefix |
| `working_dir` | Directory of `onnx_model` | Output and working directory |
| `skip_onnxsim` | `false` | Skip ONNX simplification |
calibration_parameters
| Field | Default | Options | Description |
|---|---|---|---|
| `calibration_step` | `100` | — | Max number of calibration samples (recommended 100–1000) |
| `calibration_device` | `cuda` | `cuda`, `cpu` | Auto-detected; falls back to `cpu` |
| `calibration_type` | `default` | `default`, `kl`, `minmax`, `percentile`, `mse` | Calibration observer algorithm |
| `input_parametres` | — | — | List of per-input settings (see below) |
input_parametres (per input)
| Field | Default | Options | Description |
|---|---|---|---|
| `input_name` | Read from ONNX model | — | Input tensor name |
| `input_shape` | Read from ONNX model | — | Input shape (symbolic batch dim defaults to 1) |
| `dtype` | Read from ONNX model | `float32`, `int8`, `uint8`, `int16` | Data type |
| `file_type` | `img` | `img`, `npy`, `raw` | Calibration file type |
| `color_format` | `bgr` | `rgb`, `bgr` | Image color format |
| `mean_value` | `None` | — | Per-channel mean for normalization |
| `std_value` | `None` | — | Per-channel std for normalization |
| `preprocess_file` | `None` | `PT_IMAGENET`, `IMAGENET`, or custom path | Preprocessing function (see below) |
| `data_list_path` | required | — | Path to the calibration file list |
quantization_parameters
| Field | Default | Options | Description |
|---|---|---|---|
| `precision_level` | `0` | `0`, `1`, `2`, `3`, `4` | See precision levels below |
| `finetune_level` | `1` | `0`, `1`, `2`, `3` | See fine-tune levels below |
| `analysis_enable` | `true` | — | Enable post-quantization analysis |
| `max_percentile` | `0.9999` | ≥ 0.99 | Percentile clipping range |
| `custom_setting` | `None` | — | Per-subgraph overrides (list) |
| `truncate_var_names` | `[]` | — | Tensor names to split the graph |
| `ignore_op_types` | `[]` | — | Op types to skip during quantization |
| `ignore_op_names` | `[]` | — | Op names to skip during quantization |
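The skip lists can be combined in a config fragment; a sketch is shown below (the op type and op name values are hypothetical examples, not taken from this project):

```json
{
  "quantization_parameters": {
    "precision_level": 0,
    "ignore_op_types": ["Softmax", "LayerNormalization"],
    "ignore_op_names": ["/head/Conv"]
  }
}
```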
Precision Levels
| Level | Description |
|---|---|
| 0 | Full INT8 quantization (default) |
| 1 | Partial INT8, suitable for general Transformer models |
| 2 | Partial INT8 with highest precision |
| 3 | Dynamic quantization |
| 4 | FP16 conversion |
Fine-tune Levels
| Level | Description |
|---|---|
| 0 | No calibration parameter tuning |
| 1 | May apply static calibration parameter tuning |
| 2+ | Block-wise calibration parameter tuning based on quantization loss |
Calibration Data List
The data_list_path file should list one calibration file per line. Paths can be absolute or relative to the directory containing the list file. For multi-input models, ensure the file order is consistent across inputs.
```
data/calib/image_001.JPEG
data/calib/image_002.JPEG
data/calib/image_003.JPEG
```
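Such a list can be generated with a short script. A minimal sketch follows; the `write_calib_list` helper is not part of XSlim, it simply produces the one-path-per-line format that `data_list_path` expects:

```python
import os


def write_calib_list(calib_dir: str, list_path: str, ext: str = ".JPEG") -> int:
    """Collect calibration files under calib_dir, write one path per line
    (sorted for a deterministic order), and return the number of files listed."""
    paths = sorted(
        os.path.join(calib_dir, name)
        for name in os.listdir(calib_dir)
        if name.endswith(ext)
    )
    with open(list_path, "w") as f:
        f.write("\n".join(paths) + "\n")
    return len(paths)
```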
Custom Preprocessing
Set preprocess_file to "path/to/script.py:function_name" to use a custom preprocessing function:
```python
from typing import Sequence

import cv2
import numpy as np
import torch


def preprocess_impl(path_list: Sequence[str], input_parametr: dict) -> torch.Tensor:
    """
    Read files from path_list, preprocess using input_parametr, and return a torch.Tensor.

    Args:
        path_list: List of file paths for one calibration batch.
        input_parametr: Equivalent to calibration_parameters.input_parametres[idx].

    Returns:
        A batched torch.Tensor of calibration data.
    """
    batch_list = []
    mean_value = input_parametr["mean_value"]
    std_value = input_parametr["std_value"]
    input_shape = input_parametr["input_shape"]
    for file_path in path_list:
        img = cv2.imread(file_path)
        img = cv2.resize(img, (input_shape[-1], input_shape[-2]))
        img = img.astype(np.float32)
        img = (img - mean_value) / std_value
        img = np.transpose(img, (2, 0, 1))
        img = torch.unsqueeze(torch.from_numpy(img), 0)
        batch_list.append(img)
    return torch.cat(batch_list, dim=0)
```
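As a quick sanity check of the normalization step, a pixel equal to the per-channel mean should map to zero after `(img - mean) / std`, and the HWC-to-CHW transpose should move channels first. The sketch below verifies this with pure NumPy (no OpenCV required); the mean/std values are the ImageNet constants used earlier:

```python
import numpy as np

mean_value = np.array([123.675, 116.28, 103.53], dtype=np.float32)
std_value = np.array([58.395, 57.12, 57.375], dtype=np.float32)

# A 2x2 "image" whose every pixel equals the per-channel means (shape HWC = 2x2x3).
img = np.tile(mean_value, (2, 2, 1)).astype(np.float32)

normalized = (img - mean_value) / std_value
chw = np.transpose(normalized, (2, 0, 1))  # HWC -> CHW, as in the function above

assert chw.shape == (3, 2, 2)
assert np.allclose(chw, 0.0)
```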
Samples
See the samples directory for ready-to-run examples covering ResNet-18, MobileNet V3, BERT, and more.
Changelog
For a full list of changes, see the Releases page.
| Version | Highlights |
|---|---|
| 2.0.8 | Latest development version |
| 2.0.7 | Fix FP16 conversion bug on complex models |
| 2.0.6 | Fix metadata props deletion; default CLI behavior changed to model simplification (use --dynq for dynamic quantization) |
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
License
This project is licensed under the Apache License 2.0.