
XSlim

中文版 | English


XSlim is a Post-Training Quantization (PTQ) tool developed by SpacemiT and built on PPQ. It integrates chip-optimized quantization strategies and provides a unified interface for ONNX model quantization via JSON configuration files.


Features

  • INT8 / FP16 / Dynamic Quantization – multiple precision levels for different deployment scenarios
  • JSON-driven configuration – simple, declarative quantization setup
  • Python API & CLI – use as a library or from the command line
  • Custom preprocessing – plug in your own preprocessing functions
  • ONNX-based workflow – built on the ONNX ecosystem

Installation

pip install xslim

Or install from source:

git clone https://github.com/spacemit-com/xslim.git
cd xslim
pip install -r requirements.txt
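
Note that installing the requirements does not install the package itself; to make xslim importable outside the checkout, a standard local install should work:

pip install .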

Quick Start

Python API

import xslim

# Using a JSON config file
xslim.quantize_onnx_model("config.json")

# Using a dict
config = {
    "model_parameters": {
        "onnx_model": "model.onnx",
        "working_dir": "./output"
    },
    "calibration_parameters": {
        "input_parametres": [{
            "mean_value": [123.675, 116.28, 103.53],
            "std_value": [58.395, 57.12, 57.375],
            "color_format": "rgb",
            "preprocess_file": "PT_IMAGENET",
            "data_list_path": "./calib_img_list.txt"
        }]
    }
}
xslim.quantize_onnx_model(config)

# You can also pass the model path and output path directly
xslim.quantize_onnx_model("config.json", "input.onnx", "output.onnx")

Command Line

# INT8 quantization with a JSON config
python -m xslim --config config.json

# Specify input and output model paths
python -m xslim -c config.json -i input.onnx -o output.onnx

# Dynamic quantization (no config file needed)
python -m xslim -i input.onnx -o output.onnx --dynq

# FP16 conversion (no config file needed)
python -m xslim -i input.onnx -o output.onnx --fp16

# ONNX simplification only (no config file needed)
python -m xslim -i input.onnx -o output.onnx

Configuration Reference

Quantization is configured through a JSON file with three main sections: model_parameters, calibration_parameters, and quantization_parameters. All fields below are optional unless noted otherwise.
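
For orientation, here is a minimal config.json that exercises all three sections. The paths and values are illustrative (taken from the Quick Start example and the documented defaults), not recommendations:

{
    "model_parameters": {
        "onnx_model": "model.onnx",
        "working_dir": "./output"
    },
    "calibration_parameters": {
        "calibration_step": 200,
        "input_parametres": [{
            "mean_value": [123.675, 116.28, 103.53],
            "std_value": [58.395, 57.12, 57.375],
            "color_format": "rgb",
            "preprocess_file": "PT_IMAGENET",
            "data_list_path": "./calib_img_list.txt"
        }]
    },
    "quantization_parameters": {
        "precision_level": 0,
        "finetune_level": 1
    }
}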

model_parameters

| Field | Default | Description |
|---|---|---|
| onnx_model | — (required) | Path to the input ONNX model |
| output_prefix | Model filename (output ends with .q.onnx) | Output file prefix |
| working_dir | Directory of onnx_model | Output and working directory |
| skip_onnxsim | false | Skip ONNX simplification |

calibration_parameters

| Field | Default | Options | Description |
|---|---|---|---|
| calibration_step | 100 | | Max number of calibration samples (recommended 100–1000) |
| calibration_device | cuda | cuda, cpu | Auto-detected; falls back to cpu |
| calibration_type | default | default, kl, minmax, percentile, mse | Calibration observer algorithm |
| input_parametres | | | List of per-input settings (see below) |

input_parametres (per input)

| Field | Default | Options | Description |
|---|---|---|---|
| input_name | Read from ONNX model | | Input tensor name |
| input_shape | Read from ONNX model | | Input shape (symbolic batch dim defaults to 1) |
| dtype | Read from ONNX model | float32, int8, uint8, int16 | Data type |
| file_type | img | img, npy, raw | Calibration file type |
| color_format | bgr | rgb, bgr | Image color format |
| mean_value | None | | Per-channel mean for normalization |
| std_value | None | | Per-channel std for normalization |
| preprocess_file | None | PT_IMAGENET, IMAGENET, or custom path | Preprocessing function (see below) |
| data_list_path | — (required) | | Path to calibration file list |
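
As a sketch, a per-input block for a model fed with pre-saved NumPy arrays rather than images might look like the following; the tensor name, shape, and list path are hypothetical:

{
    "input_name": "input_ids",
    "input_shape": [1, 128],
    "dtype": "float32",
    "file_type": "npy",
    "data_list_path": "./calib_npy_list.txt"
}

Each line of calib_npy_list.txt would then point to one .npy file for this input.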

quantization_parameters

| Field | Default | Options | Description |
|---|---|---|---|
| precision_level | 0 | 0, 1, 2, 3, 4 | See precision levels below |
| finetune_level | 1 | 0, 1, 2, 3 | See fine-tune levels below |
| analysis_enable | true | | Enable post-quantization analysis |
| max_percentile | 0.9999 | | Percentile clipping range (typical values 0.99–0.9999) |
| custom_setting | None | | Per-subgraph overrides (list) |
| truncate_var_names | [] | | Tensor names used to split the graph |
| ignore_op_types | [] | | Op types to skip during quantization |
| ignore_op_names | [] | | Op names to skip during quantization |
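
For example, to quantize a Transformer model while keeping numerically sensitive ops in floating point, the skip lists can be combined with a higher precision level. The op name below is hypothetical:

{
    "quantization_parameters": {
        "precision_level": 1,
        "ignore_op_types": ["Softmax"],
        "ignore_op_names": ["/head/MatMul"]
    }
}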

Precision Levels

| Level | Description |
|---|---|
| 0 | Full INT8 quantization (default) |
| 1 | Partial INT8, suitable for general Transformer models |
| 2 | Partial INT8 with the highest precision |
| 3 | Dynamic quantization |
| 4 | FP16 conversion |

Fine-tune Levels

| Level | Description |
|---|---|
| 0 | No calibration parameter tuning |
| 1 | May apply static calibration parameter tuning |
| 2+ | Block-wise calibration parameter tuning based on quantization loss |

Calibration Data List

The data_list_path file should list one calibration file per line. Paths can be absolute or relative to the directory containing the list file. For multi-input models, ensure the file order is consistent across inputs.

data/calib/image_001.JPEG
data/calib/image_002.JPEG
data/calib/image_003.JPEG
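
Because the list is plain text, it is easy to generate. A minimal sketch (not part of XSlim) that assumes the calibration images live under data/calib:

from pathlib import Path

# Collect calibration images and write one path per line.
paths = sorted(Path("data/calib").glob("*.JPEG"))
with open("calib_img_list.txt", "w") as f:
    f.writelines(f"{p}\n" for p in paths)

Using sorted() keeps the order deterministic, which matters for multi-input models where line N of every list file must describe the same sample.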

Custom Preprocessing

Set preprocess_file to "path/to/script.py:function_name" to use a custom preprocessing function:

from typing import Sequence
import torch
import cv2
import numpy as np

def preprocess_impl(path_list: Sequence[str], input_parametr: dict) -> torch.Tensor:
    """
    Read files from path_list, preprocess using input_parametr, and return a torch.Tensor.

    Args:
        path_list: List of file paths for one calibration batch.
        input_parametr: Equivalent to calibration_parameters.input_parametres[idx].

    Returns:
        A batched torch.Tensor of calibration data.
    """
    batch_list = []
    mean_value = input_parametr["mean_value"]
    std_value = input_parametr["std_value"]
    input_shape = input_parametr["input_shape"]
    for file_path in path_list:
        img = cv2.imread(file_path)
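        # Note: cv2.imread returns BGR; if your model expects RGB
        # (color_format "rgb"), convert with cv2.cvtColor(img, cv2.COLOR_BGR2RGB).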
        img = cv2.resize(img, (input_shape[-1], input_shape[-2]))
        img = img.astype(np.float32)
        img = (img - mean_value) / std_value
        img = np.transpose(img, (2, 0, 1))
        img = torch.unsqueeze(torch.from_numpy(img), 0)
        batch_list.append(img)
    return torch.cat(batch_list, dim=0)
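
The function is then referenced from the per-input configuration. Assuming the script above is saved as my_preprocess.py:

"preprocess_file": "./my_preprocess.py:preprocess_impl"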

Samples

See the samples directory for ready-to-run examples covering ResNet-18, MobileNet V3, BERT, and more.

Changelog

For a full list of changes, see the Releases page.

| Version | Highlights |
|---|---|
| 2.0.8 | Latest development version |
| 2.0.7 | Fix FP16 conversion bug on complex models |
| 2.0.6 | Fix metadata props deletion; default CLI behavior changed to model simplification (use --dynq for dynamic quantization) |

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

License

This project is licensed under the Apache License 2.0.
