Uni-Quant: CUDA-accelerated quantization/dequantization for TensorFlow models

These details have not been verified by PyPI

Project description

Uni-Quant

Small library to quantize/dequantize TensorFlow models using PyTorch CUDA kernels.

Requirements

Python: 3.13.13 (not tested on other versions)
CUDA Toolkit: >=12.8
Python Dependencies: All required packages are listed in requirements.txt

Installing Dependencies

pip install -r requirements.txt

Installation from pip

pip install uni-quant-cuda
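
After installing, it can help to confirm which release pip actually resolved and whether the interpreter matches the only tested Python series. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `python_matches_tested` are hypothetical and not part of Uni-Quant.

```python
import sys
from importlib import metadata


def installed_version(distribution="uni-quant-cuda"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def python_matches_tested(tested=(3, 13)):
    """True if the running interpreter is in the Python series the README tested."""
    return sys.version_info[:2] == tested


if __name__ == "__main__":
    print("uni-quant-cuda:", installed_version())
    print("tested Python series:", python_matches_tested())
```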

Usage

Importing Functions

For installed package (from pip):

from Uniquant import quantize, dequantize, dequantize_save

For a local clone of the repository:

from uniquant import quantize, dequantize, dequantize_save

Main Functions

`quantize(model_path, quant_directory="", quant_name="", pack_size=32, quant_size=4, overwrite=False)`

Quantizes a TensorFlow or XGBoost model.

Arguments:

model_path (str): Path to the model to quantize (with extension)
quant_directory (str): Directory path to save the quantized model
quant_name (str): Filename for the quantized model
pack_size (int): Number of weights in one quantization batch (must be divisible by 2)
quant_size (int): Number of bits per weight (available: 4 or 8)
overwrite (bool): Whether to overwrite an existing file
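
To make `pack_size` and `quant_size` more concrete, here is a pure-Python sketch of group-wise quantization: each pack of weights gets its own scale and offset, and two 4-bit codes share one byte (which would explain why an even pack size is required). This is an illustration under assumptions, not the package's CUDA kernel; the min/max scheme and every function name below are hypothetical.

```python
def quantize_pack(weights, quant_size=4):
    """Map one pack of floats to integer codes in [0, 2**quant_size - 1]."""
    levels = (1 << quant_size) - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant pack, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_pack(codes, scale, lo):
    """Invert quantize_pack up to rounding error."""
    return [lo + c * scale for c in codes]


def pack_nibbles(codes):
    """Store two 4-bit codes per byte; the code count must be even."""
    if len(codes) % 2:
        raise ValueError("need an even number of 4-bit codes")
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def unpack_nibbles(data):
    """Recover the 4-bit codes from packed bytes."""
    codes = []
    for byte in data:
        codes.extend((byte >> 4, byte & 0x0F))
    return codes
```

With `pack_size=32` and `quant_size=4`, one pack occupies 16 bytes of codes plus its scale and offset, and the reconstruction error per weight is at most half a quantization step.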

`dequantize(quant_path, literal=False, balanced=True)`

Dequantizes a model and returns it.

Arguments:

quant_path (str): Path to the .uniq file to dequantize
literal (bool): Whether weights should be unscaled
balanced (bool): Whether weights should be balanced around 0
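
The descriptions of `literal` and `balanced` are brief, so the sketch below shows only one plausible reading of them: `balanced` shifts the unsigned codes so that they are centered on zero, and `literal` returns the integer levels without applying the scale. This interpretation is an assumption, and `decode` is a hypothetical helper rather than anything in the package.

```python
def decode(codes, scale, quant_size=4, literal=False, balanced=True):
    """Hypothetical decoder illustrating one reading of the two flags."""
    # ASSUMPTION: "balanced" means subtracting the mid-point (8 for 4-bit codes).
    offset = 1 << (quant_size - 1) if balanced else 0
    centered = [c - offset for c in codes]
    if literal:
        # ASSUMPTION: "literal" means raw integer levels, with no scale applied.
        return centered
    return [value * scale for value in centered]
```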

`dequantize_save(quant_path, model_directory="", model_name="", overwrite=False)`

Dequantizes a model, saves it, and returns it.

Arguments:

quant_path (str): Path to the .uniq file to dequantize
model_directory (str): Directory path to save the dequantized model
model_name (str): Filename for the dequantized model
overwrite (bool): Whether to overwrite an existing file
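
As a rough end-to-end sketch, the snippet below chains the documented calls: quantize a model to 4 bits, then dequantize and save it. The import uses the pip form shown under Importing Functions. The `.uniq` output path is an assumption (the README only says that `dequantize` reads a `.uniq` file), and `round_trip`, `expected_quant_path`, and the directory and file names are hypothetical.

```python
import os


def expected_quant_path(quant_directory, quant_name):
    # ASSUMPTION: quantize() writes <quant_directory>/<quant_name>.uniq.
    # Check the real output location before relying on this.
    return os.path.join(quant_directory, quant_name + ".uniq")


def round_trip(model_path, work_dir="quantized", name="model_q4"):
    """Quantize model_path to 4 bits, then dequantize and save the result."""
    # Imported lazily so this file can be loaded on machines without CUDA.
    from Uniquant import quantize, dequantize_save

    quantize(model_path, quant_directory=work_dir, quant_name=name,
             pack_size=32, quant_size=4, overwrite=True)
    quant_path = expected_quant_path(work_dir, name)
    return dequantize_save(quant_path, model_directory=work_dir,
                           model_name=name + "_restored", overwrite=True)
```

Calling `round_trip` requires the package, a CUDA toolkit, and a GPU; `dequantize(quant_path)` can be used instead of `dequantize_save` when the restored model is only needed in memory.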

Notes

This package compiles its CUDA kernels at runtime using torch.utils.cpp_extension.load_inline.
Compiling and running the kernels requires a compatible CUDA toolkit on the target machine (tested with >=12.8).
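
Because the kernels are built on first use, a missing or old toolkit only shows up as a compile failure at runtime. A small pre-flight check like the one below can surface that earlier. It is a sketch that assumes `nvcc` is on `PATH`; the helper names are hypothetical and not part of Uni-Quant.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_toolkit_ok(minimum=(12, 8)):
    """Return True if nvcc is on PATH and reports at least `minimum`."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return False
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    release = parse_nvcc_release(result.stdout)
    return release is not None and release >= minimum
```

PyTorch can also locate the toolkit through the `CUDA_HOME` environment variable, so a `False` result here is a hint to investigate rather than proof that compilation will fail.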

Project details

License
- OSI Approved :: MIT License
Operating System
- POSIX :: Linux
Programming Language
- Python :: 3

Release history

- 0.2.8: Jun 10, 2026
- 0.2.7: Jun 10, 2026
- 0.2.6: Jun 10, 2026
- 0.2.5: Jun 10, 2026
- 0.2.4: Jun 10, 2026 (this version)
- 0.2.3: Jun 10, 2026
- 0.2.2: May 17, 2026
- 0.2.1: May 17, 2026
- 0.2.0: May 17, 2026

Download files

Download the file for your platform.

Source Distribution

uni_quant_cuda-0.2.4.tar.gz (7.6 kB)

Uploaded Jun 10, 2026 Source

Built Distribution

uni_quant_cuda-0.2.4-py3-none-any.whl (2.9 kB)

Uploaded Jun 10, 2026 Python 3

File details

Details for the file uni_quant_cuda-0.2.4.tar.gz.

File metadata

Download URL: uni_quant_cuda-0.2.4.tar.gz
Upload date: Jun 10, 2026
Size: 7.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for uni_quant_cuda-0.2.4.tar.gz
Algorithm	Hash digest
SHA256	`944af8b9e91759cabb709b86a0488628d09c4107c98dfdecff97d3d4579df128`
MD5	`93ea4e0b41d0241d7138056870b6ea40`
BLAKE2b-256	`e7b0619a74f01499c204d7c318df3c9901e221f2ee8d0b5500b6ebe98ca7ae2a`

File details

Details for the file uni_quant_cuda-0.2.4-py3-none-any.whl.

File metadata

Download URL: uni_quant_cuda-0.2.4-py3-none-any.whl
Upload date: Jun 10, 2026
Size: 2.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for uni_quant_cuda-0.2.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bbc3084c112009270b2bb5acbbb3b148f0969455a6d228d126a9c5688ec7119a`
MD5	`84e1cfd974583a455a0910ca90e18dee`
BLAKE2b-256	`401eb3095ff4fbf973d9b4ff2b0fe93009a9067e20073ae7384386481d11f146`
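
The published digests can be checked against a downloaded file before installing it. The sketch below uses `hashlib` from the standard library with the SHA256 values listed on this page; `sha256_of` and `matches_published` are hypothetical helper names.

```python
import hashlib
import os

# SHA256 digests copied from the File hashes tables for release 0.2.4.
EXPECTED_SHA256 = {
    "uni_quant_cuda-0.2.4.tar.gz":
        "944af8b9e91759cabb709b86a0488628d09c4107c98dfdecff97d3d4579df128",
    "uni_quant_cuda-0.2.4-py3-none-any.whl":
        "bbc3084c112009270b2bb5acbbb3b148f0969455a6d228d126a9c5688ec7119a",
}


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """True only if the file name is known and its digest matches the table."""
    expected = EXPECTED_SHA256.get(os.path.basename(path))
    return expected is not None and sha256_of(path) == expected
```

pip can enforce the same check at install time when a requirements file pins the package with `--hash=sha256:...` entries.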
