Multithreaded SIMD int8 and int4 quantization kernels.

pi-quant: Prime Intellect Fast Quantization Library

Overview

Fast, multithreaded CPU quantization kernels with multiple rounding modes. In the benchmarks below, they outperform PyTorch’s built-in quantization routines by more than 2x on all tested hardware. The kernels are optimized with SIMD intrinsics for AMD64 (SSE4.2, AVX2, AVX512F) and ARM64 (Neon), and the fastest kernel supported by the host CPU is selected at runtime through CPU feature detection.

What is Quantization?

Quantization is the process of mapping continuous values into a finite, discrete set of values. In machine learning and signal processing, it is commonly used to reduce the precision of numerical data, lowering memory usage and improving computational efficiency while maintaining acceptable accuracy.
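
To make the mapping concrete, here is a minimal NumPy sketch of affine quantization from float32 to uint8 and back. It is an illustration only, not pi-quant's kernel or API: the helper names quantize_u8 and dequantize_u8, the chosen scale and zero point, and the use of round-half-to-even via np.rint are all assumptions made for this example.

```python
import numpy as np

def quantize_u8(x, scale, zero_point):
    """Map floats to uint8 codes: q = clamp(round(x / scale) + zero_point, 0, 255)."""
    q = np.rint(x / scale) + zero_point  # np.rint rounds halves to even (assumed mode)
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize_u8(q, scale, zero_point):
    """Approximate inverse: x ~= (q - zero_point) * scale."""
    return (q.astype(np.float32) - zero_point) * np.float32(scale)

# Illustrative values, not taken from the library
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
scale, zero_point = 0.01, 128

q = quantize_u8(x, scale, zero_point)
x_hat = dequantize_u8(q, scale, zero_point)

print(q)      # uint8 codes, 1 byte per element instead of 4
print(x_hat)  # reconstructed floats, close to x
```

Each value is stored in one byte instead of four, and for inputs inside the representable range the reconstruction error is at most half a quantization step (scale / 2). Values outside the range are clamped to 0 or 255.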

Features

✅ Parallel Quantization and Dequantization: Efficiently quantizes and dequantizes data using multiple threads.

✅ Rich Datatype Support: Provides f32, f64 ↔ (u)int8/16/32/64.

✅ Modern Python API: Use the library from Python with PyTorch, NumPy, or standalone.

✅ Architecture-Specific Optimizations: The kernels are optimized with SIMD intrinsics for different CPU architectures, including AMD64 (SSE4.2, AVX2, AVX512F) and ARM64 (Neon).

✅ Thread Pool: Reuses threads for minimal overhead.

✅ Flexible Rounding Modes: Supports both nearest and stochastic rounding modes.

✅ C99 API: Provides a C99 API for C projects or foreign language bindings (see quant.h).

✅ Store Operators: Supports multiple store modes (SET, ADD) during dequantization, which is useful for ring-reduction operations.

✅ Quantization Parameters: Efficient SIMD-parallel computation of quantization scale and zero point from input data.
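
Three of the features above (quantization parameters, rounding modes, and store operators) can be sketched in plain NumPy. This is a scalar reference under stated assumptions, not pi-quant's implementation or API: the function names, the min/max-based parameter formula, the floor(y + u) form of stochastic rounding, and the add flag standing in for the SET and ADD store modes are all hypothetical choices for illustration.

```python
import numpy as np

def compute_quant_params(x, qmin=0, qmax=255):
    """Derive scale and zero point so that [min(x), max(x)] maps onto [qmin, qmax]."""
    lo = min(float(x.min()), 0.0)  # include 0.0 so it is exactly representable
    hi = max(float(x.max()), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:               # constant all-zero input: avoid division by zero
        scale = 1.0
    zero_point = int(round(qmin - lo / scale))
    return scale, min(max(zero_point, qmin), qmax)

def quantize(x, scale, zero_point, rng=None):
    """Quantize to uint8 with nearest rounding, or stochastic rounding if rng is given."""
    y = x.astype(np.float64) / scale
    if rng is None:
        r = np.rint(y)
    else:
        # Round up with probability equal to the fractional part of y,
        # so the rounded value equals y in expectation.
        r = np.floor(y + rng.random(y.shape))
    return np.clip(r + zero_point, 0, 255).astype(np.uint8)

def dequantize_into(q, scale, zero_point, out, add=False):
    """Dequantize into `out`: overwrite (SET-like) or accumulate (ADD-like)."""
    vals = ((q.astype(np.float32) - zero_point) * scale).astype(np.float32)
    if add:
        out += vals
    else:
        out[...] = vals
    return out

# Parameters computed from the data itself
x = np.array([-1.0, 0.0, 3.0], dtype=np.float32)
scale, zero_point = compute_quant_params(x)
q = quantize(x, scale, zero_point)

# SET then ADD: the accumulator ends up holding twice the dequantized values
acc = np.zeros(3, dtype=np.float32)
dequantize_into(q, scale, zero_point, acc)            # SET-like
dequantize_into(q, scale, zero_point, acc, add=True)  # ADD-like

# Nearest vs. stochastic rounding on a constant input of 0.3 with scale 1
rng = np.random.default_rng(0)
c = np.full(100_000, 0.3, dtype=np.float32)
nearest = quantize(c, 1.0, 0)
stochastic = quantize(c, 1.0, 0, rng)
print(nearest.mean(), stochastic.mean())
```

Nearest rounding maps every 0.3 to 0, so the mean of the quantized data is biased. Stochastic rounding maps about 30% of the elements to 1, so the mean is preserved in expectation. The ADD-like mode shows why accumulating during dequantization helps in reduction patterns: a received chunk can be added into a local buffer without a separate temporary and a second pass.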

Benchmarks

The benchmarks were run on a variety of hardware. We benchmark against PyTorch’s torch.quantize_per_tensor and torch.ao.quantization.fx._decomposed.quantize_per_tensor. Each benchmark quantizes float32 to uint8 over 1000 runs. The number of elements and other details can be found in the benchmark code.
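
The measurement style described above (repeated runs of a float32 to uint8 quantization call) can be sketched with a small timing harness. This is not the project's benchmark code: the bench helper, the warmup count, the reduced element and run counts, and the NumPy baseline are assumptions used so the sketch stays self-contained. To reproduce the comparison, the callable would be replaced by the routines named above.

```python
import time
import numpy as np

def bench(fn, x, runs=1000, warmup=3):
    """Return the mean wall-clock seconds per call of fn(x) over `runs` calls."""
    for _ in range(warmup):  # warm caches and any lazy initialization
        fn(x)
    start = time.perf_counter()
    for _ in range(runs):
        fn(x)
    return (time.perf_counter() - start) / runs

def numpy_quantize_u8(x, scale=0.01, zero_point=128):
    # Placeholder baseline only; it is not one of the benchmarked routines.
    return np.clip(np.rint(x / scale) + zero_point, 0, 255).astype(np.uint8)

# Small input and few runs so the sketch finishes quickly;
# the benchmarks in this section use numel 27264000 and 1000 runs.
x = np.random.default_rng(0).standard_normal(1 << 16, dtype=np.float32)
mean_s = bench(numpy_quantize_u8, x, runs=20)
print(f"{mean_s * 1e6:.1f} us per call")
```

Warmup calls are excluded from the timing so that one-off costs such as thread pool startup or page faults on first touch do not distort the mean.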

Benchmark 1 (AMD EPYC 9654, 360 vCPUs)

1000 runs with numel 27264000
CPU: AMD EPYC 9654 96-Core Processor, Runtime: AVX512-F
Memory: 1485 GB
Linux: 6.8.0-57-generic

bench1.png

Torch FX Quant refers to torch.ao.quantization.fx._decomposed.quantize_per_tensor, Torch Builtin Quant to torch.quantize_per_tensor, and Fast Quant to pi-quant’s piquant.quantize_torch.

Benchmark 2 (AMD EPYC 7742, 128 vCPUs)

1000 runs with numel 27264000
CPU: AMD EPYC 7742 64-Core Processor, Runtime: AVX2
Memory: 528 GB
Linux: 6.8.0-1023-nvidia
bench2.png

Benchmark 3 (Apple M3 Pro)

1000 runs with numel 27264000
CPU: Apple M3 Pro, Runtime: Neon
Memory: 18 GB
OSX: 15.4 (24E248)
bench3.png

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

pypiquant-4.1.0-cp313-cp313-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.13, musllinux (musl 1.2+), x86-64
pypiquant-4.1.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (173.3 kB): CPython 3.13, manylinux (glibc 2.27+, glibc 2.28+), x86-64
pypiquant-4.1.0-cp312-cp312-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.12, musllinux (musl 1.2+), x86-64
pypiquant-4.1.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (173.3 kB): CPython 3.12, manylinux (glibc 2.27+, glibc 2.28+), x86-64
pypiquant-4.1.0-cp311-cp311-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.11, musllinux (musl 1.2+), x86-64
pypiquant-4.1.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (173.3 kB): CPython 3.11, manylinux (glibc 2.27+, glibc 2.28+), x86-64
pypiquant-4.1.0-cp310-cp310-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.10, musllinux (musl 1.2+), x86-64
pypiquant-4.1.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (173.3 kB): CPython 3.10, manylinux (glibc 2.27+, glibc 2.28+), x86-64
pypiquant-4.1.0-cp39-cp39-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.9, musllinux (musl 1.2+), x86-64
pypiquant-4.1.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (173.3 kB): CPython 3.9, manylinux (glibc 2.27+, glibc 2.28+), x86-64
pypiquant-4.1.0-cp38-cp38-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.8, musllinux (musl 1.2+), x86-64
pypiquant-4.1.0-cp38-cp38-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (173.3 kB): CPython 3.8, manylinux (glibc 2.27+, glibc 2.28+), x86-64

