Multithreaded SIMD int8 and int4 quantization kernels.

pi-quant: Prime Intellect Fast Quantization Library

Overview

Fast, multithreaded CPU quantization kernels with multiple rounding modes. In the benchmarks below, they outperform PyTorch’s built-in quantization routines by more than 2x on all tested hardware. The kernels are optimized with SIMD intrinsics for AMD64 (SSE4.2, AVX2, AVX512F) and ARM64 (Neon), and the best available kernel is selected at runtime through CPU feature detection.
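The runtime kernel selection described above can be pictured as a priority lookup over detected CPU features. The following is only a sketch of that idea, not pi-quant's actual dispatcher: the feature names, kernel names, and the scalar fallback are hypothetical and chosen for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical (feature flag, kernel name) pairs, most capable first.
# None of these identifiers are taken from pi-quant itself.
KERNEL_PRIORITY: List[Tuple[str, str]] = [
    ("avx512f", "quantize_avx512f"),
    ("avx2", "quantize_avx2"),
    ("sse4_2", "quantize_sse42"),
    ("neon", "quantize_neon"),
]


def select_kernel(cpu_features: Set[str]) -> str:
    """Return the first kernel whose required CPU feature is available."""
    for feature, kernel in KERNEL_PRIORITY:
        if feature in cpu_features:
            return kernel
    # Assumed portable fallback when no SIMD extension is detected.
    return "quantize_scalar"
```

A caller would pass the set of features reported by the CPU once at startup and reuse the selected kernel for every later call, so the detection cost is paid only once.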

What is Quantization?

Quantization is the process of mapping continuous values into a finite, discrete set of values. In machine learning and signal processing, it is commonly used to reduce the precision of numerical data, lowering memory usage and improving computational efficiency while maintaining acceptable accuracy.
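As a rough illustration of the mapping, the sketch below implements the common affine (scale and zero point) scheme for float to uint8 in pure Python. It assumes this scheme for demonstration only; the function names are hypothetical and are not part of the pi-quant API, and the library's kernels may differ in rounding and parameter selection.

```python
from typing import List, Tuple


def compute_params(values: List[float], qmin: int = 0, qmax: int = 255) -> Tuple[float, int]:
    """Derive a scale and zero point that map [min, max] of the data onto [qmin, qmax]."""
    lo = min(min(values), 0.0)  # widen the range to include 0 so it stays exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values: List[float], scale: float, zero_point: int,
             qmin: int = 0, qmax: int = 255) -> List[int]:
    """Map each float to an integer code, clamped to [qmin, qmax].

    Python's round() resolves ties to the nearest even integer.
    """
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [(c - zero_point) * scale for c in codes]
```

Quantizing and then dequantizing a value recovers it only approximately; for values inside the covered range, the error in this sketch is bounded by one quantization step (`scale`).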

Features

✅ Parallel De/Quantization: Efficiently quantizes and de-quantizes data using multiple threads.

✅ Rich Datatype Support: Provides f32, f64 ↔ (u)int8/16/32/64.

✅ Modern Python API: Use the library from Python with PyTorch, NumPy, or standalone.

✅ Architecture-Specific Optimizations: The kernels are optimized with SIMD intrinsics for different CPU architectures, including AMD64 (SSE4.2, AVX2, AVX512F) and ARM64 (Neon).

✅ Thread Pool: Reuses threads for minimal overhead.

✅ Flexible Rounding Modes: Supports both nearest and stochastic rounding modes.

✅ C99 API: Provides a C99 API for C projects or foreign language bindings (see quant.h).

✅ Store Operators: Supports multiple store modes (SET, ADD) during dequantization — useful for ring-reduction operations.

✅ Quantization Parameters: Efficient SIMD-parallel computation of quantization scale and zero point from input data.
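Two of the features listed above, rounding modes and store operators, can be sketched in a few lines of pure Python. This is an illustration under assumptions, not pi-quant's implementation: the function names, the tie-breaking rule in `round_nearest`, and the string values `"SET"` and `"ADD"` are hypothetical.

```python
import math
import random
from typing import List


def round_nearest(x: float) -> int:
    """Round to the nearest integer; ties go toward positive infinity."""
    return math.floor(x + 0.5)


def round_stochastic(x: float, rng: random.Random) -> int:
    """Round up with probability equal to the fractional part of x.

    The expected value of the result equals x, so the rounding error
    averages out over many elements.
    """
    lower = math.floor(x)
    return lower + (1 if rng.random() < x - lower else 0)


def dequantize_into(out: List[float], codes: List[int], scale: float,
                    zero_point: int, store: str = "SET") -> None:
    """Dequantize codes into an existing buffer, overwriting or accumulating."""
    if store not in ("SET", "ADD"):
        raise ValueError(f"unknown store mode: {store}")
    for i, code in enumerate(codes):
        value = (code - zero_point) * scale
        if store == "ADD":
            out[i] += value  # accumulate, e.g. when summing chunks in a ring reduction
        else:
            out[i] = value   # overwrite
```

With an accumulating store, a reduction step can dequantize a received chunk directly into its running sum instead of first writing to a temporary buffer and adding afterwards.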

Benchmarks

The benchmarks were run on a variety of hardware. We benchmark against PyTorch’s torch.quantize_per_tensor and torch.ao.quantization.fx._decomposed.quantize_per_tensor. Each benchmark quantizes float32 to uint8 over 1000 runs. The number of elements and other details are in the benchmark code.

Benchmark 1 (AMD EPYC 9654, 360 vCPUs)

1000 runs with numel 27264000
CPU: AMD EPYC 9654 96-Core Processor, Runtime: AVX512-F
Memory: 1485 GB
Linux: 6.8.0-57-generic

bench1.png

Torch FX Quant refers to torch.ao.quantization.fx._decomposed.quantize_per_tensor, Torch Builtin Quant to torch.quantize_per_tensor, and Fast Quant to pi-quant’s piquant.quantize_torch.

Benchmark 2 (AMD EPYC 7742, 128 vCPUs)

1000 runs with numel 27264000
CPU: AMD EPYC 7742 64-Core Processor, Runtime: AVX2
Memory: 528 GB
Linux: 6.8.0-1023-nvidia
bench2.png

Benchmark 3 (Apple M3 Pro)

1000 runs with numel 27264000
CPU: Apple M3 Pro, Runtime: Neon
Memory: 18 GB
OSX: 15.4 (24E248)
bench3.png

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

pypiquant-0.4.0-cp313-cp313-musllinux_1_2_x86_64.whl (1.1 MB)
CPython 3.13, musllinux: musl 1.2+ x86-64

pypiquant-0.4.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (115.7 kB)
CPython 3.13, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64

pypiquant-0.4.0-cp312-cp312-musllinux_1_2_x86_64.whl (1.1 MB)
CPython 3.12, musllinux: musl 1.2+ x86-64

pypiquant-0.4.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (115.7 kB)
CPython 3.12, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64

pypiquant-0.4.0-cp311-cp311-musllinux_1_2_x86_64.whl (1.1 MB)
CPython 3.11, musllinux: musl 1.2+ x86-64

pypiquant-0.4.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (115.7 kB)
CPython 3.11, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64

pypiquant-0.4.0-cp310-cp310-musllinux_1_2_x86_64.whl (1.1 MB)
CPython 3.10, musllinux: musl 1.2+ x86-64

pypiquant-0.4.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (115.7 kB)
CPython 3.10, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64

pypiquant-0.4.0-cp39-cp39-musllinux_1_2_x86_64.whl (1.1 MB)
CPython 3.9, musllinux: musl 1.2+ x86-64

pypiquant-0.4.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (115.7 kB)
CPython 3.9, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64

pypiquant-0.4.0-cp38-cp38-musllinux_1_2_x86_64.whl (1.1 MB)
CPython 3.8, musllinux: musl 1.2+ x86-64

pypiquant-0.4.0-cp38-cp38-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (115.7 kB)
CPython 3.8, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64

File hashes

pypiquant-0.4.0-cp313-cp313-musllinux_1_2_x86_64.whl
SHA256: 7efb20998305b310ec043e7afed7586340239c6b85fca6f3eabdbd2c57d183d8
MD5: c8d3e01da17a419cc1871d237c782527
BLAKE2b-256: 6aea5249aded27dce27124168b3396b6f7716d8436b89fa16dc7f32b48a3ba55

pypiquant-0.4.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
SHA256: 9f0953cb69b6b866ab5f235912338c949b787566243263149ec0740986aa5f14
MD5: 1e529bed16ce58be1b89bf935d1049d4
BLAKE2b-256: 57c232fe34e203269739b73580ae72f9d2d4f8bdc0aa2c11ab6750be864ad033

pypiquant-0.4.0-cp312-cp312-musllinux_1_2_x86_64.whl
SHA256: 230d937064686149c6bb67f5fab73deef1facfc456464993b7d01b1acd2b487c
MD5: 8deab8c37472152d654cc7782629af00
BLAKE2b-256: 8ff94b903a03eb8656e56f5e8d341afd3a951d46dc07e39e11a054aaff6e8c74

pypiquant-0.4.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
SHA256: 1384c2f6307b2139ecf7b87683cf8345611fc0f3c75d0f4de45319c6b5f3ed9d
MD5: 4a2b9961fc1ede60b7a4af99b94a5c2b
BLAKE2b-256: 216d77c4f934fbec51ca740537e5cb4269938b501f5497d4cb4d6bedbeb08879

pypiquant-0.4.0-cp311-cp311-musllinux_1_2_x86_64.whl
SHA256: d128aac51e4f4c5f564240fa2f852b34be7a4b3ddffb450c1cf8ba9daee8d6a9
MD5: b5cbfaddcb20ed9d572a445543f8c6cc
BLAKE2b-256: e7d6e24c2c07b4176ef4d5259dbe743f72fdb8d065eae09089ac287e9b80ac57

pypiquant-0.4.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
SHA256: cfcfb6a8d7c9089d82d80947098c35fccaf4442aad64a57eec3b83c0ae11e785
MD5: 1d42902b34e9c868c5a784bf7802ae5e
BLAKE2b-256: cd97650bad9db20cba2df0aca49f1bfac653d8906d202c17c40d09db6961363a

pypiquant-0.4.0-cp310-cp310-musllinux_1_2_x86_64.whl
SHA256: 433a416ef12dddf4b500103e9cdd0488535972174bf2502a78d3cce40b73eb5e
MD5: 63eea74b627877d793fe8d1660f612d6
BLAKE2b-256: 1ed8509421a8203f3be05a3b6d1f2986b349f9615b594d1f98ebcc7233514970

pypiquant-0.4.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
SHA256: a9a49725218d0728013bc6df44bb5c9050c1203c6c825b1ab84ba47ddd38989b
MD5: 3ec068458ce2a13e3cf411e7aa304fad
BLAKE2b-256: 8e8e5d93931e4693a158190ac949542a3176095421a4fccfa07596c499478bcf

pypiquant-0.4.0-cp39-cp39-musllinux_1_2_x86_64.whl
SHA256: 96d0b9ed62ad3b7222b05a39632aa19074b6d176859946020cf45712d2ca2b01
MD5: ee2b698560225529873d99e9eb1ec7df
BLAKE2b-256: db31e187bf206139bd3d38d8d7eec7b633d85a4c986b8ee3ea4d63131227069e

pypiquant-0.4.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
SHA256: dec9951da6a61c9e5e79dd569c8375c276d52e91bced5266944055725f645e81
MD5: 1be2511df79956e8f7cdd9a430ac87de
BLAKE2b-256: e3ab9dac949704bc28c0b86a5bc51572248df9ecc6e29e15aa9eef020d0ed244

pypiquant-0.4.0-cp38-cp38-musllinux_1_2_x86_64.whl
SHA256: 11008d511fffcb97f2729d50f75ebfa48c525171f697d87c8daccb2878ee5e09
MD5: e5a56833c3a5de24c91a17a8828057bb
BLAKE2b-256: 2b1d3850294de3bcb7dfb332190d38b8dccc966dece3e7443676a1f129cf24a1

pypiquant-0.4.0-cp38-cp38-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
SHA256: 1c3abd762a486bcf20247a4ccddf457464bcd34ebf666e9d42c30af13c28e4e0
MD5: 91e37369e2edf90ef044a76ed2a98fb2
BLAKE2b-256: 5f596a1c4ee469282c823f932a3fa26029f79ec98b50a65dbf071dd2f1261b84
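To check a downloaded wheel against the digests listed above, the file can be hashed locally with Python's standard hashlib module. The sketch below is one possible way to do this; the helper names `file_digests` and `verify_sha256` are hypothetical, and BLAKE2b-256 is computed as BLAKE2b with a 32-byte digest.

```python
import hashlib
from typing import Dict


def file_digests(path: str, chunk_size: int = 1 << 20) -> Dict[str, str]:
    """Compute SHA256 and BLAKE2b-256 hex digests of a file, reading it in chunks."""
    sha256 = hashlib.sha256()
    blake2b_256 = hashlib.blake2b(digest_size=32)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha256.update(chunk)
            blake2b_256.update(chunk)
    return {"sha256": sha256.hexdigest(), "blake2b_256": blake2b_256.hexdigest()}


def verify_sha256(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return file_digests(path)["sha256"] == expected_hex.strip().lower()
```

Calling `verify_sha256` with the path of a downloaded wheel and the SHA256 value listed for that exact file name should return True for an intact download.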
