GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

🔮 GPTQ - Accurate Post-Training Compression for Generative Pretrained Transformers

This repo is an extended and polished version of the original code for the paper GPTQ: Accurate Post-training Compression for Generative Pretrained Transformers.

🔥 SOTA on LLM PTQ

  • An efficient implementation of the GPTQ algorithm
  • CUDA kernels for multiplying a 2/3/4/8-bit quantized matrix by a full-precision vector
  • Bug fixes for older consumer-grade GPUs
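
To make the "quantized matrix times full-precision vector" idea concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not this package's API or its CUDA kernel: the helper names `quantize_row` and `quantized_matvec` are hypothetical, and the quantizer is plain round-to-nearest with a per-row scale and offset rather than the GPTQ procedure itself.

```python
def quantize_row(row, bits):
    """Round-to-nearest uniform quantization of one weight row.

    Returns integer codes in [0, 2**bits - 1] plus the scale and offset
    needed to map them back to floats. (Hypothetical helper.)
    """
    levels = 2 ** bits - 1
    lo, hi = min(row), max(row)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in row]
    return codes, scale, lo


def quantized_matvec(qrows, x):
    """Compute y = W_hat @ x, dequantizing each weight on the fly."""
    y = []
    for codes, scale, lo in qrows:
        y.append(sum((c * scale + lo) * xi for c, xi in zip(codes, x)))
    return y


# Full-precision weights and input vector (made-up illustrative values).
W = [[0.0, 0.5, 1.0, 1.5], [-1.0, 0.0, 1.0, 2.0]]
x = [1.0, 2.0, 3.0, 4.0]

qrows = [quantize_row(row, bits=4) for row in W]
y = quantized_matvec(qrows, x)
exact = [sum(w * xi for w, xi in zip(row, x)) for row in W]
print(y, exact)
```

Only the integer codes, one scale, and one offset per row need to be stored, which is where the memory saving comes from; the input vector and the accumulated result stay in full precision.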

📥 Installation

pip install gptq

🛟 Install PyTorch

gptq requires PyTorch and a GPU, and installing PyTorch with CUDA can be tricky. To install PyTorch correctly, the following steps are recommended:

  • Run nvcc --version to get the CUDA compiler version. For example, the following output means the CUDA compiler version is 11.6, which corresponds to the backend tag cu116:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
  • Run pip install light-the-torch to install ltt.
  • Run ltt install --pytorch-computation-backend=cu116 torch torchvision torchaudio to install the PyTorch suite. Replace cu116 with the tag that matches your CUDA version.
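
The mapping from the nvcc output to the backend tag used in the steps above can be sketched in a few lines of Python. This is only an illustration: the function name `cuda_backend_tag` is hypothetical, and it assumes the output contains a "release X.Y" fragment like the sample shown earlier.

```python
import re


def cuda_backend_tag(nvcc_output: str) -> str:
    """Turn `nvcc --version` output into a tag such as 'cu116'.

    Hypothetical helper: assumes a 'release <major>.<minor>' fragment.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    major, minor = match.groups()
    return f"cu{major}{minor}"


# Line taken from the sample nvcc output above.
sample = "Cuda compilation tools, release 11.6, V11.6.124"
tag = cuda_backend_tag(sample)

# Assemble the ltt command from the derived tag.
print(f"ltt install --pytorch-computation-backend={tag} torch torchvision torchaudio")
```

To use the real compiler output instead of the sample string, the text could be captured with `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`.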

TODO

  • GPTQ with CNN

Algorithm credits go to IST Austria Distributed Algorithms and Systems Lab
