
k-bit optimizers and matrix multiplication routines.

Project description

bitsandbytes


bitsandbytes enables accessible large language models via k-bit quantization for PyTorch. We provide three main features for dramatically reducing memory consumption for inference and training:

  • 8-bit optimizers use block-wise quantization to maintain 32-bit optimizer performance at a small fraction of the memory cost.
  • LLM.int8(), or 8-bit quantization, enables large language model inference with only half the required memory and without any performance degradation. This method uses vector-wise quantization to quantize most features to 8 bits, while outliers are handled separately with 16-bit matrix multiplication.
  • QLoRA, or 4-bit quantization, enables large language model training with several memory-saving techniques that don't compromise performance. This method quantizes a model to 4 bits and inserts a small set of trainable low-rank adaptation (LoRA) weights to allow training.

The library includes quantization primitives for 8-bit and 4-bit operations through bitsandbytes.nn.Linear8bitLt and bitsandbytes.nn.Linear4bit, as well as 8-bit optimizers through the bitsandbytes.optim module.
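The block-wise quantization idea underlying these features can be illustrated in plain Python. This is a toy sketch only, not the library's actual implementation (which runs as fused CUDA kernels); the function names and the block size of 64 are chosen here for illustration. Each block of floats is reduced to int8 codes plus a single float scale, the block's absolute maximum:

```python
def quantize_blockwise(values, block_size=64):
    """Toy absmax block-wise quantization: each block of floats becomes
    int8-range codes plus one float scale (the block's absmax)."""
    codes, scales = [], []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        absmax = max(abs(v) for v in block) or 1.0  # avoid divide-by-zero on all-zero blocks
        scales.append(absmax)
        codes.append([round(v / absmax * 127) for v in block])
    return codes, scales

def dequantize_blockwise(codes, scales):
    """Invert the mapping; per-element error is bounded by absmax / 127
    for that element's block."""
    out = []
    for block, absmax in zip(codes, scales):
        out.extend(c / 127 * absmax for c in block)
    return out
```

Because each block carries its own scale, the quantization error stays proportional to the local magnitude, so a large outlier in one block does not destroy precision everywhere else, which is the key property the 8-bit optimizers rely on.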

System Requirements

bitsandbytes has the following minimum requirements for all platforms:

  • Python 3.9+
  • PyTorch 2.2+
    • Note: While we aim to provide wide backwards compatibility, we recommend using the latest version of PyTorch for the best experience.
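A quick pre-install check of these version requirements can be done with the standard library alone. The helper names below are hypothetical; the minimum versions are simply the ones quoted above:

```python
import sys

MIN_PYTHON = (3, 9)  # Python 3.9+ per the requirements above
MIN_TORCH = (2, 2)   # PyTorch 2.2+

def parse_version(text):
    """Turn a version string like '2.4.1+cu121' into a comparable (major, minor) tuple."""
    core = text.split("+")[0]
    return tuple(int(part) for part in core.split(".")[:2])

def check_requirements(python_info=sys.version_info, torch_version=None):
    """Return True when the interpreter (and, if given, the PyTorch
    version string) meets the stated minimums."""
    ok = tuple(python_info[:2]) >= MIN_PYTHON
    if torch_version is not None:
        ok = ok and parse_version(torch_version) >= MIN_TORCH
    return ok
```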

Accelerator support:

| Platform | Accelerator | Hardware Requirements | Support Status |
|----------|-------------|-----------------------|----------------|
| 🐧 Linux x86-64, glibc >= 2.24 | ◻️ CPU | AVX2 | 〰️ Partial Support |
| | 🟩 NVIDIA GPU (`cuda`) | SM50+ minimum, SM75+ recommended | ✅ Full Support |
| | 🟥 AMD GPU (`cuda`) | CDNA: gfx90a, gfx942; RDNA: gfx1100, gfx1200 | 🚧 In Development |
| | 🟦 Intel GPU (`xpu`) | Data Center GPU Max Series, Arc A-Series (Alchemist), Arc B-Series (Battlemage) | 🚧 In Development |
| | 🟪 Intel Gaudi (`hpu`) | Gaudi1, Gaudi2, Gaudi3 | 🚧 In Development |
| 🐧 Linux aarch64, glibc >= 2.24 | ◻️ CPU | | 〰️ Partial Support |
| | 🟩 NVIDIA GPU (`cuda`) | SM75, SM80, SM90, SM100 | ✅ Full Support |
| 🪟 Windows 11 / Windows Server 2019+, x86-64 | ◻️ CPU | AVX2 | 〰️ Partial Support |
| | 🟩 NVIDIA GPU (`cuda`) | SM50+ minimum, SM75+ recommended | ✅ Full Support |
| | 🟦 Intel GPU (`xpu`) | Arc A-Series (Alchemist), Arc B-Series (Battlemage) | 🚧 In Development |
| 🍎 macOS 13.1+, arm64 | ◻️ CPU | Apple M1+ | 🛣️ Future Roadmap |
| | ⬜ Metal (`mps`) | Apple M1+ | 🛣️ Future Roadmap |
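For scripting purposes (e.g. gating CI jobs), the support matrix above can be expressed as a lookup. The dictionary below is hand-transcribed from the table; the key scheme and helper name are invented for illustration:

```python
# (platform, arch, accelerator) -> support status, transcribed from the table above
SUPPORT_MATRIX = {
    ("linux", "x86-64", "cpu"): "partial",
    ("linux", "x86-64", "cuda-nvidia"): "full",
    ("linux", "x86-64", "cuda-amd"): "in-development",
    ("linux", "x86-64", "xpu"): "in-development",
    ("linux", "x86-64", "hpu"): "in-development",
    ("linux", "aarch64", "cpu"): "partial",
    ("linux", "aarch64", "cuda-nvidia"): "full",
    ("windows", "x86-64", "cpu"): "partial",
    ("windows", "x86-64", "cuda-nvidia"): "full",
    ("windows", "x86-64", "xpu"): "in-development",
    ("macos", "arm64", "cpu"): "roadmap",
    ("macos", "arm64", "mps"): "roadmap",
}

def support_status(platform, arch, accelerator):
    """Return the support tier for a combination, or 'unsupported'
    when the table has no row for it."""
    return SUPPORT_MATRIX.get((platform, arch, accelerator), "unsupported")
```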

📖 Documentation

❤️ Sponsors

The continued maintenance and development of bitsandbytes is made possible thanks to the generous support of our sponsors. Their contributions help ensure that we can keep improving the project and delivering valuable updates to the community.

Hugging Face

License

bitsandbytes is MIT licensed.

We thank Fabio Cannizzo for his work on FastBinarySearch which we use for CPU quantization.

How to cite us

If you find this library useful, please consider citing our work:

QLoRA

@article{dettmers2023qlora,
  title={{QLoRA}: Efficient Finetuning of Quantized {LLMs}},
  author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:2305.14314},
  year={2023}
}

LLM.int8()

@article{dettmers2022llmint8,
  title={LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale},
  author={Dettmers, Tim and Lewis, Mike and Belkada, Younes and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:2208.07339},
  year={2022}
}

8-bit Optimizers

@article{dettmers2022optimizers,
  title={8-bit Optimizers via Block-wise Quantization},
  author={Dettmers, Tim and Lewis, Mike and Shleifer, Sam and Zettlemoyer, Luke},
  journal={9th International Conference on Learning Representations, ICLR},
  year={2022}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

bitsandbytes-0.46.1-py3-none-win_amd64.whl (72.2 MB)

Uploaded: Python 3, Windows x86-64

bitsandbytes-0.46.1-py3-none-manylinux_2_24_x86_64.whl (72.9 MB)

Uploaded: Python 3, manylinux (glibc 2.24+), x86-64

bitsandbytes-0.46.1-py3-none-manylinux_2_24_aarch64.whl (30.7 MB)

Uploaded: Python 3, manylinux (glibc 2.24+), ARM64

File details

Details for the file bitsandbytes-0.46.1-py3-none-win_amd64.whl.

File metadata

File hashes

Hashes for bitsandbytes-0.46.1-py3-none-win_amd64.whl:

  • SHA256: 9f6f61376bd0e9780c5dc4ddee7d1f52cb10fe8034a1ea588611f4e8b87eb6a7
  • MD5: b52b5e647bb02813acc66e053acada2b
  • BLAKE2b-256: 857d06da01fac23a5032632dd7874b31c1d9b7b9af2314b2b07e5f99641950da

See more details on using hashes here.
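A downloaded wheel can be verified against the SHA256 digest above using only Python's standard hashlib module. The helper name here is invented for illustration; the expected digest is the one listed for the Windows wheel:

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Compute a file's SHA-256 hex digest, streaming in 1 MB chunks
    so large wheels are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Digest from the table above for bitsandbytes-0.46.1-py3-none-win_amd64.whl:
EXPECTED = "9f6f61376bd0e9780c5dc4ddee7d1f52cb10fe8034a1ea588611f4e8b87eb6a7"
```

Comparing `sha256_of_file("bitsandbytes-0.46.1-py3-none-win_amd64.whl")` against `EXPECTED` confirms the download was not corrupted or tampered with in transit.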

Provenance

The following attestation bundles were made for bitsandbytes-0.46.1-py3-none-win_amd64.whl:

Publisher: python-package.yml on bitsandbytes-foundation/bitsandbytes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file bitsandbytes-0.46.1-py3-none-manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for bitsandbytes-0.46.1-py3-none-manylinux_2_24_x86_64.whl:

  • SHA256: b0ee4a204fb926d4eae02bc2f5468ae3c11c011cfa849a4c771d4c6b201f57ae
  • MD5: fcb9dedb90c7c1df7b7f12c4b522ad08
  • BLAKE2b-256: 6b1ec26dbcb46cebb49fa6b17ff888966e6d8f306078b095a5df801a583549d0


Provenance

The following attestation bundles were made for bitsandbytes-0.46.1-py3-none-manylinux_2_24_x86_64.whl:

Publisher: python-package.yml on bitsandbytes-foundation/bitsandbytes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file bitsandbytes-0.46.1-py3-none-manylinux_2_24_aarch64.whl.

File metadata

File hashes

Hashes for bitsandbytes-0.46.1-py3-none-manylinux_2_24_aarch64.whl:

  • SHA256: 21b349f776d04c6c1380405961081de29c84f49640b79d3d199b6d719818da84
  • MD5: 9eff89dbfee8318af2513bb24ad78eab
  • BLAKE2b-256: d2b29dadb4f8dca3948e35c1ebfee75ca82353e41468b41ff785430595f8e6f0


Provenance

The following attestation bundles were made for bitsandbytes-0.46.1-py3-none-manylinux_2_24_aarch64.whl:

Publisher: python-package.yml on bitsandbytes-foundation/bitsandbytes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
