k-bit optimizers and matrix multiplication routines.

Project description

bitsandbytes

bitsandbytes enables accessible large language models via k-bit quantization for PyTorch. We provide three main features for dramatically reducing memory consumption for inference and training:

  • 8-bit optimizers use block-wise quantization to maintain 32-bit performance at a small fraction of the memory cost.
  • LLM.int8(), or 8-bit quantization, enables large language model inference with only half the required memory and without any performance degradation. This method uses vector-wise quantization to quantize most features to 8 bits and handles outliers separately with 16-bit matrix multiplication.
  • QLoRA, or 4-bit quantization, enables large language model training with several memory-saving techniques that don't compromise performance. This method quantizes a model to 4 bits and inserts a small set of trainable low-rank adaptation (LoRA) weights to allow training.
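
To illustrate the idea behind block-wise quantization mentioned in the first item, here is a minimal pure-Python sketch. It assumes plain linear absmax scaling to signed 8-bit codes, which is a simplification chosen for this example and not necessarily the quantization map the library uses; the function names `quantize_blockwise` and `dequantize_blockwise` are hypothetical.

```python
def quantize_blockwise(values, block_size):
    """Quantize floats to signed 8-bit codes, one absmax scale per block.

    Assumption: linear absmax scaling, chosen for clarity in this sketch.
    """
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # The largest magnitude in the block becomes the scale;
        # fall back to 1.0 so an all-zero block does not divide by zero.
        absmax = max(abs(v) for v in block) or 1.0
        scales.append(absmax)
        codes.extend(round(v / absmax * 127) for v in block)
    return codes, scales


def dequantize_blockwise(codes, scales, block_size):
    """Map 8-bit codes back to floats using each block's stored scale."""
    return [code / 127 * scales[i // block_size] for i, code in enumerate(codes)]


values = [0.01, -0.02, 0.03, 0.04, 100.0, -50.0, 25.0, 12.5]

# One scale per block of four: the small values keep their resolution.
block_codes, block_scales = quantize_blockwise(values, block_size=4)

# One scale for everything: the outlier 100.0 forces the small values to code 0.
global_codes, global_scales = quantize_blockwise(values, block_size=len(values))
```

With a single global scale, the first four values all collapse to code 0, while the block-wise variant keeps them distinct because an outlier only affects the block it lives in.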

The library includes quantization primitives for 8-bit and 4-bit operations through bitsandbytes.nn.Linear8bitLt and bitsandbytes.nn.Linear4bit, and 8-bit optimizers through the bitsandbytes.optim module.
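
As a rough sketch of how these pieces might be combined, the snippet below builds a tiny model from the two layer classes and attaches an 8-bit optimizer. The layer sizes, the `threshold=6.0` and `quant_type="nf4"` settings, and the helper name `build_quantized_demo` are illustrative assumptions, not recommendations from this page. The helper returns `None` when PyTorch or bitsandbytes is not installed.

```python
def build_quantized_demo():
    """Build a small model from bitsandbytes layers, or return None if unavailable."""
    try:
        import torch
        import bitsandbytes as bnb
    except ImportError:
        return None

    model = torch.nn.Sequential(
        # 8-bit linear layer; has_fp16_weights=False keeps the weights in int8 for inference.
        bnb.nn.Linear8bitLt(64, 64, bias=True, has_fp16_weights=False, threshold=6.0),
        torch.nn.ReLU(),
        # 4-bit linear layer using the NF4 data type, computing in bfloat16.
        bnb.nn.Linear4bit(64, 16, bias=True, compute_dtype=torch.bfloat16, quant_type="nf4"),
        torch.nn.ReLU(),
        # A regular full-precision layer, standing in for the trainable part.
        torch.nn.Linear(16, 4),
    )

    # Only parameters that still require gradients are handed to the 8-bit optimizer.
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = bnb.optim.Adam8bit(trainable, lr=1e-3)
    return model, optimizer


demo = build_quantized_demo()
```

The quantized weights are typically produced when the module is moved to its target device, for example with `model.to("cuda")`, so the objects built here are only a starting point.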

System Requirements

bitsandbytes has the following minimum requirements for all platforms:

  • Python 3.10+
  • PyTorch 2.3+
    • Note: While we aim to provide wide backwards compatibility, we recommend using the latest version of PyTorch for the best experience.
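
A quick way to check an environment against these minimums is sketched below. The helper names are hypothetical, and the check reads the installed PyTorch version from package metadata rather than importing it. Versions are compared as integer tuples so that, for example, 2.10 sorts above 2.3.

```python
import re
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # from the requirements listed above
MIN_TORCH = (2, 3)


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '2.3.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimums(python_version: tuple[int, int], torch_version: str) -> bool:
    """Return True when both versions satisfy the stated minimums."""
    return python_version >= MIN_PYTHON and major_minor(torch_version) >= MIN_TORCH


def check_current_environment() -> bool:
    """Check the running interpreter and the installed torch package, if any."""
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimums(sys.version_info[:2], torch_version)
```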

Accelerator support:

Note: this table reflects the status of the current development branch. For the latest stable release, see the document in the 0.49.0 tag.

Legend:

🚧 = In Development, 〰️ = Partially Supported, ✅ = Supported, 🐢 = Slow Implementation Supported, ❌ = Not Supported

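Platform | Accelerator (device) | Hardware Requirements | LLM.int8() / QLoRA 4-bit / 8-bit Optimizers
🐧 Linux, glibc >= 2.24, x86-64 | ◻️ CPU | Minimum: AVX2; Optimized: AVX512F, AVX512BF16
🐧 Linux, glibc >= 2.24, x86-64 | 🟩 NVIDIA GPU (cuda) | SM60+ minimum; SM75+ recommended
🐧 Linux, glibc >= 2.24, x86-64 | 🟥 AMD GPU (cuda) | CDNA: gfx90a, gfx942, gfx950; RDNA: gfx1100, gfx1101, gfx1150, gfx1151, gfx1200, gfx1201 | 〰️
🐧 Linux, glibc >= 2.24, x86-64 | 🟦 Intel GPU (xpu) | Data Center GPU Max Series; Arc A-Series (Alchemist); Arc B-Series (Battlemage) | 〰️
🐧 Linux, glibc >= 2.24, x86-64 | 🟪 Intel Gaudi (hpu) | Gaudi2, Gaudi3 | 〰️
🐧 Linux, glibc >= 2.24, aarch64 | ◻️ CPU
🐧 Linux, glibc >= 2.24, aarch64 | 🟩 NVIDIA GPU (cuda) | SM75+
🪟 Windows 11 / Windows Server 2022+, x86-64 | ◻️ CPU | AVX2
🪟 Windows 11 / Windows Server 2022+, x86-64 | 🟩 NVIDIA GPU (cuda) | SM60+ minimum; SM75+ recommended
🪟 Windows 11 / Windows Server 2022+, x86-64 | 🟦 Intel GPU (xpu) | Arc A-Series (Alchemist); Arc B-Series (Battlemage) | 〰️
🍎 macOS 14+, arm64 | ◻️ CPU | Apple M1+
🍎 macOS 14+, arm64 | ⬜ Metal (mps) | Apple M1+ | 🐢 🐢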
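
The NVIDIA entries use SM tags, where SM60 corresponds to CUDA compute capability 6.0 and SM75 to 7.5. The sketch below, with hypothetical helper names, turns such a tag into a `(major, minor)` tuple and classifies a device against the minimum and recommended levels listed for x86-64. In PyTorch, `torch.cuda.get_device_capability()` returns a tuple of this shape.

```python
import re


def parse_sm_tag(tag):
    """Convert a tag such as 'SM75+' into a (major, minor) capability tuple."""
    match = re.fullmatch(r"SM(\d+)(\d)\+?", tag)
    if match is None:
        raise ValueError(f"unrecognized SM tag: {tag!r}")
    # The last digit is the minor version; everything before it is the major version.
    return int(match.group(1)), int(match.group(2))


def classify_nvidia_gpu(capability, minimum=(6, 0), recommended=(7, 5)):
    """Classify a (major, minor) compute capability against the listed thresholds."""
    if capability < minimum:
        return "unsupported"
    if capability < recommended:
        return "supported (below recommended)"
    return "recommended"
```

For Linux aarch64, where only SM75+ is listed, passing `minimum=(7, 5)` applies the stricter floor.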

📖 Documentation

❤️ Sponsors

The continued maintenance and development of bitsandbytes is made possible thanks to the generous support of our sponsors. Their contributions help ensure that we can keep improving the project and delivering valuable updates to the community.

Hugging Face   Intel

License

bitsandbytes is MIT licensed.

How to cite us

If you found this library useful, please consider citing our work:

QLoRA

@article{dettmers2023qlora,
  title={Qlora: Efficient finetuning of quantized llms},
  author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:2305.14314},
  year={2023}
}

LLM.int8()

@article{dettmers2022llmint8,
  title={LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale},
  author={Dettmers, Tim and Lewis, Mike and Belkada, Younes and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:2208.07339},
  year={2022}
}

8-bit Optimizers

@article{dettmers2022optimizers,
  title={8-bit Optimizers via Block-wise Quantization},
  author={Dettmers, Tim and Lewis, Mike and Shleifer, Sam and Zettlemoyer, Luke},
  journal={9th International Conference on Learning Representations, ICLR},
  year={2022}
}

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See the tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • bitsandbytes-0.49.1-py3-none-win_amd64.whl (54.7 MB): Python 3, Windows x86-64
  • bitsandbytes-0.49.1-py3-none-manylinux_2_24_x86_64.whl (59.1 MB): Python 3, manylinux: glibc 2.24+ x86-64
  • bitsandbytes-0.49.1-py3-none-manylinux_2_24_aarch64.whl (31.1 MB): Python 3, manylinux: glibc 2.24+ ARM64
  • bitsandbytes-0.49.1-py3-none-macosx_14_0_arm64.whl (129.8 kB): Python 3, macOS 14.0+ ARM64
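
The wheel file names above follow the standard pattern `{distribution}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl`. As a small illustration, the hypothetical helper below splits a file name into those fields. For these files, `py3` means any Python 3 interpreter, `none` means no dependency on a specific ABI, and the last field names the platform.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


linux_wheel = parse_wheel_filename("bitsandbytes-0.49.1-py3-none-manylinux_2_24_x86_64.whl")
```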

File details

Details for the file bitsandbytes-0.49.1-py3-none-win_amd64.whl.

File hashes

Hashes for bitsandbytes-0.49.1-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 6ead0763f4beb936f9a09acb49ec094a259180906fc0605d9ca0617249c3c798
MD5 94843468ef88f39231b5b2475b664eea
BLAKE2b-256 ba537cfbe3a93354764be85c2dfcbfc5b6536413e598155aa7ef7e85d74c9e49

See more details on using hashes here.
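
One way to use a published digest is to hash the downloaded file locally and compare the result. The sketch below uses Python's standard `hashlib`; the helper names and the chunk size are assumptions made for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("bitsandbytes-0.49.1-py3-none-win_amd64.whl", "6ead0763f4beb936f9a09acb49ec094a259180906fc0605d9ca0617249c3c798")` should return `True` for an intact download of that file.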

Provenance

The following attestation bundles were made for bitsandbytes-0.49.1-py3-none-win_amd64.whl:

Publisher: python-package.yml on bitsandbytes-foundation/bitsandbytes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file bitsandbytes-0.49.1-py3-none-manylinux_2_24_x86_64.whl.

File hashes

Hashes for bitsandbytes-0.49.1-py3-none-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 e7940bf32457dc2e553685285b2a86e82f5ec10b2ae39776c408714f9ae6983c
MD5 238c728b648921fa7792f365d31d8663
BLAKE2b-256 1d4f02d3cb62a1b0b5a1ca7ff03dce3606be1bf3ead4744f47eb762dbf471069

Provenance

The following attestation bundles were made for bitsandbytes-0.49.1-py3-none-manylinux_2_24_x86_64.whl:

Publisher: python-package.yml on bitsandbytes-foundation/bitsandbytes

File details

Details for the file bitsandbytes-0.49.1-py3-none-manylinux_2_24_aarch64.whl.

File hashes

Hashes for bitsandbytes-0.49.1-py3-none-manylinux_2_24_aarch64.whl
Algorithm Hash digest
SHA256 acd4730a0db3762d286707f4a3bc1d013d21dd5f0e441900da57ec4198578d4e
MD5 76bee2dc2ba64a6c645c18529a1a42cf
BLAKE2b-256 11dd5820e09213a3f7c0ee5aff20fce8b362ce935f9dd9958827274de4eaeec6

Provenance

The following attestation bundles were made for bitsandbytes-0.49.1-py3-none-manylinux_2_24_aarch64.whl:

Publisher: python-package.yml on bitsandbytes-foundation/bitsandbytes

File details

Details for the file bitsandbytes-0.49.1-py3-none-macosx_14_0_arm64.whl.

File hashes

Hashes for bitsandbytes-0.49.1-py3-none-macosx_14_0_arm64.whl
Algorithm Hash digest
SHA256 9de01d4384b6c71ef9ab052b98457dc0e4fff8fe06ab14833b5b712700deb005
MD5 8161ad8cb1bf0f63cf591c88bb259f4e
BLAKE2b-256 196f32d0526e4e4ad309d9e7502c018399bb23b63f39277a361c305092e2f885

Provenance

The following attestation bundles were made for bitsandbytes-0.49.1-py3-none-macosx_14_0_arm64.whl:

Publisher: python-package.yml on bitsandbytes-foundation/bitsandbytes
