A production-ready implementation of 1.58-bit quantization-aware training and inference.

bitlinear

This project aims to provide a production-ready implementation of 1.58-bit layers for quantization-aware training and time-, memory-, and energy-efficient inference. It builds on the ideas from the paper "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits".
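
The name "1.58-bit" refers to weights restricted to the three values -1, 0 and 1, which carry log2(3) ≈ 1.58 bits each. The following is a minimal sketch of the absmean quantization described in the cited paper, written in plain Python for illustration only. The function name `absmean_ternarize` and the `eps` default are assumptions and not part of the bitlinear API.

```python
import math

def absmean_ternarize(weights, eps=1e-5):
    """Quantize a flat list of floats to {-1, 0, 1} using an absmean scale."""
    # Scale: mean absolute value of the weights (eps avoids division by zero).
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale

weights = [0.9, -0.05, -1.2, 0.3, 0.0, -0.6]
ternary, scale = absmean_ternarize(weights)
print(ternary)        # [1, 0, -1, 1, 0, -1]
print(math.log2(3))   # information content of one ternary weight, in bits
```

Small weights collapse to 0 and large ones saturate at -1 or 1, so a matrix product with such weights needs only additions and subtractions plus one multiplication by the scale.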

installation

Installation from PyPI:

pip install bitlinear

Installation from source:

git clone https://github.com/schneiderkamplab/bitlinear
cd bitlinear
pip install .

usage

The usage is best explained by a short example:

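from bitlinear import replace_modules
from transformers import AutoModelForCausalLM

# Load a tiny, randomly initialized Llama model for demonstration.
model = AutoModelForCausalLM.from_pretrained("HuggingFaceM4/tiny-random-LlamaForCausalLM")
# Swap the model's linear layers for bit-linear layers.
replace_modules(model)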
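
To give a rough idea of what such a drop-in replacement might involve, here is a hedged sketch in plain PyTorch. It is not bitlinear's implementation: `TernaryLinear` and `swap_linear_layers` are hypothetical names, and the absmean scale with a straight-through estimator is one common way to do quantization-aware training, assumed here for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Linear):
    """Hypothetical stand-in for a 1.58-bit layer (not bitlinear's class)."""

    def forward(self, x):
        w = self.weight
        # Absmean scale, then round and clip the weights to {-1, 0, 1}.
        scale = w.abs().mean() + 1e-5
        w_q = (w / scale).round().clamp(-1, 1) * scale
        # Straight-through estimator: the forward pass uses the quantized
        # weights, the backward pass sends gradients to the full-precision ones.
        w_ste = w + (w_q - w).detach()
        return F.linear(x, w_ste, self.bias)

def swap_linear_layers(module):
    """Recursively replace each nn.Linear child, keeping its parameters."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear) and not isinstance(child, TernaryLinear):
            new = TernaryLinear(child.in_features, child.out_features,
                                bias=child.bias is not None)
            new.load_state_dict(child.state_dict())
            setattr(module, name, new)
        else:
            swap_linear_layers(child)
    return module

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
swap_linear_layers(model)
out = model(torch.randn(4, 8))
```

Subclassing `nn.Linear` keeps the parameter names and shapes unchanged, so existing checkpoints and optimizers continue to work after the swap.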

A more elaborate example is available under examples, including training and evaluating a binary classifier:

python examples/train.py
python examples/eval.py

comparison to other work

There are other implementations of bit-linear layers, most of which get at least some of the details wrong at the time of this writing (April 2024).

The focus of this implementation is to develop:

  • a flexible, production-ready drop-in replacement for torch.nn.Linear,
  • efficient fused kernels for training, and
  • efficient fused kernels for inference with 2-bit weights and 8-bit activations.
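
The combination of 2-bit weights and 8-bit activations named in the last goal can be illustrated with a small sketch in plain Python. The encoding used here (each ternary value plus one, stored in two bits, four values per byte) and the helper names are illustrative assumptions, not the memory layout used by bitlinear's kernels.

```python
def pack_ternary(values):
    """Pack ternary values (-1, 0, 1) into bytes, four values per byte."""
    packed = bytearray()
    for i in range(0, len(values), 4):
        byte = 0
        for j, v in enumerate(values[i:i + 4]):
            # Map -1, 0, 1 to the 2-bit codes 0, 1, 2 and shift into place.
            byte |= (v + 1) << (2 * j)
        packed.append(byte)
    return bytes(packed)

def unpack_ternary(packed, count):
    """Recover `count` ternary values from bytes produced by pack_ternary."""
    values = []
    for i in range(count):
        code = (packed[i // 4] >> (2 * (i % 4))) & 0b11
        values.append(code - 1)
    return values

def quantize_activations_int8(activations):
    """Absmax-quantize a list of floats to integers in [-127, 127]."""
    absmax = max(abs(a) for a in activations) or 1.0  # guard against all zeros
    scale = 127.0 / absmax
    quantized = [max(-127, min(127, round(a * scale))) for a in activations]
    return quantized, scale

weights = [1, 0, -1, 1, 0, -1]
packed = pack_ternary(weights)              # 6 weights fit into 2 bytes
restored = unpack_ternary(packed, len(weights))

acts, act_scale = quantize_activations_int8([0.5, -2.0, 1.2])
print(len(packed), restored, acts)
```

Two bits per weight waste one of the four possible codes, but keep every weight aligned within a byte, which makes unpacking a matter of shifts and masks.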

Furthermore, this implementation is meant to serve as a testbed for research on low-bit quantization-aware training and inference.
