An efficient implementation of the paper "The Era of 1-bit LLMs"

BitMat: Improving Ternary Matrix Multiplication with Triton

0️⃣1️⃣ Introduction

BitMat is a Python package that optimizes matrix multiplication with custom kernels written in Triton. It applies the principles of the "1bit-LLM Era" paper, using packed int8 data to improve computational efficiency in deep learning and numerical computing tasks.
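
For context, the paper quantizes weights to the ternary set {-1, 0, 1} by dividing by the mean absolute weight, then rounding and clipping. The pure-Python sketch below illustrates that idea only. The name `absmean_ternarize` is hypothetical and not part of BitMat's API, and BitMat's own implementation may differ in detail.

```python
def absmean_ternarize(weights, eps=1e-5):
    """Quantize a 2-D list of floats to ternary values {-1, 0, 1}.

    Hypothetical helper for illustration; not a BitMat function.
    Returns (ternary, scale), where scale is the mean absolute weight.
    """
    magnitudes = [abs(w) for row in weights for w in row]
    scale = sum(magnitudes) / len(magnitudes)
    ternary = [
        # Scale, round to the nearest integer, then clip to [-1, 1].
        [max(-1, min(1, round(w / (scale + eps)))) for w in row]
        for row in weights
    ]
    return ternary, scale


weights = [[0.9, -0.05, -1.2], [0.1, 0.6, -0.4]]
ternary, scale = absmean_ternarize(weights)
print(ternary)  # [[1, 0, -1], [0, 1, -1]]
```

Small weights collapse to 0 and large ones saturate at -1 or 1, so each weight needs fewer than two bits of storage.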

🎛 Features

Custom Triton Kernels: Utilize highly optimized kernels for matrix multiplication, tailored for performance and efficiency.

Packed int8 Operations: During inference the model uses packed int8 data to reduce memory usage and improve computational efficiency.

Ease of Integration: BitMat is designed to be easily integrated into existing PyTorch/transformers workflows, providing a seamless user experience.
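
To make the packed int8 idea concrete, here is a minimal sketch of one possible packing scheme: each ternary value takes two bits, so four values fit in one byte. The layout and the names `pack_ternary` and `unpack_ternary` are assumptions made for illustration. The layout BitMat's kernels actually use is not described here and may differ.

```python
def pack_ternary(values):
    """Pack ternary values (-1, 0, 1) into bytes, four 2-bit codes per byte.

    Hypothetical layout for illustration; not BitMat's actual format.
    """
    if len(values) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    packed = bytearray()
    for i in range(0, len(values), 4):
        byte = 0
        for j, v in enumerate(values[i:i + 4]):
            if v not in (-1, 0, 1):
                raise ValueError(f"not a ternary value: {v}")
            # Map -1, 0, 1 to the 2-bit codes 0, 1, 2 and shift into place.
            byte |= (v + 1) << (2 * j)
        packed.append(byte)
    return bytes(packed)


def unpack_ternary(packed):
    """Inverse of pack_ternary: recover the list of ternary values."""
    return [((byte >> (2 * j)) & 0b11) - 1 for byte in packed for j in range(4)]


row = [1, 0, -1, -1, 0, 0, 1, 1]
packed = pack_ternary(row)
restored = unpack_ternary(packed)
print(len(row), "values stored in", len(packed), "bytes")
```

With this scheme, eight weights occupy two bytes instead of the sixteen bytes they would need in fp16, which is where the memory saving during inference comes from.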

💾 Installation

pip install bitmat-tl

At the moment only Linux is supported. Installation on Windows is possible but untested.

🏁 Quick Start

High-level API (transformers-compatible)

from transformers import AutoModelForCausalLM
from bitmat import convert_hf_model

# Initialize your model
model = AutoModelForCausalLM.from_pretrained("some-repo/some-model")
# Convert the model to use BitLinear layers
model = convert_hf_model(model)

Low-level API

import torch
from bitmat import BitLinear

layer = BitLinear(in_features=1024, out_features=512, bias=True, eps=1e-5)
# You can use the layer as a normal torch.nn.Linear layer

🫱🏼‍🫲🏽 Contributing

We welcome contributions from the community, whether it's adding new features, improving documentation, or reporting bugs. Please refer to our contribution guidelines before making a pull request.

📜 License

BitMat is open-sourced under the Apache-2.0 license.

Citation

If you use BitMat in your research, please cite it using the following BibTeX entry:

@article{bitmat2024,
  title={BitMat: Improving Matrix Multiplication with Custom Triton Kernels},
  author={AstraMind AI},
  journal={https://github.com/astramind-ai/BitMat},
  year={2024}
}

Support

For questions, issues, or support regarding BitMat, please open an issue on our GitHub repository.

Acknowledgments

Special thanks to the Triton community and the authors of the "1bit-LLM Era" paper for their groundbreaking work and inspiration.

Thanks also to the developers of BitDelta and Unsloth, since part of the code is based on their work.
