An efficient implementation of the paper "The Era of 1-bit LLMs"
Project description
BitMat: Improving Ternary Matrix Multiplication with Triton
0️⃣1️⃣ Introduction
BitMat is a Python package that speeds up ternary matrix multiplication with custom kernels written in Triton. It follows the approach described in the paper "The Era of 1-bit LLMs" and uses packed int8 data to reduce memory use and improve computational efficiency in deep learning and numerical computing tasks.
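For readers unfamiliar with the paper, the sketch below illustrates the absmean ternary quantization it describes: each weight is divided by the mean absolute value of the matrix, rounded, and clipped to -1, 0, or +1. This is plain Python for illustration only, not BitMat's implementation; the function name `absmean_ternarize` and the `eps` default are assumptions made for this example.

```python
def absmean_ternarize(weights, eps=1e-5):
    """Quantize a 2-D list of floats to {-1, 0, +1} plus a single scale.

    Hypothetical helper for illustration; not part of the bitmat API.
    """
    flat = [w for row in weights for w in row]
    # gamma is the mean absolute value over the whole matrix
    gamma = sum(abs(w) for w in flat) / len(flat)
    scale = gamma + eps  # eps avoids division by zero for an all-zero matrix
    ternary = [
        # round to the nearest integer, then clip to the range [-1, 1]
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return ternary, scale


weights = [[0.9, -0.05, -1.2], [0.1, 0.6, -0.4]]
ternary, scale = absmean_ternarize(weights)
print(ternary)  # [[1, 0, -1], [0, 1, -1]]
```

Multiplying the ternary matrix by `scale` gives a coarse approximation of the original weights, which is why the scale is kept alongside the quantized values.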
🎛 Features
Custom Triton Kernels: Utilize highly optimized kernels for matrix multiplication, tailored for performance and efficiency.
Packed int8 Operations: During inference the model uses packed int8 data to reduce memory usage and improve computational efficiency.
Ease of Integration: BitMat is designed to be easily integrated into existing PyTorch/transformers workflows, providing a seamless user experience.
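To make the packed int8 idea concrete, here is a minimal sketch of one possible layout: four ternary values per byte, two bits each. The layout BitMat's kernels actually use is not described in this document, so the bit order, the helper names (`pack_ternary`, `unpack_ternary`, `packed_dot`), and the `scale` parameter are all assumptions for illustration. Python's `bytes` holds unsigned values, whereas a real kernel would typically work on an int8 or uint8 tensor.

```python
def pack_ternary(values):
    """Pack ternary values (-1, 0, +1) four per byte, 2 bits each (assumed layout)."""
    if len(values) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    packed = []
    for i in range(0, len(values), 4):
        byte = 0
        for j, v in enumerate(values[i:i + 4]):
            if v not in (-1, 0, 1):
                raise ValueError(f"not a ternary value: {v}")
            byte |= (v + 1) << (2 * j)  # map -1, 0, +1 to the codes 0, 1, 2
        packed.append(byte)
    return bytes(packed)


def unpack_ternary(packed):
    """Inverse of pack_ternary: recover the list of -1, 0, +1 values."""
    values = []
    for byte in packed:
        for j in range(4):
            values.append(((byte >> (2 * j)) & 0b11) - 1)
    return values


def packed_dot(packed_row, x, scale=1.0):
    """Dot product of one packed ternary row with activations x.

    Ternary weights need only additions and subtractions,
    followed by a single multiplication by the scale.
    """
    acc = 0.0
    for w, xi in zip(unpack_ternary(packed_row), x):
        if w == 1:
            acc += xi
        elif w == -1:
            acc -= xi
    return acc * scale


row = [1, 0, -1, 1, -1, -1, 0, 0]
packed = pack_ternary(row)
print(len(row), "weights stored in", len(packed), "bytes")
print(packed_dot(packed, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))  # -9.0
```

Compared with storing each weight as a 16-bit float, this layout uses one eighth of the memory, at the cost of an unpacking step inside the multiplication.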
💾 Installation
pip install bitmat-tl
At the moment we only support Linux platforms. Windows installation is possible but is not tested.
🏁 Quick Start
High-level API (transformers-compatible)
from transformers import AutoModelForCausalLM
from bitmat import convert_hf_model
# Initialize your model
model = AutoModelForCausalLM.from_pretrained("some-repo/some-model")
# Convert the model to use BitLinear layers
model = convert_hf_model(model)
Low-level API
import torch
from bitmat import BitLinear
layer = BitLinear(in_features=1024, out_features=512, bias=True, eps=1e-5)
# You can use the layer as a normal torch.nn.Linear layer
🫱🏼🫲🏽 Contributing
We welcome contributions from the community, whether it's adding new features, improving documentation, or reporting bugs. Please refer to our contribution guidelines before making a pull request.
📜 License
BitMat is open-sourced under the Apache-2.0 license.
Citation
If you use BitMat in your research, please cite it using the following BibTeX entry:
@article{bitmat2024,
title={BitMat: Improving Matrix Multiplication with Custom Triton Kernels},
author={AstraMind AI},
journal={https://github.com/astramind-ai/BitMat},
year={2024}
}
Support
For questions, issues, or support regarding BitMat, please open an issue on our GitHub repository.
Acknowledgments
Special thanks to the Triton community and the authors of the "1bit-LLM Era" paper for their groundbreaking work and inspiration.
Thanks also to the developers of BitDelta and UnSloth, since part of the code is based on their work.
File details
Details for the file bitmat-tl-0.2.0.tar.gz.
File metadata
- Download URL: bitmat-tl-0.2.0.tar.gz
- Upload date:
- Size: 19.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | b228204265fb5a8a3f610d3aba499d0397d4f138b02bd74d1fa7ebffd439eaa7
MD5 | 8555f4aad917e19e1629570e2ae567a5
BLAKE2b-256 | 4a69d610006e081872c48ae6e57b55eee6a43818fd25a775bda49c784cc43e69
File details
Details for the file bitmat_tl-0.2.0-py3-none-any.whl.
File metadata
- Download URL: bitmat_tl-0.2.0-py3-none-any.whl
- Upload date:
- Size: 20.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | cb6bb188655ced584766b963d62dac0031c0e55574423a91cd56f3f4ef74ed4e
MD5 | f32358f5d6e821cf181b71721db4b2cc
BLAKE2b-256 | 37c9a618532e421c85229d7d11102c44cb7cc7360cb816ebe37b8e8bce605149