A tool for adapting and quantizing large Transformer-based models, built on top of the LoRA and LoRA-Torch libraries.
Project description
Adapter-LoRa for Quantization
Comparative Features of "loralib" and "loratorch" Implementations
Distinguishing the "loralib" and "loratorch" Implementation Approaches
The implementations of "loralib" and "loratorch" follow distinct approaches, which can be illustrated with nn.Linear. The underlying mathematical representations are as follows:
- For loralib, $h = x W_0^\top + \frac{\alpha}{r} x(BA)^\top,$ where $x\in\mathbb{R}^{k\times n}$ is the input matrix, $W_0\in\mathbb{R}^{m\times n}$ is the pre-trained weight matrix, $r$ is the predefined LoRA rank, $B\in\mathbb{R}^{m\times r}$ and $A\in \mathbb{R}^{r\times n}$ are the LoRA matrices, and $\alpha$ is a hyper-parameter.
- For loratorch, $h = x (W_0 + \frac{\alpha}{r} BA)^\top.$
loralib computes $xW_0^\top$ and $x(BA)^\top$ separately and then sums the results, whereas loratorch first merges the pre-trained weight $W_0$ with its LoRA weight $BA$ and then computes the result with a plain nn.Linear.forward(). For linear layers there is no difference between loralib and loratorch. But for some non-linear or complex layers, it is not guaranteed that $L(x, W_0)+L(x, BA) = L(x, W_0+BA)$, so it is difficult to extend LoRA to such layers with loralib. By contrast, the merge-weights-first idea in loratorch is more general and extensible: you just call merge_lora_param() in loratorch to merge the weights and then call forward() on the original layer to compute the result. With loratorch, you can apply LoRA to any type of torch.nn layer (see the check below).
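The equivalence of the two computations for a linear layer can be checked directly. The sketch below is plain PyTorch (not part of either library) and folds the $\frac{\alpha}{r}$ scaling into $BA$.

```python
import torch
import torch.nn.functional as F

# Check the linearity property that makes the two approaches match for nn.Linear:
# L(x, W0) + L(x, BA) == L(x, W0 + BA).
k, n, m, r = 4, 16, 8, 2
x = torch.randn(k, n)    # input, shape (k, n)
W0 = torch.randn(m, n)   # pre-trained weight, shape (m, n)
B = torch.randn(m, r)    # LoRA matrices
A = torch.randn(r, n)

split = F.linear(x, W0) + F.linear(x, B @ A)   # loralib-style: compute, then add
merged = F.linear(x, W0 + B @ A)               # loratorch-style: merge, then compute

print(torch.allclose(split, merged, atol=1e-5))  # True
```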
Supported Layers
| Layer | loralib | loratorch | Example |
|---|---|---|---|
| nn.Linear | ✓ | ✓ | linear.ipynb |
| nn.Embedding | ✓ | ✓ | embedding.ipynb |
| nn.Conv1d | ✓ | ✓ | |
| nn.Conv2d | ✓ | ✓ | |
| nn.Conv3d | ✓ | ✓ | |
| nn.MultiheadAttention | ✘ | ✓ | |
| MergedLinear | ✓ (Error) | ✓ | mergedlinear.ipynb |
| $\cdots$ | hard to extend | easy to extend | |
We compare the results of loralib and loratorch in the example notebooks to demonstrate the correctness of the loratorch implementation.
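As a point of reference for the table above, the following sketch shows the standard loralib pattern for the simplest supported layer. It assumes only loralib's documented API (lora.Linear and lora.mark_only_lora_as_trainable) and is not an AdapterLoRa call.

```python
import torch.nn as nn
import loralib as lora

# Replace a plain nn.Linear with a LoRA-augmented linear layer (rank 4).
layer = lora.Linear(in_features=512, out_features=512, r=4)

model = nn.Sequential(layer, nn.ReLU(), nn.Linear(512, 10))

# Freeze everything except the LoRA matrices A and B.
lora.mark_only_lora_as_trainable(model)
```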
Quick Start
Using AdapterLoRa

- Install the dependencies and the package:

```bash
pip install git+https://github.com/Baijiong-Lin/LoRA-Torch
pip install AdapterLoRa
```
Using the AdapterLoRa tool
```python
import torch
import torch.nn as nn

from core.Quantized import AdapterLoRa

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)

adapter_model = AdapterLoRa(model, method="LoRa", Rank=4)

"""
Adds LoRA to the linear layers inside self-attention.
Register the layers you would like to adapt with the add_layer function.
"""
adapter_model.add_layer("self_attn")
adapter_model.add_layer("linear1")
adapter_model.add_layer("linear2")

# Reconstruct the quantized model
adapter_model.reconstruct_model()

# Apply the LoRA method
model = adapter_model.implement_lora(verbose=True)
# Total trainable parameters before LoRA: 3176960
# Total trainable parameters after LoRA: 24576
# This sets requires_grad to False for all parameters whose names do not contain "lora_"

# Training loop
model.train()
for batch in dataloader:
    ...
```
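Since only parameters whose names contain "lora_" remain trainable after implement_lora, the optimizer can be built from the parameters that still require gradients. This is a standard PyTorch pattern, not an AdapterLoRa-specific API.

```python
# Optimize only the LoRA parameters (everything else is frozen).
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-4,
)
```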
Saving the Model Weights
- Save the LoRA model (only the LoRA matrices will be saved).
```python
import torch
import loralib as lora

# ===== Before =====
# torch.save(model.state_dict(), checkpoint_path)
# ===== After =====
torch.save(lora.lora_state_dict(model), checkpoint_path)
```
Loading the Pre-Trained Model
- Load the LoRA model (the pre-trained model must be loaded first).
```python
import torch
import loralib as lora

# Load the pre-trained checkpoint first
model.load_state_dict(torch.load('ckpt_pretrained.pt'), strict=False)
# Then load the LoRA checkpoint
model.load_state_dict(torch.load('ckpt_lora.pt'), strict=False)
```
- Quantized Model
- Time to Train
- Cost to Train
What's in it for you?
For each of the above four pillars, we are sharing our codebase and insights to:
- Assist you in leveraging Transformer-based models for your machine learning needs and challenges
- Boost reproducibility efforts, which are becoming increasingly difficult with Transformers
I am providing ready-to-use tools for quantizing the model:
- Fine-tuning Transformer-based models on your proprietary dataset via PEFT methodologies such as LoRA and QLoRA
- Performing hyperparameter optimization to get the maximum performance out of these models
What's the best way to use this repository?
Go over to the Transformer-specific directory that you are interested in and open its README.md. We have included details about the LLMs, followed by performance results on open-source datasets!
Roadmap
Our plan is to perform these experiments on all the Transformer-based models below. To that end, this is a tentative roadmap of the LLMs that we aim to cover:
- TransformerEncoder
- TransformerDecoder
- Vision-Transformer
- minGPT
- OpenAI GPT-2
- Inflection Pi (in progress)
Correspondence
Contributor
AdapterLoRa is developed and maintained by Youness ELbrag (Email | LinkedIn).