Efficient computation library for linear attention.
An efficient Linear Attention Decoding package
1. installation
conda create -n efficient_linear_decoding python=3.9
conda activate efficient_linear_decoding
pip install efficient_linear_decoding
The code has been tested in the following environment:
triton>=2.1.0
torch>=2.1.0
pycuda
pynvml
numpy<2
You can use the following commands to install the dependencies:
pip install triton==2.1.0
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install pycuda
pip install pynvml
pip install "numpy<2"
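To confirm that the packages listed above ended up in the active environment, a small check along these lines may help. It is only a sketch: it uses the standard-library `importlib.metadata`, and the helper name `installed_versions` is made up for this example and is not part of the package.

```python
from importlib import metadata

# Packages named in the tested-environment list above.
REQUIRED = ["triton", "torch", "pycuda", "pynvml", "numpy"]


def installed_versions(packages=REQUIRED):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

The script only reports what it finds; comparing the reported versions against the list above is left to the reader.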
2. usage
import torch
from efficient_linear_decoding.efficient_linear_decoding import causal_linear_decoder
# Create input tensors
Q = torch.randn(2,32,1024,128,device='cuda:0')
K = torch.randn(2,32,1024,128,device='cuda:0')
V = torch.randn(2,32,1024,128,device='cuda:0')
# Inference using causal_linear_decoder
output = causal_linear_decoder(Q,K,V)
# To use a weighted mask, set is_mask_weight=True and pass gamma
gamma = torch.full((32,),0.5,device='cuda:0')
output = causal_linear_decoder(Q,K,V,is_mask_weight=True,gamma=gamma)
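For readers who want to see what a causal linear attention computation looks like, here is a dependency-free sketch for a single head. It assumes the usual formulation, o_i = sum over j <= i of gamma^(i-j) * (q_i . k_j) * v_j, where gamma is a scalar mask weight for that head. This reading of `gamma` is an assumption inferred from the `(32,)` shape in the example above; the package's actual kernel may differ in scaling, normalization, or mask definition. The two function names are made up for this example.

```python
def causal_linear_attention_quadratic(Q, K, V, gamma=1.0):
    """Masked form: o_i = sum_{j<=i} gamma**(i-j) * (q_i . k_j) * v_j."""
    n, d_v = len(Q), len(V[0])
    out = []
    for i in range(n):
        o = [0.0] * d_v
        for j in range(i + 1):  # causal: only positions j <= i contribute
            score = gamma ** (i - j) * sum(a * b for a, b in zip(Q[i], K[j]))
            for c in range(d_v):
                o[c] += score * V[j][c]
        out.append(o)
    return out


def causal_linear_attention_recurrent(Q, K, V, gamma=1.0):
    """Recurrent form: S_i = gamma * S_{i-1} + k_i^T v_i, then o_i = q_i S_i."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # running (d_k x d_v) state
    out = []
    for q, k, v in zip(Q, K, V):
        for r in range(d_k):
            for c in range(d_v):
                S[r][c] = gamma * S[r][c] + k[r] * v[c]
        out.append([sum(q[r] * S[r][c] for r in range(d_k)) for c in range(d_v)])
    return out


# One head, sequence length 3, key dimension 2, value dimension 1.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]
V = [[1.0], [2.0], [3.0]]

print(causal_linear_attention_quadratic(Q, K, V, gamma=0.5))
print(causal_linear_attention_recurrent(Q, K, V, gamma=0.5))
```

Both functions return the same values. The recurrent form keeps only a fixed-size state per head, which is why decoding cost per token does not grow with sequence length.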
3. acknowledgement
Method | Title | Paper | Code |
---|---|---|---|
causal_dot_product | Fast Transformers with Clustered Attention | arxiv | code |
Lightning Attention-2 | Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | arxiv | code |
Hashes for efficient_linear_decoding-0.0.6.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | bba346387815c69ddbb76c5c2cbb22333a9f49a69108d406ebca614644b9363c |
MD5 | 81a00539231bc57311ce352e3dcebc36 |
BLAKE2b-256 | 5c976183f39d2da8f2f90c8a30e3e4e5b460616cd5a1ab2abdb84b7e3d689abd |

Hashes for efficient_linear_decoding-0.0.6-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 666710b03620f4ceb755853a4a81c97b6865f339c1ece24c60c93b1927a489ee |
MD5 | 5b3990dbef0a43e47f1b13f5022d06f6 |
BLAKE2b-256 | 5c6eab14f6b2ed45c9c0b984e0f3fc78de3943795e1509af3a53a2a183434bbd |