Efficient computation library for linear attention.
Project description
An efficient Linear Attention Decoding package
1. installation
conda create -n efficient_linear_decoding python=3.9
conda activate efficient_linear_decoding
pip install efficient_linear_decoding
The code has been tested under the following environment:
triton>=2.1.0
torch>=2.1.0
pycuda
pynvml
numpy<2
You can use the following commands to install these dependencies:
pip install triton==2.1.0
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install pycuda
pip install pynvml
pip install "numpy<2"
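To confirm that the installed packages match the tested environment listed above, a small check along these lines may help. This is a minimal sketch that uses only the standard library; `parse_version` and `check_environment` are hypothetical helper names and are not part of efficient_linear_decoding. It assumes the distribution names are the same as in the pip commands above.

```python
import re
from importlib import metadata

# Minimum versions copied from the tested-environment list (assumed to be hard lower bounds).
MINIMUMS = {"triton": (2, 1, 0), "torch": (2, 1, 0)}
PACKAGES = ("triton", "torch", "pycuda", "pynvml", "numpy")


def parse_version(text):
    """Turn a version string such as '2.1.0+cu118' into a tuple like (2, 1, 0)."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits of each piece
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_environment():
    """Return a dict mapping each package name to a short status string."""
    report = {}
    for name in PACKAGES:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        version = parse_version(installed)
        if name in MINIMUMS and version < MINIMUMS[name]:
            report[name] = f"too old ({installed})"
        elif name == "numpy" and version >= (2, 0, 0):
            report[name] = f"too new ({installed})"  # the list above requires numpy<2
        else:
            report[name] = f"ok ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

The script only inspects installed package metadata; it does not import torch or triton, so it also runs on a machine without a GPU.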
2. usage
import torch
from efficient_linear_decoding.efficient_linear_decoding import causal_linear_decoder
# Create input tensor
Q = torch.randn(2,32,1024,128,device='cuda:0')
K = torch.randn(2,32,1024,128,device='cuda:0')
V = torch.randn(2,32,1024,128,device='cuda:0')
# Inference using causal_linear_decoder
output = causal_linear_decoder(Q,K,V)
# To use a weighted mask, set is_mask_weight=True and pass gamma
gamma = torch.full((32,),0.5,device='cuda:0')
output = causal_linear_decoder(Q,K,V,is_mask_weight=True,gamma=gamma)
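For intuition about what a causal linear attention decoder computes, the sketch below gives the textbook formulation for a single head in NumPy, once as a masked quadratic product and once as a linear-time recurrence. It is an illustration only: the function names are hypothetical, and it assumes unnormalized attention with a decay factor `gamma` applied as `gamma ** (i - j)`. Whether `causal_linear_decoder` follows exactly this convention is an assumption, not something stated above.

```python
import numpy as np


def causal_linear_attention_quadratic(q, k, v, gamma=1.0):
    """O(n^2) reference: o_i = sum over j <= i of gamma^(i-j) * (q_i . k_j) * v_j."""
    n = q.shape[0]
    idx = np.arange(n)
    diff = idx[:, None] - idx[None, :]  # entry (i, j) holds i - j
    # Causal mask with decay weights; positions with j > i are zeroed out.
    mask = np.where(diff >= 0, gamma ** np.maximum(diff, 0), 0.0)
    return ((q @ k.T) * mask) @ v


def causal_linear_attention_recurrent(q, k, v, gamma=1.0):
    """O(n) form: S_t = gamma * S_{t-1} + outer(k_t, v_t), then o_t = q_t @ S_t."""
    n, d = q.shape
    e = v.shape[1]
    state = np.zeros((d, e))  # running key-value summary
    out = np.empty((n, e))
    for t in range(n):
        state = gamma * state + np.outer(k[t], v[t])
        out[t] = q[t] @ state
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
    a = causal_linear_attention_quadratic(q, k, v, gamma=0.5)
    b = causal_linear_attention_recurrent(q, k, v, gamma=0.5)
    print(np.allclose(a, b))
```

The recurrent form keeps a fixed-size state instead of the full sequence of keys and values, which is why linear attention suits token-by-token decoding. With `gamma=1.0` both functions reduce to plain causal linear attention without decay.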
3. acknowledgement
Method | Title | Paper | Code |
---|---|---|---|
causal_dot_product | Fast Transformers with Clustered Attention | arxiv | code |
Lightning Attention-2 | Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | arxiv | code |
Source Distribution
File details
Details for the file efficient_linear_decoding-0.0.7.tar.gz.
File metadata
- Download URL: efficient_linear_decoding-0.0.7.tar.gz
- Upload date:
- Size: 24.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.4
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 1f2be51fcc25b35b3fadf3847a567391dad63cf065bb5a10138710b6f0cd6d04 |
MD5 | 20369b017f3bdb34dfcc33f9d8e9f4bf |
BLAKE2b-256 | e2d0605e90a1ee674a58effefae998bc7dddd6e868f58521a717f05863aac44c |