Efficient computation library for linear attention.

An efficient Linear Attention Decoding package

1. Installation

conda create -n efficient_linear_decoding python=3.9
conda activate efficient_linear_decoding
pip install efficient_linear_decoding

The code has been tested under the following environment:

triton>=2.1.0
torch>=2.1.0
pycuda
pynvml
numpy<2

You can install these dependencies with the following commands:

pip install triton==2.1.0
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install pycuda
pip install pynvml
pip install "numpy<2"
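
After installing, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical and not part of this package, and the package list simply mirrors the requirements listed above.

```python
from importlib import metadata

# Distribution names taken from the environment list above.
REQUIRED = ["triton", "torch", "pycuda", "pynvml", "numpy"]


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

This only reports what is installed; it does not check that the versions satisfy the constraints above (for example numpy<2) or that a CUDA device is available.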

2. Usage

import torch
from efficient_linear_decoding.efficient_linear_decoding import causal_linear_decoder

# Create input tensors
Q = torch.randn(2,32,1024,128,device='cuda:0')
K = torch.randn(2,32,1024,128,device='cuda:0')
V = torch.randn(2,32,1024,128,device='cuda:0')

# Inference using causal_linear_decoder
output = causal_linear_decoder(Q,K,V)

# To apply a weighted mask, set is_mask_weight=True and pass gamma
gamma = torch.full((32,),0.5,device='cuda:0')
output = causal_linear_decoder(Q,K,V,is_mask_weight=True,gamma=gamma)
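
For checking results on small inputs, a naive reference implementation can be useful. The sketch below is not part of this package and makes several assumptions that the README does not confirm: that the inputs are laid out as (batch, heads, seq_len, dim), that the operation is causal linear attention without softmax, and that the weighted mask scales the contribution of position j to position i by gamma**(i - j) per head. The function name `reference_causal_linear_attention` is hypothetical.

```python
import torch


def reference_causal_linear_attention(Q, K, V, gamma=None):
    """Naive O(n^2) causal linear attention (no softmax), for small sanity checks.

    Q, K: (batch, heads, seq_len, dim_qk); V: (batch, heads, seq_len, dim_v).
    gamma: optional (heads,) tensor. ASSUMPTION: the weight for a query at
    position i and a key at position j <= i is gamma ** (i - j).
    """
    n = Q.shape[-2]
    scores = Q @ K.transpose(-1, -2)                 # (batch, heads, n, n)
    idx = torch.arange(n, device=Q.device)
    diff = idx[:, None] - idx[None, :]               # entry (i, j) holds i - j
    causal = (diff >= 0).to(Q.dtype)                 # 1 where j <= i, else 0
    if gamma is None:
        mask = causal                                # (n, n)
    else:
        exponent = diff.clamp(min=0).to(Q.dtype)     # avoid negative powers
        decay = gamma.to(Q.dtype)[:, None, None] ** exponent
        mask = decay * causal                        # (heads, n, n)
    return (scores * mask) @ V                       # (batch, heads, n, dim_v)
```

On a GPU, the result could be compared with causal_linear_decoder using torch.allclose on short sequences; a loose tolerance is likely needed, and a mismatch may simply mean that one of the assumptions above does not match the library's definition.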

3. Acknowledgement

Method | Title | Paper | Code
causal_dot_product | Fast Transformers with Clustered Attention | arxiv | code
Lightning Attention-2 | Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | arxiv | code


