Attention Free Transformer - PyTorch
aft-pytorch
Unofficial PyTorch implementation of the Attention Free Transformer by Zhai et al. [abs, pdf] from Apple Inc.
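For reference, AFT-Full replaces dot-product attention with a learned pairwise position bias: per the paper, Y_t = sigmoid(Q_t) * sum_t' exp(K_t' + w_{t,t'}) * V_t' / sum_t' exp(K_t' + w_{t,t'}). Below is a minimal sketch of that equation in plain PyTorch, purely for illustration; it is independent of this package's internals, and the function name and tensor layout are assumptions:

```python
import torch

def aft_full(q, k, v, w):
    # Illustrative sketch of the AFT-Full equation (not this package's implementation).
    # q, k, v: [batch, T, dim]; w: [T, T] learned pairwise position biases.
    # Y_t = sigmoid(Q_t) * sum_t' exp(K_t' + w[t, t']) * V_t'
    #                    / sum_t' exp(K_t' + w[t, t'])
    weights = torch.exp(w[None, :, :, None] + k[:, None, :, :])  # [batch, T, T, dim]
    num = (weights * v[:, None, :, :]).sum(dim=2)                # [batch, T, dim]
    den = weights.sum(dim=2)                                     # [batch, T, dim]
    return torch.sigmoid(q) * num / den
```

Note this naive form materializes a [batch, T, T, dim] tensor; a real implementation would rearrange the computation (and stabilize the exponentials) for efficiency.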
Installation
You can install `aft-pytorch` via `pip`:

```
pip install aft-pytorch
```
Usage
You can import the "AFT-Full" layer (as described in the paper) from the package like so:
```python
import torch
from aft_pytorch import AFTFullAttention

layer = AFTFullAttention(
    dim=512,
    hidden_dim=64,
    heads=8
)

# a batch of 32 sequences, each with 10 timesteps of dimension 512
x = torch.rand(32, 10, 512)
y = layer(x)  # [32, 10, 512]
```
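Since the layer preserves the input shape, one plausible way to use it is as a drop-in replacement for multi-head self-attention inside a standard pre-norm Transformer block. The `AFTBlock` below is a hypothetical sketch under that assumption, not part of this package's API (the full architecture is still a TODO):

```python
import torch
from torch import nn
from aft_pytorch import AFTFullAttention

class AFTBlock(nn.Module):
    # Hypothetical pre-norm encoder block: AFT attention + feedforward, with residuals.
    def __init__(self, dim=512, hidden_dim=64, heads=8, ff_mult=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = AFTFullAttention(dim=dim, hidden_dim=hidden_dim, heads=heads)
        self.norm2 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, dim * ff_mult),
            nn.GELU(),
            nn.Linear(dim * ff_mult, dim),
        )

    def forward(self, x):
        x = x + self.attn(self.norm1(x))  # attention sub-layer with residual
        x = x + self.ff(self.norm2(x))    # feedforward sub-layer with residual
        return x

block = AFTBlock()
y = block(torch.rand(32, 10, 512))  # [32, 10, 512]
```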
TODO
- Add full AFT architecture
Contributing
If you like this repo, please leave a star! If you have any amendments or suggestions, feel free to raise an issue/PR.
Credits
```bibtex
@misc{zhai2021an,
    title={An Attention Free Transformer},
    author={Shuangfei Zhai and Walter Talbott and Nitish Srivastava and Chen Huang and Hanlin Goh and Joshua M. Susskind},
    year={2021},
    url={https://openreview.net/forum?id=pW--cu2FCHY}
}
```
License