Simple implementations of attention modules adapted for the biological data domain
Install
Since PyTorch is a dependency of bio-attention, we recommend installing PyTorch independently first, as your system may require a specific version (e.g. for CUDA drivers).
After installing PyTorch, bio-attention can be installed using pip:
pip install bio-attention
Usage
Package roadmap
- Implement typing
LEGACY documentation
THIS REPO USED TO BE A 2D SLIDING WINDOW ATTENTION REPO
2D Sliding Window Attention
Stand-alone PyTorch implementation of 2D sliding window attention. Introduced as part of CpG Transformer, located at its own repository and detailed in our preprint paper.
Contents
sliding_window_attn.py contains three PyTorch modules: RelPositionalWindowEmbedding, MultiDimWindowAttention, and MultiDimWindowTransformerLayer. The modules are written so that they can perform 1D sliding window attention as well as sliding window attention over inputs with two or more sequence dimensions. In the multidimensional case, sliding window attention is applied over the first dimension following the batch dimension, and full self-attention is applied over all the others.
Sliding windows are efficiently obtained using the unfold operation.
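To illustrate the idea (a simplified sketch, not the exact code in sliding_window_attn.py; the padding scheme and shapes here are assumptions for the example), torch.Tensor.unfold can gather a window of keys around every query position:
import torch

# toy sequence of keys: batch = 1, length L = 10, hidden = 4
k = torch.randn(1, 10, 4)
window = 5                 # odd window: each query sees (window-1)/2 neighbors per side
pad = window // 2

# pad the sequence dimension so border positions also get a full window
k_padded = torch.nn.functional.pad(k, (0, 0, pad, pad))          # (1, L + 2*pad, hidden)

# unfold the sequence dimension into overlapping windows
k_windows = k_padded.unfold(dimension=1, size=window, step=1)    # (1, L, hidden, window)
k_windows = k_windows.permute(0, 1, 3, 2)                        # (1, L, window, hidden)

print(k_windows.shape)  # torch.Size([1, 10, 5, 4]): one window of 5 keys per query position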
Positional embeddings are relative sinusoidal ones, as described in Transformer-XL. Note that positional encodings are only applied along the dimension in which sliding windows are applied. To inform the model of position in the other dimensions, this should be encoded in the input itself.
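For reference, a minimal sketch of relative sinusoidal embeddings over the window offsets, assuming the standard Transformer-XL-style formulation (the actual RelPositionalWindowEmbedding may differ in details such as how it uses the supplied pos indices and projections):
import torch

def relative_sinusoidal_embedding(window: int, hidden_dim: int) -> torch.Tensor:
    # relative offsets a query can attend to: -(window//2) ... +(window//2)
    offsets = torch.arange(-(window // 2), window // 2 + 1).float()           # (window,)
    inv_freq = 1.0 / (10000 ** (torch.arange(0, hidden_dim, 2).float() / hidden_dim))
    angles = offsets[:, None] * inv_freq[None, :]                             # (window, hidden_dim/2)
    return torch.cat([angles.sin(), angles.cos()], dim=-1)                    # (window, hidden_dim)

emb = relative_sinusoidal_embedding(window=21, hidden_dim=64)
print(emb.shape)  # torch.Size([21, 64]): one embedding per relative offset in the window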
Usage
import torch
from torch import nn

from sliding_window_attn import MultiDimWindowTransformerLayer
# one layer:
layer = MultiDimWindowTransformerLayer(
hidden_dim=64, # number of input & output hidden dimensions (int)
head_dim=8, # hidden dimensionality of each SA head (int)
n_head=8, # number of SA heads (int)
ff_dim=256, # number of feed-forward hidden dimensions (int)
window=21, # window size of sliding window, should be odd. (int) (default=21)
dropout=0.20, # dropout rate on the self-attention matrix (float) (default=0.20)
activation='relu', # activation used in feed-forward, either 'relu' or 'gelu' (str) (default='relu')
layernorm=True # whether to apply layernorm after attn+res and ff+res (bool) (default=True)
)
# model consisting of 4 layers:
model = nn.Sequential(MultiDimWindowTransformerLayer(64, 8, 8, 256),
                      MultiDimWindowTransformerLayer(64, 8, 8, 256),
                      MultiDimWindowTransformerLayer(64, 8, 8, 256),
                      MultiDimWindowTransformerLayer(64, 8, 8, 256))
# 2D sequence input:
# batch size = 1
# sequence dim1 length = 512 (sliding window SA)
# sequence dim2 length = 4 (full SA)
# hidden = 64
x = torch.randn(1, 512, 4, 64)
pos = torch.cumsum(torch.randint(1, 7, (1, 512)), 1)
# if all positional indices follow each other by one: pos = torch.arange(512).unsqueeze(0)
x, pos = model((x, pos))
The same model can also be used for 1D sequence inputs:
# batch size = 1
# sequence dim1 length = 512 (sliding window SA)
# hidden = 64
x = torch.randn(1, 512, 64)
pos = torch.cumsum(torch.randint(1, 7, (1, 512)), 1)
x, pos = model((x, pos))
Or even 3D (or more) sequence input:
# batch size = 1
# sequence dim1 length = 512 (sliding window SA)
# sequence dim2 length = 4 (full SA)
# sequence dim3 length = 3 (full SA)
# hidden = 64
x = torch.randn(1, 512, 4, 3, 64)
pos = torch.cumsum(torch.randint(1, 7, (1, 512)), 1)
x, pos = model((x, pos))
Note that computational complexity scales quadratically with the size of each added dimension: every extra dimension multiplies both the number of query positions and the number of keys each query attends to.
For example, the attention matrix (per head) for the above 1D example has 512 * 21 entries.
For the 2D example this becomes (512*4) * (21*4).
And for the 3D example: (512*4*3) * (21*4*3).
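As a quick back-of-the-envelope check of these numbers, the plain-Python loop below (an illustration, not part of the package) reproduces the per-head attention-matrix sizes above:
window, seq_len = 21, 512

for extra_dims in [(), (4,), (4, 3)]:
    n_extra = 1
    for d in extra_dims:
        n_extra *= d
    rows = seq_len * n_extra        # number of query positions
    cols = window * n_extra         # keys each query attends to
    print(extra_dims, rows * cols)  # entries in the per-head attention matrix
# prints: () 10752, (4,) 172032, (4, 3) 1548288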
Citation
If you find this repository useful in your research, please cite our paper.
@article{dewaele2021cpg,
author = {Gaetan De Waele and Jim Clauwaert and Gerben Menschaert and Willem Waegeman},
title = {CpG Transformer for imputation of single-cell methylomes},
year = {2021},
doi = {10.1101/2021.06.08.447547},
URL = {https://www.biorxiv.org/content/early/2021/06/09/2021.06.08.447547}
}