Simple implementations of attention modules adapted for the biological data domain
Project description
Why use this package?
There are already plenty of excellent implementations out there that let you test the countless variants of transformers [1], [2]. This repository primarily distinguishes itself from those by providing positional encoding schemes adapted to irregularly-spaced positions in sequences. This is useful, for example, in: (1) the mass spectral domain (proteomics, metabolomics, ...), where transformers operate on sets of peaks, and (2) any kind of (epi)genomic data that measures sites of interest which are irregularly spaced along the genome (such as WGBS/CpG sites, ATAC-seq/chromatin accessibility, ...). Additionally, the attention definitions in this repository are compatible with multi-dimensional data, such as the MSAs used in some protein language models and AlphaFold.
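To make the idea concrete, here is a minimal plain-PyTorch sketch of a sinusoidal encoding evaluated at arbitrary, possibly irregularly-spaced float positions rather than at integer token indices. This is an illustration of the concept, not the package's own API; the function name and arguments are placeholders.

```python
import torch

def sinusoidal_encoding(positions: torch.Tensor, dim: int) -> torch.Tensor:
    """Sinusoidal positional encoding evaluated at arbitrary (possibly
    irregularly-spaced, continuous) positions instead of integer indices.

    positions: (..., seq_len) float tensor of positions
               (e.g. m/z values, genomic coordinates).
    dim:       embedding dimension (must be even).
    returns:   (..., seq_len, dim) tensor of encodings.
    """
    half = dim // 2
    # Standard transformer frequency schedule: 10000^(-2i/dim)
    freqs = torch.exp(
        -torch.arange(half, dtype=torch.float32)
        * (torch.log(torch.tensor(10000.0)) / half)
    )
    angles = positions.unsqueeze(-1) * freqs  # (..., seq_len, half)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

# Example: CpG sites at irregular genomic coordinates
pos = torch.tensor([[101.0, 154.0, 160.0, 512.0]])
enc = sinusoidal_encoding(pos, dim=64)  # shape (1, 4, 64)
```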
Install
Since PyTorch is a dependency of bio-attention, we recommend installing PyTorch independently first, as your system may require a specific version (e.g. CUDA drivers). After installing PyTorch, bio-attention can be installed using pip:

pip install bio-attention
Note
This package used to be a 2D sliding window attention package. The current formulation of the package no longer allows for this type of attention (instead, I recommend performing axial attention with alternating sliding window attention across one axis and full self-attention across the other; see the sketch below). If you want to use 2D sliding window attention, check out the old version of this repo.
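As a rough, plain-PyTorch sketch of the axial pattern described above (the names `AxialBlock` and `sliding_window_mask` are mine, not the package's), one axial step over a 2D input could look like:

```python
import torch
import torch.nn as nn

def sliding_window_mask(n: int, window: int) -> torch.Tensor:
    """Boolean attention mask; True marks positions that may NOT attend."""
    idx = torch.arange(n)
    return (idx[None, :] - idx[:, None]).abs() > window

class AxialBlock(nn.Module):
    """One axial step over a (batch, rows, cols, dim) tensor:
    sliding-window self-attention along the column axis, followed by
    full self-attention along the row axis."""

    def __init__(self, dim: int, heads: int, window: int):
        super().__init__()
        self.window = window
        self.attn_cols = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_rows = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, r, c, d = x.shape
        # Windowed attention across columns: fold rows into the batch dim.
        xc = x.reshape(b * r, c, d)
        mask = sliding_window_mask(c, self.window).to(x.device)
        xc = xc + self.attn_cols(xc, xc, xc, attn_mask=mask)[0]
        x = xc.reshape(b, r, c, d)
        # Full attention across rows: fold columns into the batch dim.
        xr = x.permute(0, 2, 1, 3).reshape(b * c, r, d)
        xr = xr + self.attn_rows(xr, xr, xr)[0]
        return xr.reshape(b, c, r, d).permute(0, 2, 1, 3)

x = torch.randn(2, 8, 128, 32)  # e.g. an MSA-like 2D input
out = AxialBlock(dim=32, heads=4, window=16)(x)
```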
Usage
Package roadmap
- Embedding layers
- Continuous
- Discrete
- Binary
- Bin
- [~] Positional encoding schemes
- Sinusoidal
- Embedding
- Continuous
- Rotary
- ALiBi (see the continuous-position sketch after this list)
- DPB
- XL
- Test support for multi-dimensional inputs
- [~] Attention modules
- Vanilla
- Windowed
- Random
- Performer
- Encoder
- Decoder
- Cross
- Support for multi-dim inputs
- Add a warning if non-increasing positional indices are used with a decoder attention
- Add docs clarifying that classification (CLS) tokens are automatically accounted for if no pos is provided for them
- Tests
- Typing
- Docs
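As one example of how these schemes generalize to continuous coordinates, below is a rough sketch (again plain PyTorch with illustrative names, not the package's implementation) of an ALiBi-style additive bias computed from real-valued positions instead of integer token indices:

```python
import torch

def alibi_bias(positions: torch.Tensor, num_heads: int) -> torch.Tensor:
    """ALiBi-style additive attention bias computed from real-valued
    positions rather than integer token indices.

    positions: (batch, seq_len) float positions (e.g. genomic coordinates).
    returns:   (batch, num_heads, seq_len, seq_len) bias to add to the
               pre-softmax attention scores.
    """
    # Per-head slopes as in the ALiBi paper: a geometric sequence 2^(-8i/n).
    slopes = 2.0 ** (-8.0 * torch.arange(1, num_heads + 1) / num_heads)
    # Pairwise distances between (possibly irregularly-spaced) positions.
    dist = (positions[:, None, :] - positions[:, :, None]).abs()
    return -slopes[None, :, None, None] * dist[:, None, :, :]

scores = torch.randn(2, 4, 5, 5)  # (batch, heads, queries, keys)
pos = torch.tensor([[0.0, 3.5, 4.0, 10.0, 10.2]]).expand(2, -1)
attn = (scores + alibi_bias(pos, num_heads=4)).softmax(dim=-1)
```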