## Project description

Attention mechanism for processing sequence data that considers the context for each timestamp.

## Install

```bash
pip install keras-self-attention
```

## Usage

### Basic

By default, the attention layer uses additive attention and considers the whole context while calculating the relevance scores. The following code creates an attention layer that follows the equations in the first section (`attention_activation` is the activation function applied to `e_{t, t'}`):

```python
import keras
from keras_self_attention import Attention

model = keras.models.Sequential()
model.add(keras.layers.Embedding(input_dim=10000,   # example vocabulary size
                                 output_dim=300,
                                 mask_zero=True))
model.add(keras.layers.Bidirectional(keras.layers.LSTM(units=128,
                                                       return_sequences=True)))
model.add(Attention(attention_activation='sigmoid'))
model.add(keras.layers.Dense(units=5))              # example output size
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['categorical_accuracy'],
)
model.summary()
```
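For intuition, additive self-attention can be sketched in plain NumPy. This is an illustrative sketch, not the layer's actual implementation: the weight names (`Wt`, `Wx`, `Wa`) and shapes are assumptions, and the score activation is simplified to the identity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, dim, units = 6, 4, 8
x = rng.normal(size=(seq_len, dim))          # one sequence of feature vectors

# Hypothetical weights for the additive score
Wt = rng.normal(size=(dim, units))
Wx = rng.normal(size=(dim, units))
bh = np.zeros(units)
Wa = rng.normal(size=(units, 1))

# h[t, t'] = tanh(x[t] Wt + x[t'] Wx + bh): compare every pair of timesteps
h = np.tanh((x @ Wt)[:, None, :] + (x @ Wx)[None, :, :] + bh)
# e[t, t'] = score of timestep t' for timestep t (activation simplified here)
e = (h @ Wa)[..., 0]
a = softmax(e, axis=-1)                      # attention weights per timestep t
out = a @ x                                  # context vector for each timestep
print(out.shape)  # (6, 4)
```

Each output timestep is a weighted average of the whole input sequence, which is what "considers the whole context" means above.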

### Local Attention

The global context may be too broad for one piece of data. The parameter `attention_width` controls the width of the local context:

```python
from keras_self_attention import Attention

Attention(
    attention_width=15,
    attention_activation='sigmoid',
    name='Attention',
)
```
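Conceptually, local attention restricts each timestep to a window of neighbors instead of the whole sequence. The sketch below builds such a window mask with NumPy; the exact window semantics (centering, edge handling) are the library's, so this only illustrates the idea.

```python
import numpy as np

seq_len, width = 6, 3  # assumed: width counts timesteps in a centered window
t = np.arange(seq_len)

# True where |t - t'| fits inside the local window; scores outside it
# would be masked out before the softmax
mask = np.abs(t[:, None] - t[None, :]) <= width // 2
print(mask.astype(int))
```

With `attention_width=15`, each timestep would attend only to roughly 15 neighboring timesteps rather than the full sequence.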

### Multiplicative Attention

You can use multiplicative attention by setting `attention_type`:

```python
import keras
from keras_self_attention import Attention

Attention(
    attention_width=15,
    attention_type=Attention.ATTENTION_TYPE_MUL,
    kernel_regularizer=keras.regularizers.l2(1e-6),
    use_attention_bias=False,
    name='Attention',
)
```
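The multiplicative variant replaces the additive score with a bilinear form, roughly `e[t, t'] = x[t]^T W x[t']`. A NumPy sketch of that scoring, with a hypothetical kernel `W` and no bias (matching `use_attention_bias=False` above):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, dim = 6, 4
x = rng.normal(size=(seq_len, dim))
W = rng.normal(size=(dim, dim))  # hypothetical kernel, shape (dim, dim)

# e[t, t'] = x[t]^T W x[t'] for every pair of timesteps, in one matmul
e = x @ W @ x.T
a = softmax(e, axis=-1)
out = a @ x
print(out.shape)  # (6, 4)
```

Multiplicative attention needs only the single kernel `W`, which is why it is cheaper than the additive form with its separate projection and scoring weights.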
