Project description
Attention mechanism for processing sequence data that considers the context for each timestamp.
Install
pip install keras-self-attention
Usage
Basic
By default, the attention layer uses additive attention and considers the whole context while calculating the relevance of each pair of timestamps; attention_activation is the activation function of e_{t, t'}.
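As a rough reference, the additive formulation can be sketched as below. The notation is an assumption based on the standard additive self-attention equations, since the equation section this page originally referred to is not reproduced here:

h_{t, t'} = \tanh(x_t^T W_t + x_{t'}^T W_x + b_h)
e_{t, t'} = \sigma(W_a h_{t, t'} + b_a)
a_{t, t'} = \frac{\exp(e_{t, t'})}{\sum_{t''} \exp(e_{t, t''})}
l_t = \sum_{t'} a_{t, t'} x_{t'}

The following code creates an attention layer of this kind: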
import keras
from keras_self_attention import Attention

model = keras.models.Sequential()
model.add(keras.layers.Embedding(input_dim=10000,
                                 output_dim=300,
                                 mask_zero=True))
model.add(keras.layers.Bidirectional(keras.layers.LSTM(units=128,
                                                       return_sequences=True)))
model.add(Attention(attention_activation='sigmoid'))
model.add(keras.layers.Dense(units=5))
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['categorical_accuracy'],
)
model.summary()
Local Attention
The global context may be too broad for a single piece of data. The attention_width parameter controls the width of the local context:
from keras_self_attention import Attention

Attention(
    attention_width=15,
    attention_activation='sigmoid',
    name='Attention',
)
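Conceptually, local attention masks out the scores of timestamps that fall outside a window around the current position before the softmax is applied. A hedged sketch, assuming the window of width w = attention_width is centered on t (the exact alignment and boundary handling are assumptions):

e_{t, t'} \rightarrow -\infty \quad \text{if } |t - t'| > w / 2, \qquad a_t = \text{softmax}(e_t)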
Multiplicative Attention
You can use multiplicative attention by setting attention_type:
import keras
from keras_self_attention import Attention

Attention(
    attention_width=15,
    attention_type=Attention.ATTENTION_TYPE_MUL,
    attention_activation=None,
    kernel_regularizer=keras.regularizers.l2(1e-6),
    use_attention_bias=False,
    name='Attention',
)
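For intuition, multiplicative (bilinear) attention is usually sketched as below; the exact form used by the layer is an assumption, and with use_attention_bias=False the bias term b_a is dropped:

e_{t, t'} = x_t^T W_a x_{t'} + b_a, \qquad a_t = \text{softmax}(e_t), \qquad l_t = \sum_{t'} a_{t, t'} x_{t'}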
Regularizer
To use the regularizer, set attention_regularizer_weight to a positive number:
import keras
import numpy
from keras_self_attention import Attention

inputs = keras.layers.Input(shape=(None,))
embd = keras.layers.Embedding(input_dim=32,
                              output_dim=16,
                              mask_zero=True)(inputs)
lstm = keras.layers.Bidirectional(keras.layers.LSTM(units=16,
                                                    return_sequences=True))(embd)
att, weights = Attention(attention_type=Attention.ATTENTION_TYPE_MUL,
                         kernel_regularizer=keras.regularizers.l2(1e-4),
                         bias_regularizer=keras.regularizers.l1(1e-4),
                         attention_regularizer_weight=1e-4,
                         name='Attention')(lstm)
dense = keras.layers.Dense(units=5, name='Dense')(att)
model = keras.models.Model(inputs=inputs, outputs=[dense, weights])
model.compile(
    optimizer='adam',
    loss={'Dense': 'sparse_categorical_crossentropy'},
    metrics={'Dense': 'sparse_categorical_accuracy'},
)
model.summary(line_length=100)
# x, batch_size and sentence_len are assumed to be defined elsewhere.
model.fit(
    x=x,
    y=numpy.zeros((batch_size, sentence_len, 1)),
    epochs=10,
)
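For intuition, attention regularizers of this kind are usually described as a penalty that pushes the rows of the attention matrix towards orthogonality. A hedged sketch, with A the matrix of attention weights, I the identity, and \lambda the attention_regularizer_weight (the exact form and normalization used by the layer are assumptions):

R = \lambda \, \lVert A A^T - I \rVert_F^2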
Load the Model
Make sure to add Attention to custom objects:
import keras
from keras_self_attention import Attention

keras.models.load_model(model_path, custom_objects={
    'Attention': Attention,
})
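As a usage sketch, saving and restoring a trained model in one round trip; the file name model.h5 is hypothetical, and model is assumed to be a compiled model that contains the Attention layer:

import keras
from keras_self_attention import Attention

model.save('model.h5')  # hypothetical path
restored = keras.models.load_model('model.h5', custom_objects={'Attention': Attention})
restored.summary()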
Download files
Source Distribution
Hashes for keras-self-attention-0.0.19.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 72aef482c37f172f7b383044a01040764d80d5f05685cfd61726daacb7847535
MD5 | 21becd48ff485b5ae0fd07dff083e18e
BLAKE2b-256 | 162248456f14d6cda6d1bb6a89a9c98f99d43338f5d9a288b4d40ba736568bcd
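To check a downloaded archive against the SHA256 digest above, a short verification sketch (it assumes the archive sits in the current working directory):

import hashlib

expected = '72aef482c37f172f7b383044a01040764d80d5f05685cfd61726daacb7847535'
with open('keras-self-attention-0.0.19.tar.gz', 'rb') as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print('OK' if digest == expected else 'hash mismatch')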