
Keras Attention Layer

Project description

Keras Attention Mechanism


Many-to-one attention mechanism for Keras.

Installation

PyPI

pip install attention

Example

import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import load_model, Model

from attention import Attention


def main():
    # Dummy data. There is nothing to learn in this example.
    num_samples, time_steps, input_dim, output_dim = 100, 10, 1, 1
    data_x = np.random.uniform(size=(num_samples, time_steps, input_dim))
    data_y = np.random.uniform(size=(num_samples, output_dim))

    # Define/compile the model.
    model_input = Input(shape=(time_steps, input_dim))
    # return_sequences=True is required: the attention layer consumes the full sequence of hidden states.
    x = LSTM(64, return_sequences=True)(model_input)
    # Many-to-one attention reduces the sequence to a single vector.
    x = Attention(units=32)(x)
    x = Dense(1)(x)
    model = Model(model_input, x)
    model.compile(loss='mae', optimizer='adam')
    model.summary()

    # Train the model.
    model.fit(data_x, data_y, epochs=10)

    # Test saving and reloading the model.
    pred1 = model.predict(data_x)
    model.save('test_model.h5')
    model_h5 = load_model('test_model.h5', custom_objects={'Attention': Attention})
    pred2 = model_h5.predict(data_x)
    np.testing.assert_almost_equal(pred1, pred2)
    print('Success.')


if __name__ == '__main__':
    main()

Other Examples

Browse examples.

Install the requirements before running the examples: pip install -r examples/examples-requirements.txt.

IMDB Dataset

In this experiment, we demonstrate that using attention yields a higher accuracy on the IMDB dataset. We consider two LSTM networks: one with this attention layer and the other one with a fully connected layer. Both have the same number of parameters for a fair comparison (250K).
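For illustration, here is a minimal sketch of how the two models could be set up. The layer sizes and preprocessing values below are assumptions chosen for brevity, not the exact configuration used to reach 250K parameters:

from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, Embedding, Flatten, LSTM
from tensorflow.keras.models import Model

from attention import Attention

# Illustrative IMDB preprocessing values (assumptions, not the exact experiment settings).
max_features, max_len = 20000, 200


def build_model(use_attention: bool) -> Model:
    # Shared front end: embedding + LSTM over the review tokens.
    i = Input(shape=(max_len,))
    x = Embedding(max_features, 32)(i)
    x = LSTM(32, return_sequences=True)(x)
    if use_attention:
        # Variant 1: reduce the sequence with the attention layer.
        x = Attention(units=32)(x)
    else:
        # Variant 2: flatten the sequence and use a fully connected layer instead.
        x = Flatten()(x)
        x = Dense(32, activation='relu')(x)
    x = Dense(1, activation='sigmoid')(x)
    model = Model(i, x)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

Both variants share the embedding and LSTM front end, so the only difference is whether the sequence is reduced by attention or by flattening followed by a dense layer.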

Here are the results over 10 runs. For every run, we record the maximum accuracy reached on the test set within 10 epochs.

Measure            No Attention (250K params)   Attention (250K params)
MAX Accuracy       88.22                        88.76
AVG Accuracy       87.02                        87.62
STDDEV Accuracy    0.18                         0.14

As expected, the model with attention achieves a higher accuracy. It also reduces the variability between runs, which is a desirable property.

Adding two numbers

Let's consider the task of adding two numbers that come right after some delimiters (0 in this case):

x = [1, 2, 3, 0, 4, 5, 6, 0, 7, 8]. Result is y = 4 + 7 = 11.
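A hypothetical data generator for this task might look like the following. The function name, value ranges, and shapes are assumptions made for illustration; the actual generator lives in the examples folder:

import numpy as np


def generate_add_task(num_samples=10000, seq_len=10):
    # Random digits 1..9; 0 is reserved as the delimiter.
    x = np.random.randint(1, 10, size=(num_samples, seq_len)).astype('float32')
    y = np.zeros((num_samples, 1), dtype='float32')
    for i in range(num_samples):
        # Pick two delimiter positions that are not adjacent and not at the last index,
        # so each delimiter is followed by a number.
        while True:
            p1, p2 = np.sort(np.random.choice(seq_len - 1, size=2, replace=False))
            if p2 - p1 >= 2:
                break
        x[i, p1] = 0
        x[i, p2] = 0
        # The target is the sum of the two numbers right after the delimiters.
        y[i, 0] = x[i, p1 + 1] + x[i, p2 + 1]
    # Add a feature dimension so the data fits an RNN input of shape (seq_len, 1).
    return x[..., None], y

The generated sequences can then be fed to the same LSTM + Attention + Dense architecture as in the example above.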

The attention is expected to be highest right after the delimiters. During training, we can overlay the attention map (top) with the ground truth (bottom): as training progresses, the model learns the task and the attention map converges to the ground truth.

Finding max of a sequence

We consider many 1D sequences of the same length. The task is to find the maximum of each sequence.

We feed the full sequence of hidden states produced by the RNN layer to the attention layer, and we expect the attention to focus on the time step where the maximum occurs.
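A minimal sketch of this setup, assuming uniformly random sequences and illustrative layer sizes (not the exact example script):

import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import Model

from attention import Attention

seq_len = 20  # illustrative sequence length
x = np.random.uniform(size=(10000, seq_len, 1))
y = x.max(axis=1)  # the target is the maximum of each sequence

i = Input(shape=(seq_len, 1))
h = LSTM(64, return_sequences=True)(i)  # the full sequence of hidden states goes to the attention layer
h = Attention(units=32)(h)              # the attention weights should peak at the time step of the maximum
out = Dense(1)(h)

model = Model(i, out)
model.compile(loss='mae', optimizer='adam')
model.fit(x, y, epochs=5, validation_split=0.1)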

After a few epochs, the attention layer converges perfectly to what we expected.




Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

attention-5.0.0.tar.gz (8.5 kB)

Uploaded Source

Built Distribution

attention-5.0.0-py3-none-any.whl (9.0 kB)

Uploaded Python 3

File details

Details for the file attention-5.0.0.tar.gz.

File metadata

  • Download URL: attention-5.0.0.tar.gz
  • Upload date:
  • Size: 8.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.10

File hashes

Hashes for attention-5.0.0.tar.gz
Algorithm    Hash digest
SHA256       dec0734c8de45be9b15765b4b2fd5c952484246a8bbfa4953b81951948402b8e
MD5          31df6e2f394bbb8499b1b6d37718e8a6
BLAKE2b-256  c33f4f821fbcf4c401ec43b549b67d12bf5dd00eb4545378c336b09a17bdd9f3

See more details on using hashes here.

File details

Details for the file attention-5.0.0-py3-none-any.whl.

File metadata

  • Download URL: attention-5.0.0-py3-none-any.whl
  • Upload date:
  • Size: 9.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.10

File hashes

Hashes for attention-5.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       5605b4b2fb5780f161b525819d94ebdf05ccf5aa5febbd70eeb9c6e9eea239bd
MD5          3178864cc0d20c1e7180fce6967f11c1
BLAKE2b-256  5559e43b191c104ba7f5f289acd11921511838fbab273c1164b954203cf8d966

See more details on using hashes here.
