TensorFlow Keras RNNs with trainable initial states
Treat the initial state(s) of TensorFlow Keras recurrent neural network (RNN) layers as parameters to be learned during training, as recommended in the literature.
Ordinary RNNs use an all-zero initial state by default. Why not let the neural network learn a smarter initial state?
The trainable-initial-state-rnn package provides a class TrainableInitialStateRNN that can wrap any tf.keras RNN (or bidirectional RNN) and manage new initial state variables in addition to the RNN’s weights.
Typical usage looks as follows.
    import tensorflow as tf
    from trainable_initial_state_rnn import TrainableInitialStateRNN

    base_rnn = tf.keras.layers.LSTM(256)
    rnn = TrainableInitialStateRNN(base_rnn)  # Treats initial state as a variable!

    model = tf.keras.Model(...)  # Use rnn like any other tf.keras layer in your model
    model.compile(...)
    history = model.fit(...)
Documentation is available at Read the Docs.
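To make the idea of a learned initial state concrete, here is a minimal sketch written with plain tf.keras only. It is not the package's implementation: the class name `LearnedInitialStateLSTM` and the attributes `h0` and `c0` are hypothetical names chosen for this illustration. The sketch stores one trainable vector per LSTM state, tiles it across the batch, and passes the result as `initial_state`.

```python
import tensorflow as tf


class LearnedInitialStateLSTM(tf.keras.layers.Layer):
    """Hypothetical illustration: an LSTM whose initial states are trainable."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.lstm = tf.keras.layers.LSTM(units)
        # One trainable row vector per LSTM state (hidden state h, cell state c).
        # Both start at zero, matching the default all-zero initial state.
        self.h0 = self.add_weight(
            name="h0", shape=(1, units), initializer="zeros", trainable=True)
        self.c0 = self.add_weight(
            name="c0", shape=(1, units), initializer="zeros", trainable=True)

    def call(self, inputs):
        # Repeat the learned vectors so every sequence in the batch
        # starts from the same learned state.
        batch_size = tf.shape(inputs)[0]
        h0 = tf.tile(self.h0, [batch_size, 1])
        c0 = tf.tile(self.c0, [batch_size, 1])
        return self.lstm(inputs, initial_state=[h0, c0])


# Check that gradients reach the initial-state variables.
x = tf.random.normal((4, 10, 3))  # (batch, time, features)
layer = LearnedInitialStateLSTM(8)
with tf.GradientTape() as tape:
    outputs = layer(x)
    loss = tf.reduce_sum(outputs)
grads = tape.gradient(loss, [layer.h0, layer.c0])
```

Because `h0` and `c0` are ordinary layer weights, an optimizer updates them together with the LSTM's own kernels. A general-purpose wrapper would additionally need to handle arbitrary cell types and bidirectional layers, which this sketch does not attempt.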
- Python >= 3.7
- TensorFlow >= 2.1
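The requirements above can be checked from Python before installing. The following is a small sketch using only the standard library; the helper names `version_tuple`, `meets_minimum`, and `check_requirements` are hypothetical. It assumes TensorFlow is installed under the distribution name `tensorflow` (variants such as CPU-only builds may use a different name) and that `importlib.metadata` is available, which is the case on Python 3.8 and later.

```python
import re
import sys
from importlib import metadata  # Python 3.8+; on 3.7 a backport would be needed


def version_tuple(version):
    """Extract the leading (major, minor) integers from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum):
    """Return True if `version` is at least the (major, minor) tuple `minimum`."""
    return version_tuple(version) >= minimum


def check_requirements():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info[:2] < (3, 7):
        problems.append("Python >= 3.7 is required")
    try:
        # Assumption: the distribution is named "tensorflow".
        tf_version = metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        problems.append("TensorFlow is not installed")
    else:
        if not meets_minimum(tf_version, (2, 1)):
            problems.append(f"TensorFlow >= 2.1 is required, found {tf_version}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```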
Install the latest version directly from GitHub using pip:

    pip install git+https://github.com/artemmavrin/trainable-initial-state-rnn.git

Alternatively, install a recent release from the Python Package Index (PyPI):

    pip install trainable-initial-state-rnn
Note. To install the project for development (e.g., to make changes to the source code), clone the project repository from GitHub and run `make dev`:

    git clone https://github.com/artemmavrin/trainable-initial-state-rnn.git
    cd trainable-initial-state-rnn
    # Optional but recommended: create and activate a new Python virtual environment
    make dev
This will additionally install the requirements needed to run tests, check code coverage, and produce documentation.