
Haste: a fast, simple, and open RNN library.


Haste is a CUDA implementation of fused RNN layers with built-in DropConnect and Zoneout regularization. These layers are exposed through C++ and Python APIs for easy integration into your own projects or machine learning frameworks.

Which RNN types are supported?

  • GRU
  • IndRNN
  • LSTM
  • Layer-normalized GRU (LayerNormGRU)
  • Layer-normalized LSTM (LayerNormLSTM)

What's included in this project?

  • a standalone C++ API (libhaste)
  • a TensorFlow Python API (haste_tf)
  • a PyTorch API (haste_pytorch)
  • examples for writing your own custom C++ inference / training code using libhaste
  • benchmarking programs to evaluate the performance of RNN implementations

For questions or feedback about Haste, please open an issue on GitHub or send us an email at haste@lmnt.com.

Install

Here's what you'll need to get started: a CUDA Toolkit (see the note on CUDA_HOME below) and, for the framework bindings, a working TensorFlow or PyTorch installation.

Once you have the prerequisites, you can install with pip or by building from source.

Using pip

pip install haste_pytorch
pip install haste_tf

Building from source

make               # Build everything
make haste         # ;) Build C++ API
make haste_tf      # Build TensorFlow API
make haste_pytorch # Build PyTorch API
make examples      # Build C++ examples
make benchmarks    # Build benchmark programs

If you built the TensorFlow or PyTorch API, install it with pip:

pip install haste_tf-*.whl
pip install haste_pytorch-*.whl

If the CUDA Toolkit that you're building against is not in /usr/local/cuda, you must specify the $CUDA_HOME environment variable before running make:

CUDA_HOME=/usr/local/cuda-10.2 make

Performance

Our LSTM and GRU benchmarks indicate that Haste has the fastest publicly available implementation for nearly all problem sizes; the GRU results are qualitatively similar to the LSTM results.

The complete LSTM benchmark grid covers batch sizes N ∈ {1, 32, 64, 128} crossed with hidden sizes C ∈ {64, 128, 256, 512}, with one chart per (N, C) configuration.

Documentation

TensorFlow API

import tensorflow as tf
import haste_tf as haste

gru_layer = haste.GRU(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)
indrnn_layer = haste.IndRNN(num_units=256, direction='bidirectional', zoneout=0.1)
lstm_layer = haste.LSTM(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)
norm_gru_layer = haste.LayerNormGRU(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)
norm_lstm_layer = haste.LayerNormLSTM(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)

# `x` is a tensor with shape [N,T,C]
x = tf.random.normal([5, 25, 128])

y, state = gru_layer(x, training=True)
y, state = indrnn_layer(x, training=True)
y, state = lstm_layer(x, training=True)
y, state = norm_gru_layer(x, training=True)
y, state = norm_lstm_layer(x, training=True)

The TensorFlow Python API is documented in docs/tf/haste_tf.md.
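
DropConnect and Zoneout are training-time regularizers, so at inference time the layers would be called without training enabled. Here is a minimal sketch reusing `lstm_layer` and `x` from the example above; the assumption that `training=False` disables both regularizers follows from the `training` flag shown there, not from the API reference:

# Inference sketch: reuse `lstm_layer` and `x` from the example above.
# Assumption: passing training=False disables DropConnect and Zoneout.
y, state = lstm_layer(x, training=False)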

PyTorch API

import torch
import haste_pytorch as haste

gru_layer = haste.GRU(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)
indrnn_layer = haste.IndRNN(input_size=128, hidden_size=256, zoneout=0.1)
lstm_layer = haste.LSTM(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)
norm_gru_layer = haste.LayerNormGRU(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)
norm_lstm_layer = haste.LayerNormLSTM(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)

gru_layer.cuda()
indrnn_layer.cuda()
lstm_layer.cuda()
norm_gru_layer.cuda()
norm_lstm_layer.cuda()

# `x` is a CUDA tensor with shape [T,N,C]
x = torch.rand([25, 5, 128]).cuda()

y, state = gru_layer(x)
y, state = indrnn_layer(x)
y, state = lstm_layer(x)
y, state = norm_gru_layer(x)
y, state = norm_lstm_layer(x)

The PyTorch API is documented in docs/pytorch/haste_pytorch.md.
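
As a quick illustration of how these layers fit into a training loop, here is a minimal single-step sketch using the `lstm_layer` defined above; the optimizer and the squared-output loss are illustrative placeholders, not part of Haste:

import torch

# Minimal training-step sketch, reusing `lstm_layer` from the example above
# (already moved to CUDA). The loss is a dummy placeholder that simply
# exercises the backward pass through Haste's fused kernels.
optimizer = torch.optim.Adam(lstm_layer.parameters(), lr=1e-3)

x = torch.rand([25, 5, 128]).cuda()   # [T,N,C] input, as in the example above

y, state = lstm_layer(x)              # forward pass
loss = y.pow(2).mean()                # placeholder loss

optimizer.zero_grad()
loss.backward()                       # backprop through the fused CUDA kernels
optimizer.step()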

C++ API

The C++ API is documented in lib/haste/*.h and there are code samples in examples/.

Code layout

  • benchmarks/: programs to evaluate performance of RNN implementations
  • docs/tf/: API reference documentation for haste_tf
  • docs/pytorch/: API reference documentation for haste_pytorch
  • examples/: examples for writing your own C++ inference / training code using libhaste
  • frameworks/tf/: TensorFlow Python API and custom op code
  • frameworks/pytorch/: PyTorch API and custom op code
  • lib/: CUDA kernels and C++ API
  • validation/: scripts to validate output and gradients of RNN layers

Implementation notes

  • the GRU implementation is based on 1406.1078v1 (same as cuDNN) rather than 1406.1078v3
  • Zoneout on LSTM cells is applied to the hidden state only, not the cell state (sketched below)
  • the layer-normalized LSTM implementation uses its own set of equations, given in the project's README
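
To sketch the Zoneout note above (following [4], not Haste's kernel source): at each step a per-unit Bernoulli mask decides whether a hidden unit keeps its previous value or takes the freshly computed one, while the cell state follows the standard LSTM update:

\begin{aligned}
  d_t &\sim \operatorname{Bernoulli}(p) && \text{$p$ = zoneout rate, mask sampled per unit} \\
  h_t &= d_t \odot h_{t-1} + (1 - d_t) \odot \tilde{h}_t && \text{hidden state: old value kept where } d_t = 1 \\
  c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state: standard LSTM update, no zoneout}
\end{aligned}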

References

  1. Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
  2. Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078 [cs, stat]. http://arxiv.org/abs/1406.1078.
  3. Wan, L., Zeiler, M., Zhang, S., Cun, Y. L., & Fergus, R. (2013). Regularization of Neural Networks using DropConnect. In International Conference on Machine Learning (pp. 1058–1066). Presented at the International Conference on Machine Learning. http://proceedings.mlr.press/v28/wan13.html.
  4. Krueger, D., Maharaj, T., Kramár, J., Pezeshki, M., Ballas, N., Ke, N. R., et al. (2017). Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations. arXiv:1606.01305 [cs]. http://arxiv.org/abs/1606.01305.
  5. Ba, J., Kiros, J.R., & Hinton, G.E. (2016). Layer Normalization. arXiv:1607.06450 [cs, stat]. https://arxiv.org/abs/1607.06450.
  6. Li, S., Li, W., Cook, C., Zhu, C., & Gao, Y. (2018). Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN. arXiv:1803.04831 [cs]. http://arxiv.org/abs/1803.04831.

Citing this work

To cite this work, please use the following BibTeX entry:

@misc{haste2020,
  title  = {Haste: a fast, simple, and open RNN library},
  author = {Sharvil Nanavati},
  year   = 2020,
  month  = "Jan",
  howpublished = {\url{https://github.com/lmnt-com/haste/}},
}

License

Apache 2.0

