An easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow.

GLU

An easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow. This repository allows you to easily make use of the following activation functions:

  • GLU introduced in the paper Language Modeling with Gated Convolutional Networks [1]
  • Bilinear introduced in the paper Language Modeling with Gated Convolutional Networks [1], attributed to Mnih et al. [2]
  • ReGLU introduced in the paper GLU Variants Improve Transformer [3]
  • GEGLU introduced in the paper GLU Variants Improve Transformer [3]
  • SwiGLU introduced in the paper GLU Variants Improve Transformer [3]
  • SeGLU

Gated Linear Units consist of the component-wise product of two linear projections, one of which is first passed through a sigmoid function. Variations on GLU are possible, using different nonlinear (or even linear) functions in place of the sigmoid. According to the GLU Variants Improve Transformer paper [3], in a fine-tuning scenario the new variants seem to produce better perplexities for the de-noising objective used in pre-training, as well as better results on many downstream language-understanding tasks. Furthermore, these variants do not have any apparent computational drawbacks.
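
To make the description above concrete, here is a minimal pure-Python sketch of a gated unit: two linear projections of the same input, one passed through a gate function, multiplied element-wise. It is an illustration only, not the glu-tf implementation. The helper names (`project`, `gated_unit`, `GATES`) are hypothetical, bias terms are omitted for brevity, Swish is taken with beta = 1, and SeGLU is left out because its gate function is not defined in this README.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gelu(z):
    """Exact GELU: z * Phi(z), where Phi is the standard normal CDF."""
    return 0.5 * z * (1.0 + math.erf(z / math.sqrt(2.0)))


# Gate function for each variant (hypothetical lookup table for this sketch).
GATES = {
    "GLU": sigmoid,
    "Bilinear": lambda z: z,             # no nonlinearity at all
    "ReGLU": lambda z: max(0.0, z),      # ReLU
    "GEGLU": gelu,
    "SwiGLU": lambda z: z * sigmoid(z),  # Swish with beta = 1 (assumption)
}


def project(x, weights):
    """Compute x @ weights for a vector x and a matrix stored as a list of rows."""
    n_out = len(weights[0])
    return [sum(x_i * row[j] for x_i, row in zip(x, weights)) for j in range(n_out)]


def gated_unit(x, w, v, variant="GLU"):
    """Return gate(x @ w) * (x @ v), element-wise, with bias terms omitted."""
    gate = [GATES[variant](z) for z in project(x, w)]
    value = project(x, v)
    return [g * u for g, u in zip(gate, value)]


# With identity projections, Bilinear reduces to squaring each component.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(gated_unit([0.0, 2.0], identity, identity, "Bilinear"))  # [0.0, 4.0]
```

Swapping the `variant` argument is the only change needed to move between the activations, which mirrors the point that the variants differ only in the function applied to the gate projection.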

Installation

Run the following to install:

pip install glu-tf

Developing glu-tf

To install glu-tf along with the tools you need to develop and test it, run the following in your virtualenv:

git clone https://github.com/Rishit-dagli/GLU.git
# or clone your own fork

cd GLU
pip install -e .[dev]

Usage

In this section, I show a minimal example of using the SwiGLU activation function, but you can use the other activations in a similar manner:

import tensorflow as tf
from glu_tf import SwiGLU

# Build a small model: a Dense projection followed by the SwiGLU activation.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units=10))
model.add(SwiGLU(bias=False, dim=-1, name='swiglu'))
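
The `dim` argument suggests that the layer gates along one axis of its input. One common convention for gated activations is to split the input in half along that axis, use one half as the value and the other half as the gate, and return a tensor half as wide. The sketch below illustrates that convention in pure Python for a single feature vector. Whether glu-tf follows exactly this convention (including which half acts as the gate) is an assumption, not something this README states, so check the library source before relying on it. The function name `swiglu_split` is hypothetical.

```python
import math


def swish(z):
    """Swish with beta = 1: z * sigmoid(z), in a numerically stable form."""
    if z >= 0:
        return z / (1.0 + math.exp(-z))
    e = math.exp(z)
    return z * e / (1.0 + e)


def swiglu_split(features):
    """Split a vector in half and scale the first half by Swish of the second.

    Assumption: first half is the value, second half is the gate.
    """
    if len(features) % 2 != 0:
        raise ValueError("feature width must be even to split into value and gate")
    half = len(features) // 2
    value, gate = features[:half], features[half:]
    return [v * swish(g) for v, g in zip(value, gate)]


# A width-10 vector, like the output of Dense(units=10) above, becomes width 5.
dense_output = [float(i) for i in range(10)]
gated = swiglu_split(dense_output)
print(len(gated))  # 5
```

Under this convention the width of the preceding layer must be even, and any layer that follows sees half as many features.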

Want to Contribute 🙋‍♂️?

Awesome! If you want to contribute to this project, you're always welcome! See the Contributing Guidelines. You can also take a look at the open issues for more information about current or upcoming tasks.

Want to discuss? 💬

Have any questions or doubts, or want to share your opinions or views? You're always welcome. You can start discussions.

References

[1] Dauphin, Yann N., et al. ‘Language Modeling with Gated Convolutional Networks’. ArXiv:1612.08083 [Cs], Sept. 2017. arXiv.org, http://arxiv.org/abs/1612.08083.

[2] Mnih, Andriy, and Geoffrey Hinton. ‘Three New Graphical Models for Statistical Language Modelling’. Proceedings of the 24th International Conference on Machine Learning, 2007, pp. 641–648.

[3] Shazeer, Noam. ‘GLU Variants Improve Transformer’. ArXiv:2002.05202 [Cs, Stat], Feb. 2020. arXiv.org, http://arxiv.org/abs/2002.05202.
