Deep learning library for attention-based approaches.
Fluence
Fluence is a deep learning library for transformers and self-attention based methods.
The library contains implementations of the following approaches (many more to come):
- Adaptive Attention Span in Transformers
- Adaptively Sparse Transformers
- Reducing Transformer Depth on Demand with Structured Dropout
- Optimizers: Lamb, Lookahead
Code Structure
fluence
- adaptive # Adaptive computation modules
- models # Models
- optimizers # Optimizers
- tests # Unit tests
Documentation
Please head to the documentation to learn how you can integrate Fluence into your workflow.
Usage
Right now, the library covers the major adaptive-computation approaches, which have been tested with transformers. Here are some examples.
Using Adaptive Attention Span
import torch
from fluence.adaptive.adaptive_span import AdaptiveSpan
config = {'attn_span': 1024, 'adapt_span_loss_coeff': 0.000005, 'adapt_span_ramp': 32,
'adapt_span_init': 0.002, 'adapt_span_cache': True, 'nb_heads': 12,'bs': 128,
'mask_size': [20,36]}
adaptive_span = AdaptiveSpan(**config)
adaptive_span.get_current_avg_span()  # Returns the average span
adaptive_span.get_current_max_span()  # Returns the maximum span
adaptive_span.get_trim_len()          # Returns the length that can be trimmed
adaptive_span.clamp_param()           # Clamps the span parameters to stay within [0, 1]
attention_scores_0 = torch.randn(128, 12, 26, 36)  # (bs, nb_heads, query_len, key_len) scores coming from softmax
attention_scores_1 = torch.randn(128, 12, 26, 20)  # Same layout, with the other mask size
adaptive_span(attention_scores_0).shape  # The soft masking function is multiplied with the scores; shape is preserved
adaptive_span(attention_scores_1).shape
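To give a sense of where this sits in a model, here is a minimal, hypothetical sketch of an attention computation that applies the span mask to the softmax weights (only the adaptive_span call comes from fluence; the rest is plain PyTorch and reuses the adaptive_span object defined above):
import torch.nn.functional as F

def masked_attention(query, key, value, adaptive_span):
    # query: (bs, nb_heads, query_len, head_dim); key, value: (bs, nb_heads, key_len, head_dim)
    scores = torch.matmul(query, key.transpose(-1, -2)) / query.size(-1) ** 0.5
    weights = F.softmax(scores, dim=-1)
    weights = adaptive_span(weights)  # soft-mask positions beyond the learned span
    return torch.matmul(weights, value)

out = masked_attention(torch.randn(128, 12, 26, 64),
                       torch.randn(128, 12, 36, 64),
                       torch.randn(128, 12, 36, 64), adaptive_span)  # out: (128, 12, 26, 64)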
Using Entmax as a replacement for softmax with learnable alpha values
import torch
from fluence.adaptive.entmax import EntmaxAlpha

num_attention_heads = 12
entmax_alpha = EntmaxAlpha(num_attention_heads)
attention_scores = entmax_alpha(att_scores=torch.rand(128, 12, 26, 36))
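Since entmax produces a normalized distribution just like softmax, but (for alpha > 1) can push some weights to exactly zero, it can be dropped in wherever softmax normalizes attention scores. A quick sanity check on the output above (assuming the module returns the normalized attention weights, as the call suggests):
attention_scores.shape           # same shape as the input scores: (128, 12, 26, 36)
attention_scores.sum(dim=-1)     # each row sums to 1, like softmax
(attention_scores == 0).sum()    # number of weights pushed to exactly zero once alpha > 1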
Using Layerdrop
import torch
from torch import nn
from fluence.adaptive.layerdrop import LayerDrop

net = nn.ModuleList([nn.Linear(2, 2) for i in range(3)])
layers_to_drop = 2
layerdrop = LayerDrop(net, layers_to_drop)
output = layerdrop(torch.rand(10, 2))
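In practice the wrapped ModuleList would hold transformer layers rather than toy linear layers; a hedged sketch under that assumption (only LayerDrop itself comes from fluence, the encoder stack is plain PyTorch):
import torch
from torch import nn
from fluence.adaptive.layerdrop import LayerDrop

encoder_layers = nn.ModuleList([nn.TransformerEncoderLayer(d_model=16, nhead=4) for _ in range(6)])
layerdrop = LayerDrop(encoder_layers, 2)    # skip 2 of the 6 layers on each forward pass
hidden = layerdrop(torch.rand(10, 32, 16))  # (seq_len, batch, d_model)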
fluence.models
Contains:
- LXMERT (cross-modal vision-and-language transformer) with the adaptive capabilities from fluence.adaptive
import torch
from fluence.models import LXMERT_Adaptive  # adjust the import path if LXMERT_Adaptive lives in a submodule

# Define a params dict which contains the configuration
params = {'adapt_span_enabled': False, 'attn_span': 1024, 'adapt_span_loss_coeff': 0.000005,
'adapt_span_ramp': 32, 'adapt_span_init': 0.002, 'adapt_span_cache': True, 'nb_heads': 12,
'bs': 128, 'mask_size': [20,36], 'sparse_enabled': True, 'num_attention_heads': 4,
'layer_sizes': {'lang':9,'cross':5,'vision':5}, 'from_scratch': False,
'layerdrop_enabled': False, 'layerdrop_num_layers': 1, 'max_seq_len': 20}
# Right now, LXMERT requires pre-extracted region features, not raw images
feat = torch.rand(128, 36, 2048)  # ROI features from an object detector (e.g. Faster R-CNN)
# pos holds the bounding-box coordinates of the detected regions
pos = torch.rand(128, 36, 4)
# Questions associated with the images
ques = ['Are there any people in this photo?'] * 128
# 3129 is the number of candidate answers in the dataset
model = LXMERT_Adaptive(3129, params)
logits = model(feat, pos, ques)
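The returned logits should carry one score per candidate answer for every image-question pair in the batch, so predictions are a plain argmax (the answer-index-to-string mapping is dataset specific and not shown):
pred_answer_ids = logits.argmax(dim=-1)  # expected shape: (128,), one predicted answer index per example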
fluence.optimizers
import torchvision
from fluence.optimizers.lamb import Lamb
from fluence.optimizers.lookahead import Lookahead

model = torchvision.models.AlexNet()  # Can be a transformer
base_optim = Lamb(params=model.parameters(), lr=1e-5, weight_decay=1.2e-6, min_trust=0.25)
optim = Lookahead(base_optimizer=base_optim, k=5, alpha=0.8)
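Lookahead wraps the base optimizer and is then driven like any other PyTorch optimizer. A minimal training-step sketch under that assumption, continuing from the snippet above with dummy data (none of this is fluence-specific):
import torch
import torch.nn.functional as F

images = torch.randn(8, 3, 224, 224)  # dummy batch for AlexNet
labels = torch.randint(0, 1000, (8,))
loss = F.cross_entropy(model(images), labels)
model.zero_grad()
loss.backward()
optim.step()  # Lookahead steps the fast weights and periodically syncs the slow weights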
Acknowledgements
Author: Prajjwal Bhargava (@prajjwal_1)