No project description provided
Project description
Introduction
This project implements the optimization techniques proposed in Improving RNN Transducer Modeling for End-to-End Speech Recognition to reduce the memory consumption for computing transducer loss.
How does it differ from the RNN-T loss from torchaudio
Actually, the implementation is based on torchaudio, so the two are functionally equivalent, i.e., they produce the same output for the same input.
However, this project is more memory efficient and potentially faster (TODO: This needs some benchmarks)
How does it differ from warp-transducer
I don't have much experience with warp-transducer. But I know that warp-transducer produces different gradients for CPU and CUDA when using the same input. See https://github.com/HawkAaron/warp-transducer/issues/93
optimized_transducer
uses less memory than that of warp-transducer.
Installation
You can install it via pip
:
pip install optimized_transducer
Installation FAQ
What operating systems are supported ?
It has been tested on Ubuntu 18.04. It should also work on macOS and other unixes systems. It may work on Windows, though it is not tested.
How to display installation log ?
Use
pip install --verbose optimized_transducer
How to reduce installation time ?
Use
export OT_MAKE_ARGS="-j"
pip install --verbose optimized_transducer
It will pass -j
to make
.
Which version of PyTorch is supported ?
It has been tested on PyTorch >= 1.5.0. It may work on PyTorch < 1.5.0
How to install a CPU version of optimized_transducer
?
Use
export OT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DOT_WITH_CUDA=OFF"
export OT_MAKE_ARGS="-j"
pip install --verbose optimized_transducer
It will pass -DCMAKE_BUILD_TYPE=Release -DOT_WITH_CUDA=OFF
to cmake
.
What Python versions are supported ?
Python >= 3.6 is known to work. It may work for Python 2.7, though it is not tested.
Where to get help if I have problems with the installation ?
Please file an issue at https://github.com/csukuangfj/optimized_transducer/issues and describe your problem there.
Usage
optimized_transducer
expects that the output shape of the joint network is
NOT (N, T, U, V)
, but is (sum_all_TU, V)
, which is a concatenation
of 2-D tensors: (T_1 * U_1, V)
, (T_2 * U_2, V)
, ..., (T_N, U_N, V)
.
Note: (T_1 * U_1, V)
is just the reshape of a 3-D tensor (T_1, U_1, V)
.
Suppose your original joint network looks somewhat like the following:
encoder_out = torch.rand(N, T, D) # from the encoder
decoder_out = torch.rand(N, U, D) # from the decoder, i.e., the prediction network
encoder_out = encoder_out.unsqueeze(2) # Now encoder out is (N, T, 1, D)
decoder_out = decoder_out.unsqueeze(1) # Now decoder out is (N, T, 1, D)
x = encoder_out + decoder_out # x is of shape (N, T, U, D)
activation = torch.tanh(x)
logits = linear(activation) # linear is an instance of `nn.Linear`.
loss = torchaudio.functional.rnnt_loss(
logits=logits,
targets=targets,
logit_lengths=logit_lengths,
target_lengths=target_lengths,
blank=blank_id,
reduction="mean",
)
You need to change it to the following:
encoder_out = torch.rand(N, T, D) # from the encoder
decoder_out = torch.rand(N, U, D) # from the decoder, i.e., the prediction network
encoder_out_list = [encoder_out[i, :logit_lengths[i], :] for i in range(N)]
decoder_out_list = [decoder_out[i, :target_lengths[i]+1, :] for i in range(N)]
x = [e.unsqueeze(1) + d.unsqueeze(0) for e, d in zip(encoder_out_list, decoder_out_list)]
x = [p.reshape(-1, D) for p in x]
x = torch.cat(x)
activation = torch.tanh(x)
logits = linear(activation) # linear is an instance of `nn.Linear`.
loss = optimized_transducer.transducer_loss(
logits=logits,
targets=targets,
logit_lengths=logit_lengths,
target_lengths=target_lengths,
blank=blank_id,
reduction="mean",
)
For more usages, please refer to
- https://github.com/csukuangfj/optimized_transducer/blob/master/optimized_transducer/python/optimized_transducer/transducer_loss.py
- https://github.com/csukuangfj/optimized_transducer/blob/master/optimized_transducer/python/tests/test_cuda.py
- https://github.com/csukuangfj/optimized_transducer/blob/master/optimized_transducer/python/tests/test_compute_transducer_loss.py
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.