PyTorch edit-distance functions

Utility functions for end-to-end (E2E) speech recognition training with PyTorch and CUDA.

Here is a simple use case that combines reinforcement learning with the RNN-T loss, weighting each sample's loss by a reward of 1 - WER:

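import torch
import torch_edit_distance

# Token ids of the blank and word-separator symbols, as int tensors on the GPU.
blank = torch.tensor([0], dtype=torch.int).cuda()
space = torch.tensor([1], dtype=torch.int).cuda()

# Sample hypotheses; model, rnnt_loss, zs, ys, xn and yn come from the training code.
xs = model.greedy_decode(xs, sampled=True)

# Strip blank tokens from the hypotheses xs (with lengths xn).
torch_edit_distance.remove_blank(xs, xn, blank)

# Reward each hypothesis by 1 - WER against its reference ys (with lengths yn).
rewards = 1 - torch_edit_distance.compute_wer(xs, ys, xn, yn, blank, space)

nll = rnnt_loss(zs, ys, xn, yn)
loss = nll * rewards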
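To make the reward in the example above concrete, here is a minimal pure-Python sketch of a word error rate on token ids. It is not the package's CUDA implementation: the helper names (split_words, edit_distance, wer_ref) are hypothetical, and it assumes WER is the word-level edit distance divided by the number of reference words, which the package may define differently.

```python
from typing import List, Sequence, Tuple


def split_words(tokens: Sequence[int], space: int) -> List[Tuple[int, ...]]:
    """Split a token sequence into words at the separator id (hypothetical helper)."""
    words, current = [], []
    for token in tokens:
        if token == space:
            if current:  # ignore empty words from leading/repeated separators
                words.append(tuple(current))
                current = []
        else:
            current.append(token)
    if current:
        words.append(tuple(current))
    return words


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Plain Levenshtein distance using two rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def wer_ref(hyp: Sequence[int], ref: Sequence[int], space: int = 1) -> float:
    """Assumed definition: word-level edit distance / number of reference words."""
    hyp_words, ref_words = split_words(hyp, space), split_words(ref, space)
    return edit_distance(hyp_words, ref_words) / max(len(ref_words), 1)


# Two words in the hypothesis, three in the reference: one substitution, one deletion.
reward = 1 - wer_ref([5, 6, 1, 7], [5, 6, 1, 8, 1, 9])
```

Under this definition the WER can exceed 1 when the hypothesis is much longer than the reference, so the reward can become negative.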

levenshtein_distance

Levenshtein edit-distance with detailed statistics for ins/del/sub operations.
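As an illustration of what "detailed statistics" can mean, the sketch below computes the edit distance together with insertion, deletion and substitution counts in pure Python. The function name levenshtein_stats and the returned dictionary are assumptions made for this example, not the package's API or output format.

```python
from typing import Dict, Sequence


def levenshtein_stats(hyp: Sequence[int], ref: Sequence[int]) -> Dict[str, int]:
    """Count the edits that turn ref into hyp (hypothetical reference helper)."""
    # Each cell holds (cost, ins, del, sub) for hyp[:i] versus ref[:j].
    # Empty hypothesis: every reference token counts as a deletion.
    prev = [(j, 0, j, 0) for j in range(len(ref) + 1)]
    for i in range(1, len(hyp) + 1):
        # Empty reference: every hypothesis token counts as an insertion.
        curr = [(i, i, 0, 0)]
        for j in range(1, len(ref) + 1):
            if hyp[i - 1] == ref[j - 1]:
                curr.append(prev[j - 1])  # match: no extra cost
                continue
            c, n_ins, n_del, n_sub = prev[j - 1]
            sub = (c + 1, n_ins, n_del, n_sub + 1)
            c, n_ins, n_del, n_sub = prev[j]
            ins = (c + 1, n_ins + 1, n_del, n_sub)
            c, n_ins, n_del, n_sub = curr[j - 1]
            dele = (c + 1, n_ins, n_del + 1, n_sub)
            # Keep the cheapest option; ties fall to the first one listed.
            curr.append(min(sub, ins, dele, key=lambda cell: cell[0]))
        prev = curr
    cost, n_ins, n_del, n_sub = prev[-1]
    return {"ins": n_ins, "del": n_del, "sub": n_sub, "distance": cost}


stats = levenshtein_stats([1, 2, 3], [1, 3])  # hypothesis has one extra token
```

When several optimal alignments exist, the split between the three counts depends on the tie-breaking rule, so only the total distance is unique.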

collapse_repeated

Merge repeated tokens; useful for CTC-based models.
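The effect on a single sequence can be sketched in a few lines of pure Python. The name collapse_repeated_ref is hypothetical, and the sketch works on a plain list rather than on the tensors the package uses.

```python
from itertools import groupby
from typing import List, Sequence


def collapse_repeated_ref(tokens: Sequence[int]) -> List[int]:
    """Keep one token from every run of identical consecutive tokens."""
    return [token for token, _run in groupby(tokens)]


# A frame-level CTC-style output with blank id 0 (an assumption for this example).
collapsed = collapse_repeated_ref([1, 1, 0, 2, 2, 2, 0, 0, 3])
```

Here collapsed is [1, 0, 2, 0, 3]: the blanks are still present and are only merged like any other token.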

remove_blank

Remove unnecessary blank tokens, useful for CTC, RNN-T, RNA models.
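A pure-Python sketch of the same idea on one sequence is shown below. The name remove_blank_ref and the default blank id 0 are assumptions for this example; the package itself takes the batch of sequences, their lengths and the blank tensor.

```python
from typing import List, Sequence


def remove_blank_ref(tokens: Sequence[int], blank: int = 0) -> List[int]:
    """Drop every occurrence of the blank id, keeping the order of the rest."""
    return [token for token in tokens if token != blank]


# Typical CTC post-processing order: merge repeats first, then drop blanks,
# so that a blank between two identical tokens keeps them distinct.
merged = [1, 0, 1, 2]  # e.g. the result of collapsing [1, 1, 0, 1, 2, 2]
decoded = remove_blank_ref(merged)
```

Here decoded is [1, 1, 2]; removing blanks before merging repeats would have wrongly collapsed the two 1 tokens into one.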

strip_separator

Remove leading, trailing and repeated middle separators.
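The three cases (leading, trailing, repeated in the middle) can be sketched for a single sequence as follows. The name strip_separator_ref and the default separator id 1 are assumptions for this example, not the package's API.

```python
from typing import List, Sequence


def strip_separator_ref(tokens: Sequence[int], separator: int = 1) -> List[int]:
    """Drop leading/trailing separators and squeeze repeated ones to a single one."""
    out: List[int] = []
    for token in tokens:
        # Skip a separator at the start or directly after another separator.
        if token == separator and (not out or out[-1] == separator):
            continue
        out.append(token)
    # At most one separator can remain at the end; drop it.
    if out and out[-1] == separator:
        out.pop()
    return out


stripped = strip_separator_ref([1, 1, 5, 6, 1, 1, 7, 1])
```

Here stripped is [5, 6, 1, 7]: two words separated by exactly one separator.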

Requirements

  • C++11 compiler (tested with GCC 5.4).
  • Python: 3.5, 3.6, 3.7 (tested with version 3.6).
  • PyTorch >= 1.0.0 (tested with version 1.1.0).
  • CUDA Toolkit (tested with version 10.0).

Install

No precompiled binaries are provided. Both installation methods below compile the package locally from source, so the compiler and CUDA Toolkit listed above must be installed first.

From PyPI

pip install torch_edit_distance

From GitHub

git clone https://github.com/1ytic/pytorch-edit-distance
cd pytorch-edit-distance
python setup.py install
