FSA/FST algorithms, intended to (eventually) be interoperable with PyTorch and similar

Project description

k2

The vision of k2 is to be able to seamlessly integrate Finite State Automaton (FSA) and Finite State Transducer (FST) algorithms into autograd-based machine learning toolkits like PyTorch and TensorFlow. For speech recognition applications, this should make it easy to interpolate and combine various training objectives such as cross-entropy, CTC and MMI and to jointly optimize a speech recognition system with multiple decoding passes including lattice rescoring and confidence estimation. We hope k2 will have many other applications as well.

One of the key algorithms that we have implemented is pruned composition of a generic FSA with a "dense" FSA (i.e. one that corresponds to log-probs of symbols at the output of a neural network). This can be used as a fast implementation of decoding for ASR, and for CTC and LF-MMI training. This won't give a direct advantage in terms of Word Error Rate when compared with existing technology; but the point is to do this in a much more general and extensible framework to allow further development of ASR technology.
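To make the "dense" FSA idea concrete: the network's per-frame log-probs can be viewed as an automaton with one state per frame, where the arc leaving frame t with symbol s carries weight log_probs[t][s]. A minimal sketch of that view (illustrative only, not k2's actual API or data layout):

```python
# Hypothetical per-frame log-probs from a network: 3 frames, 4 symbols.
# In the "dense" FSA view, frame t has one arc per symbol s with weight
# log_probs[t][s]; a path through the automaton picks one symbol per frame.
log_probs = [
    [-0.1, -2.5, -3.0, -3.2],
    [-2.0, -0.3, -2.8, -3.1],
    [-2.2, -2.4, -0.4, -2.9],
]

def dense_fsa_score(log_probs, symbols):
    """Total weight of the path taking symbols[t] at frame t."""
    return sum(frame[s] for frame, s in zip(log_probs, symbols))

print(round(dense_fsa_score(log_probs, [0, 1, 2]), 6))  # -0.8
```

Composing this dense FSA with a generic FSA (e.g. a decoding graph) amounts to intersecting the two sets of paths while summing arc weights; pruning discards unpromising partial paths during that intersection.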

Implementation

A few key points on our implementation strategy.

Most of the code is in C++ and CUDA. We implement a templated class Ragged, which is quite like TensorFlow's RaggedTensor (actually we came up with the design independently, and were later told that TensorFlow was using the same ideas). Despite a close similarity at the level of data structures, the design is quite different from TensorFlow and PyTorch. Most of the time we don't use composition of simple operations, but rely on C++11 lambdas defined directly in the C++ implementations of algorithms. The code in these lambdas operates directly on data pointers and, if the backend is CUDA, can run in parallel for each element of a tensor. (The C++ and CUDA code is mixed together, and the CUDA kernels get instantiated via templates.)
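To illustrate the kind of layout a Ragged object uses, here is a simplified Python sketch; the names values and row_splits mirror the RaggedTensor convention, but this is not k2's actual API:

```python
# Simplified sketch of a ragged layout: a flat values array plus a
# row_splits array of offsets. Row i spans
# values[row_splits[i]:row_splits[i + 1]], so rows can be empty.
values = [1, 2, 3, 4, 5, 6]
row_splits = [0, 2, 2, 5, 6]

def ragged_rows(values, row_splits):
    """Recover the list-of-lists view from the flat layout."""
    return [values[row_splits[i]:row_splits[i + 1]]
            for i in range(len(row_splits) - 1)]

print(ragged_rows(values, row_splits))
# [[1, 2], [], [3, 4, 5], [6]]
```

Because the data lives in flat arrays, an elementwise lambda can be launched over `values` with one thread per element, which is what makes the CUDA backend parallelize well.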

It is difficult to adequately describe what we are doing with these Ragged objects without going through the code in detail. The algorithms look very different from the way you would code them on CPU because of the need to avoid sequential processing. We are using coding patterns that make the most expensive parts of the computations "embarrassingly parallelizable"; the only somewhat nontrivial CUDA operations are generally reduction-type operations such as exclusive prefix sum, for which we use NVIDIA's cub library. Our design is not too specific to NVIDIA hardware and the bulk of the code we write is fairly normal-looking C++; the nontrivial CUDA programming is mostly done via the cub library, parts of which we wrap with our own convenient interface.
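As an example of why exclusive prefix sum shows up everywhere: it is exactly the operation that turns per-row lengths into the row offsets of a ragged layout. A pure-Python sketch of the idea (on GPU, k2 delegates this to cub):

```python
from itertools import accumulate

def exclusive_prefix_sum(xs):
    """[a, b, c] -> [0, a, a+b, a+b+c].

    Turns per-row lengths into row offsets; the final entry is the
    total number of elements.
    """
    return [0] + list(accumulate(xs))

row_lengths = [2, 0, 3, 1]
print(exclusive_prefix_sum(row_lengths))  # [0, 2, 2, 5, 6]
```

Once the offsets exist, every row knows where its data starts without any sequential scan, so the rest of the algorithm can process all elements in parallel.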

The Finite State Automaton object is then implemented as a Ragged tensor templated on a specific data type (a struct representing an arc in the automaton).
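Schematically, that looks like the following (a simplified sketch: the field names resemble k2's arc struct, but the Python classes and layout here are purely illustrative):

```python
from dataclasses import dataclass

@dataclass
class Arc:
    # One arc of the automaton; k2 stores something similar as a plain
    # C++ struct so arcs pack densely into the ragged tensor's values.
    src_state: int
    dest_state: int
    label: int    # symbol on the arc; -1 conventionally marks a final arc
    score: float  # weight (log-probability)

# The FSA is a ragged tensor of arcs: a flat arc list plus row_splits
# indexed by source state.
arcs = [
    Arc(0, 1, 5, -0.2),
    Arc(0, 2, 3, -1.1),
    Arc(1, 2, 7, -0.5),
    Arc(2, 3, -1, 0.0),  # final arc
]
row_splits = [0, 2, 3, 4, 4]  # arcs leaving states 0, 1, 2, 3

def arcs_leaving(state):
    return arcs[row_splits[state]:row_splits[state + 1]]

print([a.label for a in arcs_leaving(0)])  # [5, 3]
```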

Autograd

If you look at the code as it exists now, you won't find any references to autograd. The design is quite different from TensorFlow and PyTorch (which is why we didn't simply extend one of those toolkits). Instead of making autograd come from the bottom up (by making individual operations differentiable) we are implementing it from the top down, which is much more efficient in this case (and will tend to have better roundoff properties).

An example: suppose we are finding the best path of an FSA, and we need derivatives. We implement this by keeping track of, for each arc in the output best-path, which input arc it corresponds to. (For more complex algorithms an arc in the output might correspond to a sum of probabilities of a list of input arcs). We can make this compatible with PyTorch/TensorFlow autograd at the Python level, by, for example, defining a Function class in PyTorch that remembers this relationship between the arcs and does the appropriate (sparse) operations to propagate back the derivatives w.r.t. the weights.
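A toy illustration of the top-down idea in plain Python (not k2's actual API): the search records the index of each input arc used on the best path, so the derivative of the path score w.r.t. each arc weight is simply 1 for used arcs and 0 otherwise, which backprop can apply as a sparse scatter:

```python
# Toy top-down autograd for best-path on a tiny, topologically sorted DAG.
# Each arc is (src_state, dest_state, weight); we track which arc indices
# the best (max-weight) path uses instead of differentiating op by op.
arcs = [(0, 1, 0.5), (0, 1, 0.2), (1, 2, 1.0), (1, 2, 0.3)]

def best_path(arcs, start=0, final=2):
    best = {start: (0.0, [])}  # state -> (score so far, arc indices used)
    for i, (src, dst, w) in enumerate(arcs):
        if src in best:
            score = best[src][0] + w
            if dst not in best or score > best[dst][0]:
                best[dst] = (score, best[src][1] + [i])
    return best[final]

score, used = best_path(arcs)
used_set = set(used)
# d(score)/d(weight_i) is 1 if arc i is on the best path, else 0.
grad = [1.0 if i in used_set else 0.0 for i in range(len(arcs))]
print(score, used, grad)  # 1.5 [0, 2] [1.0, 0.0, 1.0, 0.0]
```

Wrapping this in a PyTorch `torch.autograd.Function` would mean saving `used` in `forward` and scattering the incoming gradient onto those indices in `backward`; no per-operation autograd tape is needed.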

Current state of the code

We have wrapped all the C++ code to Python with pybind11 and have finished the integration with PyTorch.

We are currently writing speech recognition recipes using k2, which are hosted in a separate repository. Please see https://github.com/k2-fsa/icefall.

Plans after initial release

We are currently trying to make k2 ready for production use (see the branch v2.0-pre).

Quick start

Want to try it out without installing anything? We have set up a Google Colab notebook. You can find more Colab notebooks using k2 in speech recognition at https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html.

Project details


Release history

This version

1.14

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

k2-1.14-py38-none-any.whl (64.5 MB)

Uploaded Python 3.8

k2-1.14-py37-none-any.whl (64.5 MB)

Uploaded Python 3.7

k2-1.14-py36-none-any.whl (64.5 MB)

Uploaded Python 3.6

k2-1.14-cp38-cp38-macosx_10_15_x86_64.whl (1.9 MB)

Uploaded CPython 3.8, macOS 10.15+ x86-64

k2-1.14-cp37-cp37m-macosx_10_15_x86_64.whl (1.8 MB)

Uploaded CPython 3.7m, macOS 10.15+ x86-64

k2-1.14-cp36-cp36m-macosx_10_15_x86_64.whl (1.8 MB)

Uploaded CPython 3.6m, macOS 10.15+ x86-64

File details

Details for the file k2-1.14-py38-none-any.whl.

File metadata

  • Download URL: k2-1.14-py38-none-any.whl
  • Upload date:
  • Size: 64.5 MB
  • Tags: Python 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.12

File hashes

Hashes for k2-1.14-py38-none-any.whl
Algorithm Hash digest
SHA256 0dbb5eb7b9009755367674307ecba8d87be6a6330c29aef64f4e147ed7b5ad4a
MD5 db8e5b64d44a1a72bcd36b5d602fbffd
BLAKE2b-256 dd2847791252beeecf6ef998bfc418e0532cda3c422c0cb500b78ab78d74c526

See more details on using hashes here.

File details

Details for the file k2-1.14-py37-none-any.whl.

File metadata

  • Download URL: k2-1.14-py37-none-any.whl
  • Upload date:
  • Size: 64.5 MB
  • Tags: Python 3.7
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.12

File hashes

Hashes for k2-1.14-py37-none-any.whl
Algorithm Hash digest
SHA256 ae9f09ff22ebaae5401daf49aaf9b1aee3fcec2aadbf55d16bdc2008befec904
MD5 c964257f0a9321346335a0d432277aa5
BLAKE2b-256 3a0b1bbb85a20b1dcff79dcb5f90ac8987bd1c38161a4511a2f4a722214403c2

See more details on using hashes here.

File details

Details for the file k2-1.14-py36-none-any.whl.

File metadata

  • Download URL: k2-1.14-py36-none-any.whl
  • Upload date:
  • Size: 64.5 MB
  • Tags: Python 3.6
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.4 CPython/3.6.15

File hashes

Hashes for k2-1.14-py36-none-any.whl
Algorithm Hash digest
SHA256 51c66ef154987cfa1d7d77ff3dc7c305e5d58ccb948457b6d9ceb52492f80621
MD5 e3d47ecf440f368e82647ede0648bed9
BLAKE2b-256 1288e3112e576d7bdc500aed57f213bd99093447ea1a4a0e527ba50d8def187c

See more details on using hashes here.

File details

Details for the file k2-1.14-cp38-cp38-macosx_10_15_x86_64.whl.

File metadata

  • Download URL: k2-1.14-cp38-cp38-macosx_10_15_x86_64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.8, macOS 10.15+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.12

File hashes

Hashes for k2-1.14-cp38-cp38-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 b7679dd3488291b94ae3d3302cf2056fb8ba8d02e97e53b932d098d5af2824ef
MD5 c6b916ad01399b6c92b334ccbdddc478
BLAKE2b-256 5b56bcd37c705ec65ded69fa830d03ecb702c6d6880c50ba654397dc9af5f77f

See more details on using hashes here.

File details

Details for the file k2-1.14-cp37-cp37m-macosx_10_15_x86_64.whl.

File metadata

  • Download URL: k2-1.14-cp37-cp37m-macosx_10_15_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.7m, macOS 10.15+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.12

File hashes

Hashes for k2-1.14-cp37-cp37m-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 9eedd18702d55ec517a2037150991056673aa623f2e8e7391db1b3b7012509d4
MD5 aee0f9580ed5e38c952382ab419261b5
BLAKE2b-256 431e76ca42f06ffb64d5c8871fec23d476acb8ebf50cdf785deb45a095c5bcfb

See more details on using hashes here.

File details

Details for the file k2-1.14-cp36-cp36m-macosx_10_15_x86_64.whl.

File metadata

  • Download URL: k2-1.14-cp36-cp36m-macosx_10_15_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.6m, macOS 10.15+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.4 CPython/3.6.15

File hashes

Hashes for k2-1.14-cp36-cp36m-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 b8d0dbf2c69ce640f123391d07ea9ccb67bae7868d9a1ad1d081f1f5c858cf6f
MD5 9bef6bb7c9592e4064ba22ad8cf8fb47
BLAKE2b-256 0b08278864594725ecbc6e853279cf32992917e5ee2881dd4a1b6137e2cfd200

See more details on using hashes here.
