
Simple Python/Cython interface to kaldi-asr nnet3/chain and gmm decoders


# py-kaldi-asr

Some simple wrappers around kaldi-asr intended to make using kaldi's online nnet3-chain
decoders as convenient as possible. Kaldi's online GMM decoders are also supported.

The target audience is developers who would like to use kaldi-asr as-is for speech
recognition in their applications on GNU/Linux operating systems.

Constructive comments, patches and pull-requests are very welcome.

Getting Started
===============

We recommend using pre-trained models from the [zamia-speech](http://zamia-speech.org/) project
to get started. There you will also find a tutorial, complete with links to pre-built binary packages,
that gets you up and running with free and open source speech recognition in a matter of minutes:

[Zamia Speech Tutorial](https://github.com/gooofy/zamia-speech#get-started-with-our-pre-trained-models)

Example Code
------------

Simple wav file decoding:

```python
from __future__ import print_function  # keeps print() working on Python 2.7 and 3.x

from kaldiasr.nnet3 import KaldiNNet3OnlineModel, KaldiNNet3OnlineDecoder

MODELDIR = 'data/models/kaldi-generic-en-tdnn_sp-latest'
WAVFILE = 'data/dw961.wav'

# load the model once, then create a decoder for it
kaldi_model = KaldiNNet3OnlineModel(MODELDIR)
decoder = KaldiNNet3OnlineDecoder(kaldi_model)

if decoder.decode_wav_file(WAVFILE):

    # s is the decoded text, l its likelihood
    s, l = decoder.get_decoded_string()

    print()
    print(u"*****************************************************************")
    print(u"**", WAVFILE)
    print(u"**", s)
    print(u"** %s likelihood:" % MODELDIR, l)
    print(u"*****************************************************************")
    print()

else:

    print("***ERROR: decoding of %s failed." % WAVFILE)
```

Please check the examples directory for more example code.
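
The same two calls can be wrapped in a small helper to decode several files in a row. This is only a sketch: `transcribe_files` is a hypothetical name that is not part of py-kaldi-asr, and it assumes a single decoder instance can be reused across files.

```python
from __future__ import print_function


def transcribe_files(decoder, wav_paths):
    """Decode each WAV file in turn.

    Hypothetical helper (not part of py-kaldi-asr). `decoder` is any object
    offering decode_wav_file() and get_decoded_string(), as used above.
    Returns {path: (text, likelihood)}; failed files map to None.
    """
    results = {}
    for path in wav_paths:
        if decoder.decode_wav_file(path):
            results[path] = decoder.get_decoded_string()
        else:
            results[path] = None
    return results


# Usage with a real decoder (requires kaldiasr and a model directory):
#
#   from kaldiasr.nnet3 import KaldiNNet3OnlineModel, KaldiNNet3OnlineDecoder
#   decoder = KaldiNNet3OnlineDecoder(KaldiNNet3OnlineModel(MODELDIR))
#   for path, result in transcribe_files(decoder, ['data/dw961.wav']).items():
#       print(path, result)
```

Building the model and decoder once and passing the decoder in keeps the model loading step out of the per-file loop.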

Requirements
============

* Python 2.7 or 3.5
* NumPy
* Cython
* [kaldi-asr](http://kaldi-asr.org/ "kaldi-asr.org")

Setup Notes
===========

Source
------

At the time of this writing, kaldi-asr does not seem to have an official way to
install it on a system.

So, for now, we rely on pkg-config to provide LIBS and CFLAGS for compilation.
Create a file called `kaldi-asr.pc` somewhere in your `PKG_CONFIG_PATH` that provides
this information:

```
kaldi_root=/opt/kaldi

Name: kaldi-asr
Description: kaldi-asr speech recognition toolkit
Version: 5.2
Requires: atlas
Libs: -L${kaldi_root}/tools/openfst/lib -L${kaldi_root}/src/lib -lkaldi-decoder -lkaldi-lat -lkaldi-fstext -lkaldi-hmm -lkaldi-feat -lkaldi-transform -lkaldi-gmm -lkaldi-tree -lkaldi-util -lkaldi-matrix -lkaldi-base -lkaldi-nnet3 -lkaldi-online2 -lkaldi-cudamatrix -lkaldi-ivector -lfst
Cflags: -I${kaldi_root}/src -I${kaldi_root}/tools/openfst/include
```

Make sure `kaldi_root` points to wherever your kaldi checkout lives in your filesystem.
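
Before building, it can help to confirm that pkg-config actually resolves the file. The following is a minimal sketch, not part of py-kaldi-asr: `pkg_config_flags` is a hypothetical helper that simply runs `pkg-config --cflags --libs` and returns `None` if that fails.

```python
from __future__ import print_function

import os
import subprocess


def pkg_config_flags(package):
    """Return pkg-config's --cflags/--libs output for `package`, or None.

    Hypothetical helper for illustration only.
    """
    try:
        with open(os.devnull, 'w') as devnull:
            out = subprocess.check_output(
                ['pkg-config', '--cflags', '--libs', package],
                stderr=devnull)
    except (OSError, subprocess.CalledProcessError):
        # pkg-config is not installed, or it could not resolve the package
        return None
    return out.decode('utf-8').strip()


flags = pkg_config_flags('kaldi-asr')
if flags is None:
    print('kaldi-asr could not be resolved; check PKG_CONFIG_PATH')
else:
    print(flags)
```

A `None` result does not necessarily mean `kaldi-asr.pc` itself is missing: pkg-config also fails when a package listed under `Requires:` (here `atlas`) cannot be found.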

ATLAS
-----

You may need to install ATLAS headers even if you didn't need them to compile Kaldi.

```
$ sudo apt install libatlas-dev
```

License
=======

My own code is Apache licensed unless otherwise noted in the script's copyright
headers.

Some scripts and files are based on the work of others. In those cases it is my
intention to keep the original license intact. Please make sure to check the
copyright headers inside for more information.

Author
======

Guenter Bartsch <guenter@zamia.org><br/>
Kaldi 5.1 adaptation contributed by mariasmo https://github.com/mariasmo<br/>
Kaldi GMM model support contributed by David Zurow https://github.com/daanzu<br/>
