Python wrapper for Kaldi

pydrobert-kaldi

Some Kaldi bindings for Python. I started this project because I wanted to seamlessly incorporate Kaldi's I/O mechanism into the gamut of Python-based data science packages (e.g. Theano, TensorFlow, CNTK, PyTorch). The code base is expanding to wrap more of Kaldi's feature processing and mathematical functions.

Eventually, I plan on adding hooks for Kaldi audio features and pre-/post-processing. However, I have no plans to port any code involving modelling or decoding.

This is student-driven code, so don't expect a stable API. I'll try to use semantic versioning, but the best way to keep functionality stable is by forking.

Documentation

Input/Output

Most I/O can be performed with the pydrobert.kaldi.io.open function:

from pydrobert.kaldi import io

# 'scp:foo.scp' names a Kaldi script table; 'bm' selects the BaseMatrix data type
with io.open('scp:foo.scp', 'bm') as f:
    for matrix in f:
        ...

open is a factory function that determines the appropriate underlying stream to open, much like Python's built-in open. The data types we can read (here, a BaseMatrix) are listed in pydrobert.kaldi.io.enums.KaldiDataType. Big data types, like matrices and vectors, are piped into NumPy arrays.

Passing an extended filename (e.g. a path to a file on disk, '-' for stdin/stdout, 'gzip -c a.ark.gz |', etc.) opens a stream from which data types can be read one by one, in the order they were written. Alternatively, prepending the extended filename with 'ark[,option_a[,option_b...]]:' or 'scp[,...]:' and specifying a data type opens a Kaldi table for iterator-like sequential reading (mode='r'), dict-like random access reading (mode='r+'), or writing (mode='w'). For more information on the open function, consult the docstring.
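To make the three table modes concrete, here is a minimal sketch that writes two matrices to an archive and reads them back sequentially and by key. The file name demo.ark, the keys utt1 and utt2, and the helper name round_trip_demo are made up for illustration. The write(key, value) and items() calls reflect my understanding of the table objects returned by open; check the docstrings before relying on them.

```python
import os
import tempfile


def round_trip_demo(directory):
    """Write two matrices to a Kaldi archive, then read them back two ways."""
    # Imported lazily so that defining this function does not itself
    # require NumPy or pydrobert-kaldi to be installed.
    import numpy as np
    from pydrobert.kaldi import io

    # Hypothetical file name and utterance keys, chosen for illustration.
    specifier = "ark:" + os.path.join(directory, "demo.ark")
    matrices = {
        "utt1": np.ones((3, 4), dtype=np.float32),
        "utt2": np.zeros((2, 4), dtype=np.float32),
    }

    # mode='w': write key/value pairs to the table.
    # (write(key, value) is assumed from the package docs.)
    with io.open(specifier, "bm", mode="w") as writer:
        for key in sorted(matrices):
            writer.write(key, matrices[key])

    # mode='r' (the default): iterate in the order entries were written.
    # (items() yielding (key, value) pairs is assumed from the package docs.)
    with io.open(specifier, "bm") as reader:
        sequential = [(key, mat.shape) for key, mat in reader.items()]

    # mode='r+': dict-like lookup by key.
    with io.open(specifier, "bm", mode="r+") as lookup:
        second = lookup["utt2"]

    return sequential, second
```

Calling round_trip_demo on a scratch directory (for example one created with tempfile.TemporaryDirectory) should return the keys with their shapes, plus the matrix stored under utt2.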

The submodule pydrobert.kaldi.io.corpus contains useful wrappers around Kaldi I/O to serve up batches of data to, say, a neural network:

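from pydrobert.kaldi.io.corpus import ShuffledData

# each iteration yields one batch of features and the matching batch of labels
train = ShuffledData('scp:feats.scp', 'scp:labels.scp', batch_size=10)
for feat_batch, label_batch in train:
    ...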

Logging and CLI

By default, Kaldi error, warning, and critical messages are piped to standard error. The pydrobert.kaldi.logging submodule provides hooks into Python's native logging interface: the logging module. KaldiLogger can handle stack traces from Kaldi C++ code, and there are a variety of decorators to map Kaldi logging patterns to Python logging patterns, or vice versa.
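As a rough sketch of how these hooks might be wired up: the helper below creates a logger of class KaldiLogger and registers it to receive Kaldi messages. The helper name make_kaldi_aware_logger and the logger name my_kaldi_tool are hypothetical. The call to register_logger_for_kaldi is based on my reading of the submodule's documentation, so treat it as an assumption and confirm against the docs.

```python
import logging


def make_kaldi_aware_logger(name="my_kaldi_tool"):
    """Return a logger that receives Kaldi's messages (sketch only)."""
    # Imported lazily so that defining this function does not itself
    # require pydrobert-kaldi to be installed.
    from pydrobert.kaldi.logging import KaldiLogger, register_logger_for_kaldi

    # Temporarily make KaldiLogger the class used for new loggers, then
    # restore whatever class was configured before.
    previous_class = logging.getLoggerClass()
    logging.setLoggerClass(KaldiLogger)
    try:
        logger = logging.getLogger(name)
    finally:
        logging.setLoggerClass(previous_class)

    # Assumption: once a logger is registered, Kaldi messages are routed to
    # it rather than written straight to standard error.
    register_logger_for_kaldi(name)
    return logger
```

The returned object is an ordinary logging.Logger subclass, so handlers, levels, and formatters can be attached in the usual way.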

You'd likely want to explicitly handle logging when creating new Kaldi-style commands for the command line. pydrobert.kaldi.io.argparse provides KaldiParser, an ArgumentParser tailored to Kaldi inputs/outputs. It is used by a few command-line entry points added by this package. See the Command-Line Interface page for details.
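A minimal sketch of a Kaldi-style parser follows. The helper name build_copy_parser and the argument names are invented for illustration. The type strings 'kaldi_rspecifier' and 'kaldi_wspecifier' are, as far as I recall, registered by KaldiParser, but confirm them against the package documentation.

```python
def build_copy_parser():
    """Build a parser for a hypothetical table-copying command."""
    # Imported lazily so that defining this function does not itself
    # require pydrobert-kaldi to be installed.
    from pydrobert.kaldi.io.argparse import KaldiParser

    parser = KaldiParser(description="Copy one Kaldi table to another")
    # Assumed type names: they validate that the argument looks like a
    # Kaldi read/write specifier such as 'scp:feats.scp' or 'ark:out.ark'.
    parser.add_argument(
        "in_rspecifier", type="kaldi_rspecifier", help="Table to read from"
    )
    parser.add_argument(
        "out_wspecifier", type="kaldi_wspecifier", help="Table to write to"
    )
    return parser
```

A command would then call build_copy_parser().parse_args() and pass the two specifiers to pydrobert.kaldi.io.open.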

Installation

Prepackaged binaries of tagged versions of pydrobert-kaldi are available for most 64-bit platforms (Windows, glibc-based Linux, macOS) and most active Python versions (3.7-3.11) on both conda and PyPI.

To install via conda

   conda install -c sdrobert pydrobert-kaldi

A conda-forge version is TBD.

To install via PyPI

   pip install pydrobert-kaldi

You can also try building the cutting-edge version. To do so, first install SWIG 4.0 and an appropriate C++ compiler, then run

   pip install git+https://github.com/sdrobert/pydrobert-kaldi.git

The current version does not require a BLAS install, though it likely will in the future as more is wrapped.
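After installing by any of the routes above, one way to confirm which version ended up in the environment is to query the distribution metadata. This sketch uses only the standard library (importlib.metadata, available from Python 3.8 onward). The helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name="pydrobert-kaldi"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_version() returns None when the package is missing, and otherwise a version string.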

License

This code is licensed under Apache 2.0.

Code found under the src/ directory has been primarily copied from Kaldi. setup.py is also strongly influenced by Kaldi's build configuration. Kaldi is also covered by the Apache 2.0 license; its specific license file was copied into src/COPYING_Kaldi_Project to live among its fellows.

How to Cite

Please see the pydrobert page for more details.
