"Utilities for the python package 'param'"
pydrobert-speech
This pure-python library allows for flexible computation of speech features.
For example, given a feature configuration called fbank.json:
    {
      "name": "stft",
      "bank": "fbank",
      "frame_length_ms": 25,
      "include_energy": true,
      "pad_to_nearest_power_of_two": true,
      "window_function": "hanning",
      "use_power": true
    }
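The configuration is ordinary JSON, so it can also be generated from Python. The following is a minimal sketch using only the standard library. The helper name `write_config` and the constant `FBANK_CONFIG` are hypothetical and not part of pydrobert-speech; the keys and values are copied from the configuration above.

```python
import json

# Same keys and values as the configuration shown above.
FBANK_CONFIG = {
    "name": "stft",
    "bank": "fbank",
    "frame_length_ms": 25,
    "include_energy": True,
    "pad_to_nearest_power_of_two": True,
    "window_function": "hanning",
    "use_power": True,
}


def write_config(path, config=FBANK_CONFIG):
    """Write a feature configuration to `path` as JSON (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return path
```

Calling `write_config("fbank.json")` would produce a file with the same content as the configuration above.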
You can compute triangular, overlapping filters like Kaldi or HTK with the commands
    import json
    from pydrobert.speech import compute, util

    # get the feature computer ready
    with open('fbank.json') as f:
        params = json.load(f)
    computer = util.alias_factory_subclass_from_arg(compute.FrameComputer, params)

    # assume "signal" is a numpy float array
    feats = computer.compute_full(signal)
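To make "triangular, overlapping filters" concrete, here is a minimal pure-Python sketch of how such a filter bank can be laid out on the mel scale. It is not pydrobert-speech's implementation. The function names, the mel formula 2595 log10(1 + f/700), the default of 40 filters, the frequency range, and the choice to make the triangles linear in mel rather than in Hz are all assumptions for illustration.

```python
import math


def hz_to_mel(hz):
    """Convert Hz to mel (assumed variant: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_filters(num_filters=40, low_hz=20.0, high_hz=8000.0):
    """Return (left, centre, right) edges in mel for each filter.

    Edge points are equally spaced in mel. Filter i spans points
    i, i + 1, i + 2, so each filter overlaps half of each neighbour.
    """
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    points = [low_mel + i * step for i in range(num_filters + 2)]
    return [(points[i], points[i + 1], points[i + 2]) for i in range(num_filters)]


def filter_weight(edges_mel, hz):
    """Weight a filter assigns to frequency `hz` (triangle is linear in mel)."""
    left, centre, right = edges_mel
    mel = hz_to_mel(hz)
    if mel <= left or mel >= right:
        return 0.0
    if mel <= centre:
        return (mel - left) / (centre - left)
    return (right - mel) / (right - centre)
```

In this layout, the weights of two neighbouring filters sum to one at any frequency between their centres, which is what makes the bank "overlapping".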
If you plan on using a PyTorch DataLoader or Kaldi tables in your ASR pipeline, you can compute all of a corpus's features by using the commands signals-to-torch-feat-dir (requires the pytorch package) or compute-feats-from-kaldi-tables (requires the pydrobert-kaldi package).
This package can compute much more than f-banks, with many different permutations. Consult the documentation for a more in-depth discussion of how to use it.
Documentation
Installation
pydrobert-speech is available via both PyPI and Conda.
conda install -c sdrobert pydrobert-speech
pip install pydrobert-speech
pip install git+https://github.com/sdrobert/pydrobert-speech # bleeding edge
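After installing, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical, and the distribution name is assumed to match the name used in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name="pydrobert-speech"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```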
Licensing and How to Cite
Please see the pydrobert page for more details on how to cite this package.
util.read_signal can read NIST SPHERE files. To do so, code was adapted from the NIST sph2pipe program and put into pydrobert.speech._sphere. License information can be found in LICENSE_sph2pipe. Please note that the license only permits the use of their code to decode the "shorten" file type, not encode it.
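For readers unfamiliar with the SPHERE format, the sketch below reads only the plain-text header of such a file. It is not the code in pydrobert.speech._sphere and it does not decode "shorten"-compressed audio. It assumes the usual header layout: a `NIST_1A` line, a line giving the header size in bytes, `name -type value` fields, and an `end_head` terminator. The function name `parse_sphere_header` is hypothetical.

```python
import io


def parse_sphere_header(stream):
    """Parse the text header of a NIST SPHERE file from a binary stream.

    Returns a dict of header fields. Assumes the stream is positioned
    at the start of the file.
    """
    magic_line = stream.readline()
    if magic_line.strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE file")
    size_line = stream.readline()
    header_size = int(size_line.strip())
    # Read whatever is left of the fixed-size header block.
    consumed = len(magic_line) + len(size_line)
    rest = stream.read(header_size - consumed)

    fields = {}
    for line in rest.decode("ascii", errors="replace").splitlines():
        line = line.strip()
        if line == "end_head":
            break
        if not line:
            continue
        name, type_flag, value = line.split(None, 2)
        if type_flag == "-i":    # integer field
            fields[name] = int(value)
        elif type_flag == "-r":  # real-valued field
            fields[name] = float(value)
        else:                    # "-sN": string of N characters
            fields[name] = value
    return fields
```

After the call, the stream is positioned at the first byte following the header, where the audio samples begin.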
Hashes for pydrobert_speech-0.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 264acc35b7e9d77fcb897f0f751ff8a02616ca88bdc72e4a8e400e007b695824
MD5 | 49ced82bb853b3afa74600ae182da6bb
BLAKE2-256 | de229ae5364322fdea1c4ddbc2dbbe137a0e11fbffaa7b8b97630f1863eb3519