Skip to main content

"Utilities for the python package 'param'"

Project description

Documentation Status Build status License

pydrobert-speech

This pure-python library allows for flexible computation of speech features.

For example, given feature configuration called fbanks.json:

{
  "name": "stft",
  "bank": "fbank",
  "frame_length_ms": 25,
  "include_energy": true,
  "pad_to_nearest_power_of_two": true,
  "window_function": "hanning",
  "use_power": true
}

You can compute triangular, overlapping filters like Kaldi or HTK with the commands

import json
from pydrobert.speech import *
# get the feature computer ready
params = json.load(open('fbank.json'))
computer = util.alias_factory_subclass_from_arg(compute.FrameComputer, params)
# assume "signal" is a numpy float array
feats = computer.compute_full(signal)

If you plan on using a PyTorch DataLoader or Kaldi tables in your ASR pipeline, you can compute all a corpus' features by using the commands signals-to-torch-feat-dir (requires pytorch package) or compute-feats-from-kaldi-tables (requires pydrobert-kaldi package).

This package can compute much more than f-banks, with many different permutations. Consult the documentation for a more in-depth discussion of how to use it.

Documentation

Installation

pydrobert-speech is available via both PyPI and Conda.

conda install -c sdrobert pydrobert-speech
pip install pydrobert-speech
pip install git+https://github.com/sdrobert/pydrobert-speech # bleeding edge

Licensing and How to Cite

Please see the pydrobert page for more details on how to cite this package.

util.read_signal can read NIST SPHERE files. To do so, code was adapted from NIST sph2pipe program and put into pydrobert.speech._sphere. License information can be found in LICENSE_sph2pipe. Please note that the license only permits the use of their code to decode the "shorten" file type, not encode it.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pydrobert-speech-0.2.0.tar.gz (636.9 kB view hashes)

Uploaded Source

Built Distribution

pydrobert_speech-0.2.0-py3-none-any.whl (68.1 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page