"Utilities for the python package 'param'"

Project description

pydrobert-speech

This pure-python library allows for flexible computation of speech features.

For example, given a feature configuration file called fbank.json:

{
  "name": "stft",
  "bank": "fbank",
  "frame_length_ms": 25,
  "include_energy": true,
  "pad_to_nearest_power_of_two": true,
  "window_function": "hanning",
  "use_power": true
}

You can compute triangular, overlapping filters like those of Kaldi or HTK with the following code:

import json
from pydrobert.speech import compute, util

# load the feature configuration
with open('fbank.json') as config_file:
    params = json.load(config_file)
# get the feature computer ready
computer = util.alias_factory_subclass_from_arg(compute.FrameComputer, params)
# assume "signal" is a numpy float array
feats = computer.compute_full(signal)
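
To get a feel for what the framing options in the configuration imply, here is a small standard-library sketch of the arithmetic. It is not the library's own code. It assumes a 16 kHz sampling rate (the configuration above does not state one) and assumes that "pad_to_nearest_power_of_two" rounds the DFT size up to the next power of two. The helper name frame_sizes is hypothetical.

```python
def frame_sizes(frame_length_ms, sampling_rate_hz=16000, pad_to_power_of_two=True):
    """Return (samples per frame, DFT size) for a framing configuration.

    Assumptions (not taken from pydrobert-speech itself): the frame length is
    truncated to whole samples, and padding rounds up to a power of two.
    """
    # convert milliseconds to a whole number of samples
    frame_length = frame_length_ms * sampling_rate_hz // 1000
    if not pad_to_power_of_two:
        return frame_length, frame_length
    # find the smallest power of two that is >= the frame length
    dft_size = 1
    while dft_size < frame_length:
        dft_size *= 2
    return frame_length, dft_size


frame_length, dft_size = frame_sizes(25)
print(frame_length, dft_size)  # prints "400 512"
```

Under these assumptions, a 25 ms frame covers 400 samples and would be zero-padded to 512 points before the transform. Check the documentation for the sampling rate and padding behaviour the package actually uses.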

If you plan on using a PyTorch DataLoader or Kaldi tables in your ASR pipeline, you can compute all of a corpus's features with the command signals-to-torch-feat-dir (requires the pytorch package) or compute-feats-from-kaldi-tables (requires the pydrobert-kaldi package).

This package can compute much more than f-banks, and its components can be combined in many different ways. Consult the documentation for a more in-depth discussion of how to use it.

Documentation

Installation

pydrobert-speech is available via both PyPI and Conda.

conda install -c sdrobert pydrobert-speech
pip install pydrobert-speech
pip install git+https://github.com/sdrobert/pydrobert-speech # bleeding edge
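
After installing, one way to confirm which version ended up in the active environment is to query the distribution metadata. This is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8). The helper name installed_version is hypothetical and not part of pydrobert-speech.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # the distribution is not installed in this environment
        return None


version = installed_version("pydrobert-speech")
print(version if version is not None else "pydrobert-speech is not installed")
```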

Licensing and How to Cite

Please see the pydrobert page for more details on how to cite this package.

util.read_signal can read NIST SPHERE files. To support this, code was adapted from the NIST sph2pipe program and placed in pydrobert.speech._sphere. License information can be found in LICENSE_sph2pipe. Please note that the license only permits using that code to decode the "shorten" file type, not to encode it.
