"Utilities for the python package 'param'"
pydrobert-speech
This pure-python library allows for flexible computation of speech features.
For example, given a feature configuration file called fbank.json:
{
"name": "stft",
"bank": "fbank",
"frame_length_ms": 25,
"include_energy": true,
"pad_to_nearest_power_of_two": true,
"window_function": "hanning",
"use_power": true
}
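To make two of these settings concrete, here is a minimal sketch of what frame_length_ms and pad_to_nearest_power_of_two imply in samples. It uses only the standard library and does not call pydrobert-speech. The 16 kHz sampling rate is an assumption (it is not part of the configuration shown), as is the reading that padding rounds the frame length up to the next power of two; check the documentation for the library's exact rule.

```python
import json

# The configuration from above, inlined so the sketch is self-contained
config = json.loads("""
{
  "name": "stft",
  "bank": "fbank",
  "frame_length_ms": 25,
  "include_energy": true,
  "pad_to_nearest_power_of_two": true,
  "window_function": "hanning",
  "use_power": true
}
""")

SAMPLING_RATE_HZ = 16000  # assumption: not part of the configuration shown


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


# Number of samples covered by one 25 ms frame at the assumed rate
frame_length = SAMPLING_RATE_HZ * config["frame_length_ms"] // 1000

# Assumed interpretation of the padding flag: round up to a power of two
padded_length = (
    next_power_of_two(frame_length)
    if config["pad_to_nearest_power_of_two"]
    else frame_length
)
print(frame_length, padded_length)  # 400 512
```

Power-of-two lengths are a common choice because they suit fast Fourier transform implementations.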
You can compute triangular, overlapping filters like Kaldi or HTK with the following code:
import json
from pydrobert.speech import *
# get the feature computer ready
with open('fbank.json') as f:
    params = json.load(f)
computer = util.alias_factory_subclass_from_arg(compute.FrameComputer, params)
# assume "signal" is a numpy float array
feats = computer.compute_full(signal)
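For intuition about what "triangular, overlapping filters" means, the following is a rough standard-library sketch of a mel-spaced triangular filter bank. It is not pydrobert-speech's implementation: the mel formula, the choice to shape the triangles on the mel axis, the filter count, and the frequency range are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math


def hz_to_mel(hz):
    """Convert Hertz to mels (a commonly used formula, assumed here)."""
    return 1127.0 * math.log(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


def triangular_filters(num_filters, low_hz, high_hz):
    """Return (left, center, right) edges in mels for each filter.

    Edges are equally spaced on the mel scale. Each filter's center is the
    right edge of the previous filter, so neighbouring filters overlap.
    """
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    points = [low_mel + i * step for i in range(num_filters + 2)]
    return [tuple(points[i:i + 3]) for i in range(num_filters)]


def filter_weight(edges, hz):
    """Weight a triangular filter gives to frequency hz (0 outside it)."""
    left, center, right = edges
    mel = hz_to_mel(hz)
    if mel <= left or mel >= right:
        return 0.0
    if mel <= center:
        return (mel - left) / (center - left)  # rising slope
    return (right - mel) / (right - center)  # falling slope


# Hypothetical settings: 10 filters between 20 Hz and 8000 Hz
filters = triangular_filters(10, 20.0, 8000.0)

# Halfway (in mels) between two neighbouring centers, both filters respond
midpoint_hz = mel_to_hz((filters[0][1] + filters[0][2]) / 2)
print(filter_weight(filters[0], midpoint_hz))
print(filter_weight(filters[1], midpoint_hz))
```

At the midpoint between two centers each of the two filters contributes a weight of about one half, which is the overlap the description refers to.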
If you plan on using a PyTorch DataLoader or Kaldi tables in your ASR pipeline, you can compute all of a corpus' features with the commands signals-to-torch-feat-dir (requires the pytorch package) or compute-feats-from-kaldi-tables (requires the pydrobert-kaldi package).
This package can compute much more than f-banks, with many different permutations. Consult the documentation for a more in-depth discussion of how to use it.
Documentation
Installation
pydrobert-speech is available via both PyPI and Conda.
conda install -c sdrobert pydrobert-speech
pip install pydrobert-speech
pip install git+https://github.com/sdrobert/pydrobert-speech # bleeding edge
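After installing, one way to confirm which version is present is to query the distribution metadata. This is a small sketch using only Python's standard importlib.metadata module (Python 3.8 or later); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name="pydrobert-speech"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version else "pydrobert-speech is not installed")
```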
Licensing and How to Cite
Please see the pydrobert page for more details on how to cite this package.
util.read_signal can read NIST SPHERE files. To do so, code was adapted from the NIST sph2pipe program and put into pydrobert.speech._sphere. License information can be found in LICENSE_sph2pipe. Please note that the license only permits the use of their code to decode the "shorten" file type, not encode it.