Speech processing with Python
pydrobert-speech
This pure-python library allows for flexible computation of speech features.
For example, given a feature configuration file called fbank.json:
{
"name": "stft",
"bank": "fbank",
"frame_length_ms": 25,
"include_energy": true,
"pad_to_nearest_power_of_two": true,
"window_function": "hanning",
"use_power": true
}
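As a rough illustration of what frame_length_ms and pad_to_nearest_power_of_two imply, the sketch below works through the arithmetic for a single frame. It is not part of pydrobert-speech: the helper names are hypothetical, the 16 kHz sampling rate is an assumption (the configuration above does not state one), and padding is assumed to round up to the next power of two.

```python
# Standalone arithmetic sketch; none of these names come from pydrobert-speech.

def frame_length_samples(frame_length_ms, sampling_rate_hz):
    """Number of samples covered by one analysis frame."""
    return int(round(frame_length_ms * sampling_rate_hz / 1000))


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


# Assumed sampling rate of 16 kHz; 25 ms is taken from the configuration above.
frame_len = frame_length_samples(25, 16000)
fft_len = next_power_of_two(frame_len)
print(frame_len, fft_len)  # prints "400 512"
```

Under these assumptions, each 25 ms frame holds 400 samples and would be zero-padded to 512 points before the transform.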
You can compute triangular, overlapping filters like those of Kaldi or HTK with the following commands:
import json
from pydrobert.speech import compute, util
# get the feature computer ready
with open('fbank.json') as f:
    params = json.load(f)
computer = util.alias_factory_subclass_from_arg(compute.FrameComputer, params)
# assume "signal" is a numpy float array
feats = computer.compute_full(signal)
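To make "triangular, overlapping filters" concrete, here is a minimal standalone sketch of a mel-spaced triangular filter bank. It is not the library's implementation: the function names, the choice of 40 filters, and the 20 Hz to 8000 Hz range are all assumptions made for illustration. The mel formula used, 1127 ln(1 + f/700), is the one commonly associated with HTK and Kaldi.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 1127.0 * math.log(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


def mel_edges(num_filters, low_hz, high_hz):
    """Return num_filters + 2 edge points, evenly spaced on the mel scale."""
    low, high = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high - low) / (num_filters + 1)
    return [low + i * step for i in range(num_filters + 2)]


def triangle_weight(edges, index, hz):
    """Weight of filter `index` at frequency `hz`, computed in the mel domain."""
    left, center, right = edges[index:index + 3]
    mel = hz_to_mel(hz)
    if mel <= left or mel >= right:
        return 0.0
    if mel <= center:
        return (mel - left) / (center - left)  # rising slope
    return (right - mel) / (right - center)  # falling slope


# Hypothetical settings: 40 filters between 20 Hz and 8000 Hz.
edges = mel_edges(40, 20.0, 8000.0)

# Probe halfway (in mel) between the centers of filters 0 and 1.
probe_hz = mel_to_hz((edges[1] + edges[2]) / 2)
w0 = triangle_weight(edges, 0, probe_hz)
w1 = triangle_weight(edges, 1, probe_hz)
print(w0, w1)
```

Each filter shares its left and right edges with the centers of its neighbours, so the falling slope of one filter overlaps the rising slope of the next. At the probe frequency both weights are approximately 0.5.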
If you plan on using a PyTorch DataLoader or Kaldi tables in your ASR pipeline, you can compute all of a corpus' features with the commands signals-to-torch-feat-dir (requires the pytorch package) or compute-feats-from-kaldi-tables (requires the pydrobert-kaldi package).
This package can compute much more than f-banks, with many different permutations. Consult the documentation for a more in-depth discussion of how to use it.
Documentation
Installation
pydrobert-speech is available via both PyPI and Conda.
conda install -c sdrobert pydrobert-speech
pip install pydrobert-speech
pip install git+https://github.com/sdrobert/pydrobert-speech # bleeding edge
Hashes for pydrobert_speech-0.0.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 09e54ef6369f601ca294f03391951bea32ab23e218b1c43ee07fea4151a99059
MD5 | 8c66cc71c0a8675902f967357e5871b7
BLAKE2b-256 | bcf9965a44d236972f617a9f0e65b41e38f1083edd0e2445e5b5bd5aca9fa79f