Python module for speech signal processing
diffsptk
diffsptk is a differentiable version of SPTK based on the PyTorch framework.
Requirements
- Python 3.8+
- PyTorch 1.9.0+
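One way to confirm that an environment meets these minimums is to compare version tuples rather than raw strings. The following is a minimal sketch using only the standard library; the helper names `parse_version`, `meets_minimum`, and `installed_version` are hypothetical and are not part of diffsptk.

```python
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    # Drop local build tags such as "+cu117" before splitting on dots.
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that "1.9" compares equal to "1.9.0".
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version_text, minimum):
    """Check whether a version string is at least the given minimum tuple."""
    return parse_version(version_text) >= minimum


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    from importlib import metadata  # Part of the standard library from Python 3.8.

    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Python 3.8+ is required.
python_ok = sys.version_info[:2] >= (3, 8)

# PyTorch 1.9.0+ is required; a missing installation counts as a failure.
torch_version = installed_version("torch") if python_ok else None
torch_ok = torch_version is not None and meets_minimum(torch_version, (1, 9, 0))

print("Python OK:", python_ok, "| PyTorch OK:", torch_ok)
```

Comparing tuples avoids the string-comparison pitfall in which "1.10.0" sorts before "1.9.0".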
Documentation
See this page for a reference manual.
Installation
The latest stable release can be installed through PyPI by running
pip install diffsptk
Alternatively, clone the repository and install it in editable mode:
git clone https://github.com/sp-nitech/diffsptk.git
pip install -e diffsptk
Examples
Mel-cepstral analysis
import diffsptk
import torch
x = torch.randn(100)
# Compute STFT of x.
stft = diffsptk.STFT(frame_length=12, frame_period=10, fft_length=16)
X = stft(x)
# Estimate the 4th-order mel-cepstrum of x.
mcep = diffsptk.MelCepstralAnalysis(cep_order=4, fft_length=16, alpha=0.1, n_iter=1)
mc = mcep(X)
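The `alpha` argument in the example above is the frequency-warping factor of mel-cepstral analysis. As an illustration, the sketch below evaluates the phase response of the first-order all-pass filter conventionally used for this warping. The function name `warp_frequency` is hypothetical, and the code is standalone: it does not call diffsptk and is not taken from its implementation.

```python
import math


def warp_frequency(omega, alpha):
    """Map a normalized angular frequency in [0, pi] to its warped value.

    Uses the phase response of a first-order all-pass filter:
    omega + 2 * atan(alpha * sin(omega) / (1 - alpha * cos(omega))).
    """
    return omega + 2.0 * math.atan2(
        alpha * math.sin(omega), 1.0 - alpha * math.cos(omega)
    )


# Compare the linear and warped frequency axes at a few points for alpha = 0.1.
for k in range(5):
    omega = math.pi * k / 4
    print(f"{omega:.3f} -> {warp_frequency(omega, 0.1):.3f}")
```

With `alpha = 0` the mapping is the identity; positive values stretch the low-frequency region, while the endpoints 0 and pi stay fixed.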
Mel-spectrogram extraction
import diffsptk
import torch
x = torch.randn(100)
# Compute STFT of x.
stft = diffsptk.STFT(frame_length=12, frame_period=10, fft_length=32)
X = stft(x)
# Apply a 4-channel mel filter bank to the STFT.
fbank = diffsptk.MelFilterBankAnalysis(n_channel=4, fft_length=32, sample_rate=8000)
Y = fbank(X)
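To give a sense of where 4 channels might sit for `sample_rate=8000`, the following sketch spaces band centers uniformly on a mel scale between 0 Hz and the Nyquist frequency. It assumes the common formula mel = 1127 ln(1 + f / 700); the helper names are hypothetical, and the exact scale and band edges used by diffsptk may differ.

```python
import math


def hz_to_mel(f):
    """Convert a frequency in Hz to mel (assumed formula, see lead-in)."""
    return 1127.0 * math.log1p(f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * math.expm1(m / 1127.0)


def mel_band_centers(n_channel, sample_rate, f_min=0.0):
    """Return center frequencies in Hz, equally spaced on the mel axis."""
    mel_min = hz_to_mel(f_min)
    mel_max = hz_to_mel(sample_rate / 2.0)  # Nyquist frequency.
    step = (mel_max - mel_min) / (n_channel + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_channel)]


centers = mel_band_centers(n_channel=4, sample_rate=8000)
print([round(c) for c in centers])
```

Because the spacing is uniform in mel rather than in Hz, neighboring centers move further apart as frequency increases.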
Subband decomposition
import diffsptk
import torch
K = 4 # Number of subbands.
M = 40 # Order of filter.
x = torch.randn(100)
# Decompose x.
pqmf = diffsptk.PQMF(K, M)
decimate = diffsptk.Decimation(K)
y = decimate(pqmf(x), dim=-1)
# Reconstruct x.
interpolate = diffsptk.Interpolation(K)
ipqmf = diffsptk.IPQMF(K, M)
x_hat = ipqmf(interpolate(K * y, dim=-1))
# Compute error between two signals.
error = torch.abs(x_hat - x).sum()
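In multirate processing, upsampling by inserting zeros scales the baseband component by 1/K, which is the usual reason for a gain such as the `K * y` in the example above. The sketch below mimics the two rate-changing steps on plain Python lists, assuming that decimation keeps every K-th sample and interpolation inserts K - 1 zeros after each sample. The function names are hypothetical and the code does not call diffsptk.

```python
def decimate(x, period, start=0):
    """Keep every `period`-th sample, beginning at index `start`."""
    return x[start::period]


def interpolate(x, period):
    """Insert `period - 1` zeros after each sample."""
    y = [0.0] * (len(x) * period)
    y[::period] = x
    return y


K = 4
x = [float(n) for n in range(8)]

down = decimate(x, K)                       # 2 of the 8 samples remain.
up = interpolate([K * v for v in down], K)  # Back to 8 samples, mostly zeros.

# The gain of K keeps the mean (DC level) of the upsampled signal
# equal to that of the decimated signal.
print(sum(down) / len(down), sum(up) / len(up))
```

Without the factor of K, the mean of the zero-stuffed signal would be K times smaller than that of the decimated one.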
Status
The following modules are implemented or planned:
- acorr
- b2mc
- c2acr
- c2mpir
- c2ndps
- cdist
- dct
- decimate
- delta
- dequantize
- df2
- dfs
- dtw
- excite
- fbank
- fftcep
- frame
- freqt
- gnorm
- grpdelay
- idct
- ignorm
- imglsadf
- impulse
- imsvq
- interpolate
- ipqmf
- iulaw
- levdur
- linear_intpl
- lpc
- lpc2par
- lpccheck
- ltcdf
- mc2b
- mcep
- mcpf
- mfcc
- mgc2mgc
- mgc2sp
- mgcep
- mglsadf
- mlpg (supports only unit variance)
- mlsacheck
- mpir2c
- mseq
- msvq
- ndps2c
- norm0
- par2lpc
- pca
- pcas
- phase
- pitch
- poledf
- pqmf
- quantize
- ramp
- rlevdur
- rmse
- sin
- smcep
- snr
- spec
- step
- train
- ulaw
- window
- zcross
- zerodf
The following modules will not be implemented in this repository (a PyTorch equivalent is given in parentheses where one exists):
- acr2csm
- aeq (torch.allclose)
- amgcep
- average (torch.mean)
- bcp
- bcut
- clip (torch.clip)
- csm2acr
- delay
- dmp
- dtw_merge
- entropy (torch.special.entr)
- extract
- fd
- fdrw
- fft (torch.fft.fft)
- fft2 (torch.fft.fft2)
- fftr (torch.fft.rfft)
- fftr2 (torch.fft.rfft2)
- glogsp
- gmm
- gmmp
- gpolezero
- grlogsp
- gseries
- gspecgram
- gwave
- histogram (torch.histogram)
- huffman
- huffman_decode
- huffman_encode
- ifft (torch.fft.ifft)
- ifft2 (torch.fft.ifft2)
- lar2par
- lbg
- lpc2c
- lpc2lsp
- lsp2lpc
- lspcheck
- lspdf
- median (torch.median)
- merge
- mglsp2sp
- minmax
- nan (torch.isnan)
- nrand (torch.randn)
- par2lar
- pitch_mark
- reverse
- root_pol
- sopr
- swab
- symmetrize
- transpose (torch.transpose)
- vc
- vopr
- vstat (torch.var_mean)
- vsum (torch.sum)
- x2x
License
This software is released under the Apache License 2.0.