A simple neural forced aligner for phoneme-to-audio alignment
snfa
snfa (Simple Neural Forced Aligner) is a phoneme-to-audio forced aligner built for embedded use in Python programs. Its only inference dependencies are numpy and Python 3.7 or later.
- Tiny model size (2 MB)
- Numpy as the only dependency
- Alignment quality comparable to MFA (Montreal Forced Aligner)
Note: You still need PyTorch and some other libraries if you want to do training.
Inference
pip install snfa
Download the pretrained cv_jp.bin file from the release page.
cv_jp.bin is a weight file trained on the Japanese Common Voice Corpus 14.0 (6/28/2023). The model weights are released into the public domain.
import librosa  # used here only to load audio; not required by snfa itself
import snfa

aligner = snfa.Aligner("cv_jp.bin")
# the transcript is a list of phoneme symbols
transcript = "k o N n i ch i w a".split(" ")
# you can also use `scipy` or `wavfile` as long as you normalize it to [-1,1]
x, _ = librosa.load("sample.wav", sr=aligner.sr)
segment, path, trellis, labels = aligner(x, transcript)
print(segment)
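If you would rather not install librosa just to read audio, the normalization mentioned in the comment above can be done with the standard library and numpy. The following is a minimal sketch, not part of snfa: `load_wav_normalized` is a hypothetical helper name, it assumes a 16-bit PCM WAV file, and it does not resample, so the file is assumed to already be at the aligner's sample rate.

```python
import wave

import numpy as np


def load_wav_normalized(path):
    """Read a 16-bit PCM WAV file and return (mono samples in [-1, 1], sample rate)."""
    with wave.open(path, "rb") as f:
        if f.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        sr = f.getframerate()
        n_channels = f.getnchannels()
        raw = f.readframes(f.getnframes())
    # int16 covers [-32768, 32767]; dividing by 32768 maps it into [-1, 1)
    x = np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0
    if n_channels > 1:
        # samples are interleaved per frame; average the channels down to mono
        x = x.reshape(-1, n_channels).mean(axis=1)
    return x, sr
```

The returned array can then be passed to the aligner in place of the librosa output, after checking that the returned sample rate equals aligner.sr.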
Training
I'll cover this part if anyone needs it. Please let me know by creating an issue if you do.
Todos
- Rust crate
- multi-language
- Storing the pau index in the binary model
- Record and warn the user when the score is too low
Licence
snfa is released under the ISC Licence, as shown here.
The file snfa/stft.py contains code adapted from librosa, which is released under the ISC Licence with a different copyright claim. A copy of librosa's licence can be found in librosa's repo.
The file snfa/viterbi.py contains code adapted from torchaudio, which is released under the BSD 2-Clause "Simplified" License. A copy of torchaudio's licence can be found in torchaudio's repo.
Credit
The neural network used in snfa is essentially a PyTorch implementation of the CTC* structure described in Evaluating Speech—Phoneme Alignment and Its Impact on Neural Text-To-Speech Synthesis.
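For readers unfamiliar with how a frame-wise phoneme classifier turns into an alignment, here is a simplified illustration of the general idea: a Viterbi search over a trellis of frames by transcript positions. This is not snfa's code and not the CTC* method from the paper; it is a plain monotonic alignment without a blank symbol, and `forced_align` and its argument names are hypothetical.

```python
import numpy as np


def forced_align(log_probs, tokens):
    """Assign every frame to one transcript position, in order, maximizing total log-probability.

    log_probs: (T, C) array of per-frame log-probabilities over C phoneme classes.
    tokens:    list of N class indices (the transcript), with 1 <= N <= T.
    Returns a list of (transcript_position, start_frame, end_frame_exclusive).
    """
    T, N = log_probs.shape[0], len(tokens)
    if N == 0 or N > T:
        raise ValueError("need 1 <= len(tokens) <= number of frames")
    emit = log_probs[:, tokens]  # (T, N): score of each transcript position at each frame
    trellis = np.full((T, N), -np.inf)
    advanced = np.zeros((T, N), dtype=bool)  # True if the best path moved to a new token here
    trellis[0, 0] = emit[0, 0]  # the path must start on the first token
    for t in range(1, T):
        stay = trellis[t - 1]  # remain on the same token
        move = np.concatenate(([-np.inf], trellis[t - 1, :-1]))  # advance from the previous token
        advanced[t] = move > stay
        trellis[t] = emit[t] + np.maximum(stay, move)
    # Backtrack from the last token at the last frame
    assignment = np.empty(T, dtype=int)
    j = N - 1
    for t in range(T - 1, -1, -1):
        assignment[t] = j
        if advanced[t, j]:
            j -= 1
    # Collapse the per-frame assignment into contiguous segments
    segments, start = [], 0
    for t in range(1, T + 1):
        if t == T or assignment[t] != assignment[start]:
            segments.append((int(assignment[start]), start, t))
            start = t
    return segments
```

Frame indices can be converted to seconds by multiplying by the hop length and dividing by the sample rate of whatever front end produced the frames.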