Skip to main content

a simple neural forced aligner for phoneme to audio alignment

Project description

snfa

snfa (Simple Neural Forced Aligner) is a phoneme-to-audio forced aligner built for embedded usage in python programs, with its only inference dependency being numpy and python 3.7 or later.

  • Tiny model size (2 MB)
  • Numpy as the only dependency
  • MFA comparable alignment quality

Note: You still need PyTorch and some other libs if you want to do training.

Inference

pip install snfa

Download the pretrained cv_jp.bin file from release.

cv_jp.bin is weight file trained with Japanese Common Voice Corpus 14.0, 6/28/2023, the model weight is released into Public Domain.

import snfa

aligner = snfa.Aligner("cv_jp.bin")
transcript = "k o N n i ch i w a".split(" ")

# you can also use `scipy` or `wavfile` as long as you normalize it to [-1,1]
x, _ = librosa.load("sample.wav", sr=aligner.sr)

segment, path, trellis, labels = aligner(x, transcript)

print(segment)

Training

I'll cover this part if it's needed by anyone. Please let me know by creating an issue if you need.

Todos

  • Rust crate
  • multi-language
  • Storing pau index in binary model
  • Record and warn the user when score is too low

Licence

snfa is released under ISC Licence, as shown here.

The file snfa/stft.py contains code adapted from librosa which obeys ISC Licence with different copyright claim. A copy of librosa's licence can be found in librosa's repo.

The file snfa/viterbi.py contains code adapted from torchaudio which obeys BSD 2-Clause "Simplified" License. A copy of torchaudio's licence can be found in torchaudio's repo.

Credit

The neural network used in snfa is basically a PyTorch implementation of CTC* structure described in Evaluating Speech—Phoneme Alignment and Its Impact on Neural Text-To-Speech Synthesis.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

snfa-0.0.1.tar.gz (9.3 kB view hashes)

Uploaded Source

Built Distribution

snfa-0.0.1-py3-none-any.whl (9.6 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page