A simple neural forced aligner for phoneme-to-audio alignment

Project description

snfa

snfa (Simple Neural Forced Aligner) is a phoneme-to-audio forced aligner built for embedding in Python programs. Inference requires only NumPy and Python 3.7 or later.

  • Tiny model size (2 MB)
  • NumPy as the only dependency
  • Alignment quality comparable to MFA (Montreal Forced Aligner)

Note: Training still requires PyTorch and a few other libraries.

Inference

pip install snfa

Download the pretrained cv_jp.bin weight file from the release page.

cv_jp.bin is a weight file trained on the Japanese Common Voice Corpus 14.0 (6/28/2023). The model weights are released into the public domain.

import librosa  # used here only to load audio; not required by snfa itself
import snfa

aligner = snfa.Aligner("cv_jp.bin")
transcript = "k o N n i ch i w a".split(" ")

# You can also use `scipy` or `wavfile`, as long as you normalize the samples to [-1, 1].
x, _ = librosa.load("sample.wav", sr=aligner.sr)

segment, path, trellis, labels = aligner(x, transcript)

print(segment)
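
The comment in the example notes that any audio loader works as long as the samples are normalized to [-1, 1]. As a minimal sketch of that idea, the helper below reads a 16-bit PCM WAV file with Python's standard `wave` module and NumPy, without librosa. The name `load_wav_normalized` is hypothetical and not part of snfa. The sketch does not resample, so it assumes the file is already recorded at the aligner's sample rate.

```python
import wave

import numpy as np


def load_wav_normalized(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    `samples` is a mono float32 array scaled to [-1, 1].
    Hypothetical helper for illustration; not part of snfa.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())

    # Interleaved little-endian int16 samples -> (num_frames, channels)
    pcm = np.frombuffer(raw, dtype="<i2").reshape(-1, channels)
    # Average the channels to get mono, then scale int16 range to [-1, 1]
    mono = pcm.astype(np.float32).mean(axis=1)
    return mono / 32768.0, sample_rate
```

The returned array could then be passed to the aligner in place of `x`, after checking that the returned sample rate equals `aligner.sr`.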

Training

I'll document this part if anyone needs it. Please open an issue to let me know.

Todos

  • Rust crate
  • Multi-language support
  • Storing pau index in binary model
  • Record and warn the user when score is too low

Licence

snfa is released under the ISC Licence, as shown here.

The file snfa/stft.py contains code adapted from librosa, which is also released under the ISC Licence but with a different copyright claim. A copy of librosa's licence can be found in librosa's repo.

The file snfa/viterbi.py contains code adapted from torchaudio, which is released under the BSD 2-Clause "Simplified" License. A copy of torchaudio's licence can be found in torchaudio's repo.

Credit

The neural network used in snfa is essentially a PyTorch implementation of the CTC* structure described in Evaluating Speech–Phoneme Alignment and Its Impact on Neural Text-To-Speech Synthesis.
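
Once a network of this kind has produced frame-wise phoneme log-probabilities, forced alignment becomes a dynamic-programming search for the best monotonic path through them. The NumPy sketch below illustrates that idea with a simplified Viterbi alignment in which every frame belongs to exactly one transcript token. It deliberately ignores the CTC blank symbol and is not the code in snfa/viterbi.py. The function name `viterbi_align` and its return format are assumptions made for illustration.

```python
import numpy as np


def viterbi_align(log_probs, tokens):
    """Simplified monotonic Viterbi alignment (illustrative, no CTC blank).

    log_probs: (num_frames, num_classes) array of frame-wise log-probabilities.
    tokens:    class indices of the transcript, in order.
    Returns a list of (token_position, start_frame, end_frame), end exclusive.
    """
    num_frames, num_tokens = log_probs.shape[0], len(tokens)
    if num_tokens == 0 or num_frames < num_tokens:
        raise ValueError("need at least one frame per token")

    score = np.full((num_frames, num_tokens), -np.inf)
    advanced = np.zeros((num_frames, num_tokens), dtype=bool)
    score[0, 0] = log_probs[0, tokens[0]]

    for t in range(1, num_frames):
        for j in range(min(t + 1, num_tokens)):
            stay = score[t - 1, j]  # remain on the same token
            move = score[t - 1, j - 1] if j > 0 else -np.inf  # step to the next token
            advanced[t, j] = move > stay
            score[t, j] = max(stay, move) + log_probs[t, tokens[j]]

    # Backtrack from the last token at the last frame.
    frame_to_token = np.empty(num_frames, dtype=int)
    j = num_tokens - 1
    for t in range(num_frames - 1, -1, -1):
        frame_to_token[t] = j
        if advanced[t, j]:
            j -= 1

    # Collapse runs of identical token positions into segments.
    segments, start = [], 0
    for t in range(1, num_frames + 1):
        if t == num_frames or frame_to_token[t] != frame_to_token[start]:
            segments.append((int(frame_to_token[start]), start, t))
            start = t
    return segments


# Toy input: frames 0-1 favour class 1, frames 2-4 favour class 2.
probs = np.array([[0.1, 0.8, 0.1]] * 2 + [[0.1, 0.1, 0.8]] * 3)
print(viterbi_align(np.log(probs), [1, 2]))  # → [(0, 0, 2), (1, 2, 5)]
```

Frame indices can be converted to seconds by multiplying by the hop duration of the feature extractor, which depends on the model and is not specified here.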

Download files

Source Distribution

snfa-0.0.1.tar.gz (9.3 kB view details)

Uploaded Source

Built Distribution

snfa-0.0.1-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file snfa-0.0.1.tar.gz.

File metadata

  • Download URL: snfa-0.0.1.tar.gz
  • Size: 9.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for snfa-0.0.1.tar.gz

  • SHA256: 0e150246289014bea8e7c9bf9c473eb1e105dd62996810e7ddf65ee536acc47c
  • MD5: 48eae74523bd05d4892870619b18fdc3
  • BLAKE2b-256: 5c575f3119aef0dc80cca00c5aebb3032df5e0c588b37f65358d427683d70e3f

File details

Details for the file snfa-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: snfa-0.0.1-py3-none-any.whl
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for snfa-0.0.1-py3-none-any.whl

  • SHA256: efaae97af477fb6da070be5397469e21d9519d83ed296f36dce1f612cf7ee114
  • MD5: e04dea998345bef87b080e31316f38de
  • BLAKE2b-256: d59e642053287d4956a98ddd25a890fadd592fff6af3335d6d3127ad751f5e69
