snfa

a simple neural forced aligner for phoneme-to-audio alignment

snfa (Simple Neural Forced Aligner) is a phoneme-to-audio forced aligner built for embedding in Python programs; its only inference dependencies are NumPy and Python 3.7 or later.

  • Tiny model size (2 MB)
  • NumPy as the only inference dependency
  • Alignment quality comparable to MFA (Montreal Forced Aligner)

Note: You still need PyTorch and a few other libraries if you want to do training.

Inference

pip install snfa

A pre-trained model weight, jp.npz, is included.

jp.npz was trained on the Japanese Common Voice Corpus 14.0 (6/28/2023). The model weight is released into the public domain.

import snfa
import librosa  # or soundfile, torchaudio, scipy, etc.


aligner = snfa.Aligner()  # pass a path here to use a custom model
# NOTE: the default model is uncased; it does not distinguish `U` from `u`
transcript = "k o N n i ch i w a".lower().split(" ")  # remember to lowercase it here

# any loader works (e.g. `scipy.io.wavfile`) as long as the audio is
# 1. a mono numpy array with shape (T,) and dtype np.float32
# 2. normalized to [-1, 1]
# 3. sampled at the model's rate, `aligner.sr`
x, sr = librosa.load("sample.wav", sr=aligner.sr)
# trim leading/trailing silence for better performance
x, _ = librosa.effects.trim(x, top_db=20)
# snfa also ships a trimming utility (adapted from librosa),
# so you don't have to install librosa just for this step
x, _ = snfa.trim_audio(x, top_db=20)

segments = aligner(x, transcript)

print(segments)
# (phoneme label, start in milliseconds, end in milliseconds, score)
# [('pau', 0, 908, 0.9583546351318474),
#  ('k', 908, 928, 0.006900709283433312),
#  ('o', 928, 1088, 0.795996002234283),
# ...]

# NOTE: the timestamps are in milliseconds; convert one to a waveform index with
wav_index = int(timestamp * aligner.sr / 1000)
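As a sketch of how the output can be consumed, the millisecond-to-sample conversion above lets you slice the waveform into per-phoneme chunks with plain NumPy. Here `sr` is a stand-in for `aligner.sr`, the waveform is a placeholder, and `to_index` is a hypothetical helper, not part of snfa's API:

```python
import numpy as np

sr = 16000  # stand-in for aligner.sr; the real value comes from the loaded model

# segments in the format shown above: (label, start_ms, end_ms, score)
segments = [("pau", 0, 908, 0.96), ("k", 908, 928, 0.01), ("o", 928, 1088, 0.80)]

x = np.zeros(int(1.1 * sr), dtype=np.float32)  # placeholder mono waveform


def to_index(ms, sr):
    """Convert a millisecond timestamp to a sample index."""
    return int(ms * sr / 1000)


# one waveform slice per phoneme
chunks = {label: x[to_index(start, sr):to_index(end, sr)]
          for label, start, end, _score in segments}

print(len(chunks["k"]))  # → 320 samples for the 20 ms `k` phoneme
```

With a real waveform, each `chunks[label]` can then be saved or analyzed on its own.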

Development

We use uv to manage dependencies. The following command will install them:

uv sync

Training

Download the Common Voice dataset and extract it somewhere.

We use the whole validated.tsv split, with the dev and test splits filtered out.

Filter the dataset:

uv run filter_dataset.py -d /path/to/common/voice/

Start training:

uv run -c config.yaml -d /path/to/common/voice/

Checkpoints will be saved to logs/lightning_logs/

Note that -d should point to the directory containing the *.tsv files. In the Japanese CV dataset, this is the ja subdirectory.

Bundle

When bundling an app with PyInstaller, add the following to bundle the model weights properly:

from PyInstaller.utils.hooks import collect_data_files

datas = collect_data_files('snfa')

# pass `datas` to `Analysis(...)` in your .spec file

Suggestions for a better approach are welcome.
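A minimal .spec sketch consuming the collected files might look like this; everything except the `datas` line is illustrative boilerplate (`app.py` is a placeholder entry script), not snfa's documented setup:

```python
# app.spec -- illustrative PyInstaller spec fragment
from PyInstaller.utils.hooks import collect_data_files

datas = collect_data_files('snfa')  # picks up the bundled jp.npz weight

a = Analysis(          # Analysis is provided by PyInstaller when the spec runs
    ['app.py'],        # placeholder entry script
    datas=datas,       # ship snfa's data files alongside the app
)
pyz = PYZ(a.pure)
exe = EXE(pyz, a.scripts, name='app')
```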

Todos

  • Rust crate
  • multi-language

Licence

snfa is released under the ISC Licence.

The files snfa/stft.py and snfa/util.py contain code adapted from librosa, which is under the ISC Licence with a different copyright notice. A copy of librosa's licence can be found in librosa's repo.

The file snfa/viterbi.py contains code adapted from torchaudio, which is under the BSD 2-Clause "Simplified" License. A copy of torchaudio's licence can be found in torchaudio's repo.

The testing audio file is ripped

Credit

The neural network used in snfa is essentially a PyTorch implementation of the CTC* architecture described in Evaluating Speech—Phoneme Alignment and Its Impact on Neural Text-To-Speech Synthesis.

Download files

Source Distribution:

snfa-0.2.0.tar.gz (1.2 MB)

Built Distribution:

snfa-0.2.0-py3-none-any.whl (1.2 MB)

File details

snfa-0.2.0.tar.gz

  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.19

Hashes for snfa-0.2.0.tar.gz:

  • SHA256: 2be48956b50737381666640bd52750189867f17cb325f3d334f6f95bb6bb9b3b
  • MD5: adb29afc92924e4ea0fbff219d091912
  • BLAKE2b-256: dc14e0c20f1b2e78fa9ffcb357f802e795fc9a43bd4ef291e33461b4785b04cb

snfa-0.2.0-py3-none-any.whl

  • Size: 1.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.19

Hashes for snfa-0.2.0-py3-none-any.whl:

  • SHA256: 16dec139c58ecd31e04edec7cc9d700e1d38fc2d04bc575068f4e8b7bbfa2a9d
  • MD5: a204250ca41cd7a7f39be15c6c06ef81
  • BLAKE2b-256: 2722448ed4ba8b585116f01fac0b1257493f578ea27fcccdf22b337d45b03570
