Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus
KFA
A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus.
- Built-in Speech Enhancement
- Word-level Alignment
pip install kfa
CLI
kfa -a audio.wav -t text.txt -o alignments.jsonl
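The CLI writes its result to a .jsonl file, which by convention holds one JSON object per line. Below is a minimal sketch for loading such a file, assuming the output follows standard JSON Lines. `read_jsonl` is a hypothetical helper, not part of kfa, and since the record fields are not documented here, the code does not assume any.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Load a JSON Lines file into a list, one parsed record per non-empty line."""
    records = []
    # UTF-8 is assumed so that Khmer text in the records decodes correctly.
    with Path(path).open(encoding="utf-8") as infile:
        for line in infile:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

For example, `read_jsonl("alignments.jsonl")` would return the records written by the command above, ready to be inspected or printed one by one.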
Python
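from kfa import align
import librosa

# Read the transcript; UTF-8 is specified explicitly because the text is Khmer.
with open("text.txt", encoding="utf-8") as infile:
    text = infile.read()

# Load the audio as 16 kHz mono.
y, sr = librosa.load("audio.wav", sr=16000, mono=True)

# Print each alignment.
for alignment in align(y, sr, text):
    print(alignment)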
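To keep the results from the Python API rather than printing them, they can be saved one per line, mirroring the .jsonl file the CLI produces. This is a sketch under the assumption that each item yielded by `align` is JSON-serializable, which is not stated here. `write_jsonl` is a hypothetical helper, not part of kfa.

```python
import json


def write_jsonl(items, path):
    """Write each item as one JSON line; return the number of lines written."""
    count = 0
    with open(path, "w", encoding="utf-8") as outfile:
        for item in items:
            # ensure_ascii=False keeps Khmer characters readable instead of \u escapes.
            outfile.write(json.dumps(item, ensure_ascii=False) + "\n")
            count += 1
    return count
```

With the variables from the example above, a call would look like `write_jsonl(align(y, sr, text), "alignments.jsonl")`.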
References
- MMS: Scaling Speech Technology to 1000+ languages
- CTC FORCED ALIGNMENT API TUTORIAL
- Phonetisaurus
- Fine-Tune Wav2Vec2 for English ASR with 🤗 Transformers
- Thai Wav2vec2 model to ONNX model
License
Apache-2.0