Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus
KFA
A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus.
- Built-in Speech Enhancement
- Word-level Alignment
pip install kfa
CLI
Note: The input audio (audio.wav) must have a sample rate of 16 kHz. Use ffmpeg or any other tool to resample the audio before processing.
ffmpeg -i audio_orig.wav -ac 1 -ar 16000 audio.wav
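Before running the aligner it can help to confirm that the file really is 16 kHz. The following is a minimal sketch using only Python's standard `wave` module, which reads uncompressed PCM WAV headers. The helper names `wav_format` and `is_16khz` are hypothetical and not part of kfa.

```python
import wave


def wav_format(path):
    """Return (sample_rate_hz, channels) read from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnchannels()


def is_16khz(path, expected_rate=16000):
    """Return True if the WAV file at `path` uses the expected sample rate."""
    rate, _channels = wav_format(path)
    return rate == expected_rate
```

If `is_16khz("audio.wav")` returns False, resample the file with the ffmpeg command above before aligning.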
kfa -a audio.wav -t text.txt -o alignments.jsonl
# Output in Whisper-style JSON format
kfa -a audio.wav -t text.txt --format whisper -o alignments.json
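The default output can then be loaded back into Python. This is a minimal sketch that assumes `alignments.jsonl` follows the usual JSON Lines convention of one JSON value per line. The fields inside each record are not documented here, so the hypothetical helper `read_jsonl` makes no assumption about them and simply returns whatever each line decodes to.

```python
import json


def read_jsonl(path):
    """Parse a JSON Lines file into a list, one decoded value per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as infile:
        for line in infile:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

For example, `for record in read_jsonl("alignments.jsonl"): print(record)` prints each alignment record in order.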
Python
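from kfa import align, create_session
import librosa

# Read the transcript to align (UTF-8, since the text is Khmer)
with open("text.txt", encoding="utf-8") as infile:
    text = infile.read()

# Load the audio as mono at 16 kHz
y, sr = librosa.load("audio.wav", sr=16000, mono=True)

# Create a session and pass it to align()
session = create_session()

for alignment in align(y, sr, text, session=session):
    print(alignment)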
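When aligning several recordings, the same call pattern can be wrapped in a small loop. The sketch below is hypothetical: `align_many` is not part of kfa, and it assumes that a single session can be reused across calls. The align function is passed in as a parameter, so the helper only relies on the call shape shown above, `align(y, sr, text, session=session)`.

```python
def align_many(items, align_fn, session):
    """Align several recordings with one shared session.

    `items` is an iterable of (name, samples, sample_rate, text) tuples.
    Returns a dict mapping each name to the list of alignments yielded
    by `align_fn` for that recording.
    """
    results = {}
    for name, samples, sample_rate, text in items:
        # Materialize the iterable so results can be inspected later
        results[name] = list(align_fn(samples, sample_rate, text, session=session))
    return results


# Intended usage with kfa (assumes kfa and librosa are installed):
#   from kfa import align, create_session
#   results = align_many(items, align, create_session())
```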
References
- MMS: Scaling Speech Technology to 1000+ languages
- CTC Forced Alignment API Tutorial
- Phonetisaurus
- Fine-Tune Wav2Vec2 for English ASR with 🤗 Transformers
- Thai Wav2vec2 model to ONNX model
License
Apache-2.0