A small collection of tools related to mispronunciation detection and diagnosis (MDD) systems.
cacoepy
cacoepy is a small collection of tools related to mispronunciation detection and diagnosis (MDD) systems.
Installation
Use the package manager pip to install cacoepy.
pip install cacoepy
Usage
AlignARPAbet2
The AlignARPAbet2 class aligns two sequences of ARPAbet phonemes while taking phoneme similarity into account. Typically, sequence aligners focus only on identifying matches and mismatches. For a more realistic alignment of the phonemes in mispronounced speech against the intended phonemes, however, it is important to consider how similar each pair of phonemes is.
When creating the instance, specify a gap penalty. A more negative value discourages the insertion of gaps.
from cacoepy.aligner import AlignARPAbet2
from cacoepy.core.utils import pretty_sequences
aligner = AlignARPAbet2(gap_penalty=-4)
target_phonemes = "th er m aa m ah t er".split(" ")
mispronounced_phonemes = "uw ao m eh d er".split(" ")
aligned_mispronounced, aligned_target, score = aligner(mispronounced_phonemes, target_phonemes)
pretty_sequences(aligned_target, aligned_mispronounced)
Output:
th er m aa m ah t er
- uw - ao m eh d er
In this example, many of the phonemes in this child’s transcription of “thermometer” are substituted or deleted. Despite this, AlignARPAbet2 finds a good alignment by factoring in the similarity between pairs such as er and uw. For comparison, the editops function from the Python package Levenshtein aligns the same sequences as:
th er m aa m ah t er
uw ao m eh d - - er
That alignment is based only on exact matches. Further implementation details can be found here.
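To illustrate how similarity-aware scoring and a gap penalty interact, here is a minimal sketch of a Needleman-Wunsch style global aligner. It is not the AlignARPAbet2 implementation: the function names `align_with_similarity` and `toy_similarity`, the table of similar pairs, and all score values are assumptions made up for this example.

```python
from typing import Callable, List, Tuple

GAP = "-"

# Hypothetical similar pairs, chosen only for illustration.
SIMILAR_PAIRS = {
    frozenset(("er", "uw")),
    frozenset(("aa", "ao")),
    frozenset(("ah", "eh")),
    frozenset(("t", "d")),
}


def toy_similarity(a: str, b: str) -> int:
    """Assumed scores: exact match 3, similar pair 1, anything else -3."""
    if a == b:
        return 3
    if frozenset((a, b)) in SIMILAR_PAIRS:
        return 1
    return -3


def align_with_similarity(
    seq_a: List[str],
    seq_b: List[str],
    similarity: Callable[[str, str], int],
    gap_penalty: int = -4,
) -> Tuple[List[str], List[str], int]:
    """Global alignment that scores substitutions by similarity."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] is the best score for aligning seq_a[:i] with seq_b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        score[0][j] = j * gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(seq_a[i - 1], seq_b[j - 1]),
                score[i - 1][j] + gap_penalty,  # gap in seq_b
                score[i][j - 1] + gap_penalty,  # gap in seq_a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (
            i > 0
            and j > 0
            and score[i][j]
            == score[i - 1][j - 1] + similarity(seq_a[i - 1], seq_b[j - 1])
        ):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap_penalty:
            out_a.append(seq_a[i - 1])
            out_b.append(GAP)
            i -= 1
        else:
            out_a.append(GAP)
            out_b.append(seq_b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1], score[n][m]


target = "th er m aa m ah t er".split(" ")
mispronounced = "uw ao m eh d er".split(" ")
aligned_t, aligned_m, total = align_with_similarity(
    target, mispronounced, toy_similarity, gap_penalty=-4
)
print(" ".join(aligned_t))
print(" ".join(aligned_m))
```

With these toy scores the result will not necessarily match the AlignARPAbet2 output above. The point is only that a positive score for a similar pair lets the aligner prefer a substitution over two gaps, and that a more negative `gap_penalty` makes gaps less attractive.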
align_prediction_to_annotation_and_target
Given three sets of phoneme sequences:
- target: The phonemes the speaker is attempting to say.
- annotation: The annotation of how the speaker pronounced the target.
- prediction: The output of an MDD system predicting what the speaker said.
This function aligns the prediction sequence to the annotation sequence while preserving the existing alignment between annotation and target.
from cacoepy.aligner import align_prediction_to_annotation_and_target
from cacoepy.core.utils import pretty_sequences
target = "th er m aa m ah t er".split(" ")
annotation = "- uw - ao m eh d er".split(" ")
prediction = "uw aa ao m eh d uh er".split(" ")
aligned_pred, aligned_ann, aligned_tar = align_prediction_to_annotation_and_target(
    target_aligned_with_annotation=target,
    annotation_aligned_with_target=annotation,
    prediction=prediction,
)
pretty_sequences(aligned_tar, aligned_ann, aligned_pred)
Output:
th er m aa m ah t - er
- uw - ao m eh d - er
- uw aa ao m eh d uh er
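Because the three rows share the same columns, they can be compared position by position. The sketch below shows one way to do that, using category names common in the MDD literature (true acceptance, false rejection, false acceptance, true rejection). It is not part of cacoepy: the function name `classify_columns` is hypothetical, and treating the gap symbol as an ordinary phoneme is a simplifying assumption.

```python
from collections import Counter
from typing import Sequence


def classify_columns(
    target: Sequence[str],
    annotation: Sequence[str],
    prediction: Sequence[str],
) -> Counter:
    """Count MDD outcome categories over three equally long aligned rows."""
    if not (len(target) == len(annotation) == len(prediction)):
        raise ValueError("aligned sequences must have equal length")
    counts: Counter = Counter()
    for tar, ann, pred in zip(target, annotation, prediction):
        pronounced_correctly = ann == tar  # annotator heard the target
        accepted = pred == tar  # system output agrees with the target
        if pronounced_correctly and accepted:
            counts["true_acceptance"] += 1
        elif pronounced_correctly:
            counts["false_rejection"] += 1
        elif accepted:
            counts["false_acceptance"] += 1
        else:
            counts["true_rejection"] += 1
            # For rejected mispronunciations, check whether the system
            # also identified the same phoneme as the annotator.
            if pred == ann:
                counts["correct_diagnosis"] += 1
            else:
                counts["diagnosis_error"] += 1
    return counts


# The three aligned rows from the output above.
aligned_tar = "th er m aa m ah t - er".split(" ")
aligned_ann = "- uw - ao m eh d - er".split(" ")
aligned_pred = "- uw aa ao m eh d uh er".split(" ")

counts = classify_columns(aligned_tar, aligned_ann, aligned_pred)
for name, value in sorted(counts.items()):
    print(f"{name}: {value}")
```

For the rows above this counts 2 true acceptances, 1 false rejection (the extra uh in the prediction), and 6 true rejections, of which 5 are correct diagnoses and 1 is a diagnosis error.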
Future Features
- AlignARPAbet3: Aligns 3 sets of phonemes.
- mdd_phoneme_metrics: Evaluation metrics for MDD systems.
License