A Python module for aligning mispronounced phonemes.
cacoepy
cacoepy is a small collection of tools related to mispronunciation detection and diagnosis (MDD) systems.
Installation
Download this repository and then run:
pip install .
Phoneme Alignment
The AlignARPAbet2 class aligns two sequences of ARPAbet phonemes while taking phoneme similarity into account. Sequence aligners typically distinguish only between matches and mismatches. For a more realistic alignment of mispronounced speech against the intended phonemes, however, it is important to consider how similar each pair of phonemes is.
Usage
When creating the instance, specify a gap penalty. A more negative value discourages the insertion of gaps.
from cacoepy.aligner import AlignARPAbet2
aligner = AlignARPAbet2(gap_penalty=-4)
target_phonemes = "th er m aa m ah t er".split(" ")
mispronounced_phonemes = "uw ao m eh d er".split(" ")
aligned_mispronounced, aligned_target, score = aligner(mispronounced_phonemes, target_phonemes)
Resulting Alignment:
th er m aa m ah t er
- uw - ao m eh d er
In this example, many of the phonemes are substituted or deleted in this transcription of a child's pronunciation of “thermometer.” Despite this, AlignARPAbet2 finds a good alignment by factoring in the similarity between pairs such as er and uw. For comparison, the editops alignment from the Python package Levenshtein for the same sequences is:
th er m aa m ah t er
uw ao m eh d - - er
This alignment is based on exact matches only, so similar but non-identical phonemes are not paired up.
Implementation
AlignARPAbet2 uses the Needleman-Wunsch algorithm with a custom similarity matrix that assigns a score to each phoneme pair. To generate the similarity matrix, the phonemes are broken down into 35 attributes that describe how they are articulated. Each phoneme may have several attributes (see data/ARPAbet_mapping.json for the breakdown). By marking which attributes are present or absent, each phoneme is represented as a vector in a 35-dimensional attribute space. The cosine similarity between each pair of phoneme vectors is then calculated and stored in a lookup table, which supplies the pair scores to the Needleman-Wunsch algorithm during alignment.
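The construction of the lookup table can be sketched as follows. This is a minimal illustration, not the package's actual code: the attribute names and the phoneme-to-attribute mapping below are hypothetical and far smaller than the 35 attributes in data/ARPAbet_mapping.json, and the helper names (to_vector, cosine_similarity, build_similarity_table) are assumptions made for this example.

```python
import math

# Hypothetical attribute inventory, for illustration only.
# The real mapping uses 35 attributes.
ATTRIBUTES = [
    "consonant", "vowel", "nasal", "stop", "voiced",
    "alveolar", "bilabial", "back", "front",
]

# Hypothetical phoneme-to-attribute mapping, for illustration only.
PHONEME_ATTRIBUTES = {
    "m": {"consonant", "nasal", "voiced", "bilabial"},
    "t": {"consonant", "stop", "alveolar"},
    "d": {"consonant", "stop", "voiced", "alveolar"},
    "aa": {"vowel", "voiced", "back"},
    "eh": {"vowel", "voiced", "front"},
}


def to_vector(attrs):
    """Encode a set of attributes as a binary vector over ATTRIBUTES."""
    return [1.0 if name in attrs else 0.0 for name in ATTRIBUTES]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_similarity_table(mapping):
    """Precompute the similarity of every ordered phoneme pair."""
    vectors = {phoneme: to_vector(attrs) for phoneme, attrs in mapping.items()}
    return {
        (p, q): cosine_similarity(vectors[p], vectors[q])
        for p in vectors
        for q in vectors
    }


similarity = build_similarity_table(PHONEME_ATTRIBUTES)
print(round(similarity[("t", "d")], 3))   # t and d share three attributes
print(round(similarity[("t", "aa")], 3))  # t and aa share none
```

With binary vectors, the cosine similarity is the number of shared attributes divided by the geometric mean of the two attribute counts, so phonemes that share most of their articulation score close to 1 and phonemes with nothing in common score 0.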
A visual representation of the similarity matrix is shown below. The clear separation of consonants and vowels is apparent in the sub-squares.
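To make the alignment step concrete, here is a minimal sketch of Needleman-Wunsch global alignment with a pluggable pair-scoring function. It is not the implementation used by AlignARPAbet2: the function name needleman_wunsch, its signature, and the toy_score function are assumptions made for this example. In a similarity-aware aligner, the scoring function would look up a value derived from the phoneme similarity matrix rather than testing for exact matches.

```python
def needleman_wunsch(seq_a, seq_b, score_fn, gap_penalty=-4, gap="-"):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    n, m = len(seq_a), len(seq_b)

    # table[i][j] holds the best score for aligning seq_a[:i] with seq_b[:j].
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = table[i - 1][0] + gap_penalty
    for j in range(1, m + 1):
        table[0][j] = table[0][j - 1] + gap_penalty

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = max(
                table[i - 1][j - 1] + score_fn(seq_a[i - 1], seq_b[j - 1]),
                table[i - 1][j] + gap_penalty,  # gap inserted in seq_b
                table[i][j - 1] + gap_penalty,  # gap inserted in seq_a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and table[i][j]
                == table[i - 1][j - 1] + score_fn(seq_a[i - 1], seq_b[j - 1])):
            aligned_a.append(seq_a[i - 1])
            aligned_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and table[i][j] == table[i - 1][j] + gap_penalty:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append(gap)
            i -= 1
        else:
            aligned_a.append(gap)
            aligned_b.append(seq_b[j - 1])
            j -= 1

    return aligned_a[::-1], aligned_b[::-1], table[n][m]


def toy_score(p, q):
    """Stand-in scorer: exact match only. A similarity lookup would go here."""
    return 1.0 if p == q else -1.0


target = ["m", "eh", "d"]
spoken = ["m", "d"]
aligned_target, aligned_spoken, score = needleman_wunsch(
    target, spoken, toy_score, gap_penalty=-1
)
print(" ".join(aligned_target))  # m eh d
print(" ".join(aligned_spoken))  # m - d
```

Replacing toy_score with a function that returns graded values for similar phonemes changes which cells win in the table, which is how a similarity matrix can steer the aligner toward pairing er with uw instead of inserting gaps.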