Statistical Analysis of Phoneme Discrimination Data
Package ConfMatrixCalc implements probabilistic Bayesian analysis of closed-set identification / discrimination tests, in which the results may be recorded in the form of confusion matrices. The package estimates probabilities of correct response for each test item, and probabilities for all allowed incorrect response alternatives for each stimulus.
The package was developed for analyzing phoneme identification test results, but it can be used to analyze results from any type of closed-set classification, performed by either humans or machines. The analysis approach was presented and validated in (Leijon et al., 2016).
Phoneme identification tests are used, for example, to evaluate the detailed ("microscopic") speech-recognition ability of listeners using two or more different hearing aids or other sound-transmission instruments or algorithms. Phoneme identification performance can be tested with real words with minimal phonemic contrasts (e.g., Fairbanks, 1958; Witte et al., 2024), or using nonsense "words" with a fixed structure, e.g., CV (Miller & Nicely, 1955), or VCV, or CVCVC, where C is a consonant and V is a vowel.
Early speech research showed that the phoneme identification ability with nonsense syllables is correlated with general sentence understanding (Fletcher and Steinberg, 1929, Fig. 11).
Confusion Matrices
The results from a closed-set test may be recorded (e.g., Miller & Nicely, 1955) as a two-dimensional table of confusion counts. The table element with index (s, r) shows how many times the listener responded with the r-th category when the s-th stimulus was presented.
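As a minimal sketch of this tabulation (not the package's own code), the following builds a small confusion-count table from a list of trials. The category labels and the trial list are made up for illustration.

```python
from collections import Counter

# Hypothetical closed set of categories and a made-up trial log:
# one (stimulus, response) pair per presentation.
categories = ["p", "t", "k"]
trials = [("p", "p"), ("p", "t"), ("t", "t"), ("t", "t"),
          ("k", "k"), ("k", "p"), ("p", "p")]

pair_counts = Counter(trials)

# counts[s][r] = number of times response r was given to stimulus s.
counts = [[pair_counts[(s, r)] for r in categories] for s in categories]

for stimulus, row in zip(categories, counts):
    print(stimulus, row)
# p [2, 1, 0]
# t [0, 2, 0]
# k [1, 0, 1]
```

Correct responses end up on the main diagonal, and each row sums to the number of presentations of that stimulus.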
If the test includes several subgroups of items with minimal contrast, and responses are allowed only among the alternatives in each such subgroup (e.g., "Rhyme Test", Fairbanks, 1958), a complete confusion matrix would be very sparse, with nonzero counts only in block matrices along the main diagonal. Then it is most convenient to record the test results in a "long-format" table, with a separate row for each stimulus-response pair.
The statistical analysis of closed-set identification data is non-trivial, because the matrix is usually quite sparse for each listener. For example, in a consonant-identification test with 16 consonants, each stimulus type might be presented, say, five times. Each matrix row then has 16 elements but only five responses, so between 11 and 15 of its elements have a zero count. Furthermore, the response counts for each stimulus category are statistically dependent, because they must sum to the number of presentations of that stimulus.
This makes it difficult to estimate underlying response probabilities and to quantify the statistical reliability of observed test results. The Bayesian analysis method handles these problems in a coherent manner.
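To illustrate why a Bayesian treatment copes with zero counts, here is a deliberately simplified sketch for a single matrix row. It assumes a symmetric Dirichlet prior on the response probabilities of one stimulus, which is an assumption made for this example only. It is not the hierarchical model implemented in the package. Under that assumption the marginal posterior of one response probability is a Beta distribution. The function name and the prior value are hypothetical.

```python
import random

def credible_interval(count, total, n_categories, prior=1.0,
                      n_samples=20000, seed=0):
    """Posterior mean and equal-tailed 95% credible interval for one
    response probability in a single confusion-matrix row.

    Assumes a symmetric Dirichlet(prior) prior on the row, so the marginal
    posterior is Beta(prior + count,
                      (n_categories - 1) * prior + total - count).
    """
    a = prior + count
    b = (n_categories - 1) * prior + (total - count)
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(n_samples))
    lo = samples[round(0.025 * n_samples)]
    hi = samples[round(0.975 * n_samples) - 1]
    return a / (a + b), lo, hi

# A response category that was never chosen in 5 presentations, 16 categories.
mean, lo, hi = credible_interval(count=0, total=5, n_categories=16)
print(f"mean={mean:.3f}, 95% interval=({lo:.4f}, {hi:.4f})")
```

Even with a zero count, the estimated probability is small but not exactly zero, and the interval width reflects how little data the row contains.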
Analysis Results
- Overall performance is indicated by two measures, each with a credible range to indicate the uncertainty of the estimate, and credible differences between Test Conditions:
  - Probability of Correct identification (PC), across all presented test stimuli.
  - The Mutual Information (MI) between stimulus and response (Miller and Nicely, 1955), sometimes called "transmitted information". This measure indicates the average amount of information about the stimulus category that the listener receives from each presented phoneme.
- Detailed performance is shown by a credible confusion pattern, i.e., a set of stimulus-response pairs where listeners' response probabilities are jointly credibly different between test conditions.
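For orientation, the two overall measures can be computed directly from raw counts as simple plug-in estimates. The sketch below does only that. It is not the package's Bayesian estimate and gives no credible range. Plug-in MI also tends to overestimate the true value when counts are sparse. The function names are hypothetical.

```python
import math

def percent_correct(counts):
    """Proportion of correct responses: diagonal sum over total count."""
    total = sum(sum(row) for row in counts)
    correct = sum(counts[i][i] for i in range(len(counts)))
    return correct / total

def mutual_information(counts):
    """Plug-in estimate of stimulus-response mutual information, in bits."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for s, row in enumerate(counts):
        for r, n in enumerate(row):
            if n > 0:  # cells with zero count contribute nothing
                mi += (n / total) * math.log2(
                    n * total / (row_sums[s] * col_sums[r]))
    return mi

# Made-up 2 x 2 confusion counts (rows: stimuli, columns: responses).
counts = [[4, 1],
          [2, 3]]
pc = percent_correct(counts)
mi = mutual_information(counts)
print(f"PC = {pc:.2f}, MI = {mi:.3f} bits")
```

With two equally frequent stimuli, MI ranges from 0 bits (responses unrelated to the stimulus) to 1 bit (perfect identification).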
The Bayesian model is hierarchical. The package can estimate distributions of results for
- an unseen random individual in the population from which test participants were recruited,
- the mean of the population from which participants were recruited,
- an overview of the participant group recruited from each population,
- individual participants in each group.
Phoneme Identification Experiments
The package can analyze data from simple or rather complex experimental designs, including the following features:
- Phoneme identification data may be collected in one or more Test Conditions. Each test condition may be a combination of categories from several Test Factors. For example, the main test factor may be Hearing Aid, with categories A, B, or Unaided. Another test factor may be, e.g., Background, with categories Quiet or Noisy. The analysis shows credible differences between categories in user-selected test factor(s).
- The study may involve one or more distinct Populations, from which separate groups of participants have been recruited.
- Populations are distinguished by a combination of categories from one or more Group Factors. For example, one dimension may be Age, with categories Young, Middle, or Old. Another dimension may be, e.g., Hearing Loss, with categories None, Mild, or Moderate. The analysis shows credible differences between categories in user-selected group factor(s).
- The analysis model places no requirements on the number of participants or the number of presentations for each stimulus category. The analysis estimates the statistical credibility of all results, given the limited amount of collected data. Of course, the reliability is improved if there are many participants from each population, each tested with a large number of stimulus items.
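The relation between test factors and test conditions described above can be sketched as a Cartesian product. The dictionary layout below is only an illustration and is not the way the package expects the design to be specified. See the comments in the template script for that.

```python
from itertools import product

# Example test factors and their categories, as named in the text above.
test_factors = {
    "Hearing Aid": ["A", "B", "Unaided"],
    "Background": ["Quiet", "Noisy"],
}

# Each test condition combines one category from every test factor.
test_conditions = [
    dict(zip(test_factors, combo))
    for combo in product(*test_factors.values())
]

print(len(test_conditions))  # 6
print(test_conditions[0])    # {'Hearing Aid': 'A', 'Background': 'Quiet'}
```

Populations can be enumerated from group factors in the same way.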
Package Documentation
General information is given in the package doc-string, which may be accessed in a Python session, after importing the package, by the command
help(ConfMatrixCalc)
The template script run_cm.py includes comments explaining the required user input. Input data can be read from files in several of the formats that the Pandas package can handle, e.g., .csv, .xlsx. The simulation script run_sim.py generates data files that illustrate the preferred style of tabulated test results.
After running an analysis, the logging output briefly explains the analysis results presented in figures and tables.
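As a rough illustration of a long-format table handled with Pandas, the sketch below reads a small in-memory .csv text and cross-tabulates it into confusion counts. The column names Subject, Stimulus, and Response are assumptions made for this example. The layout actually preferred by the package is the one produced by run_sim.py.

```python
import io

import pandas as pd

# Hypothetical long-format data: one row per presented stimulus.
csv_text = """Subject,Stimulus,Response
S1,p,p
S1,p,t
S1,t,t
S2,p,p
S2,t,p
S2,t,t
"""

df = pd.read_csv(io.StringIO(csv_text))

# Pool all subjects into one confusion-count table
# (rows: stimuli, columns: responses).
table = pd.crosstab(df["Stimulus"], df["Response"])
print(table)
```

For a real file, the io.StringIO object would be replaced by a file path, and pd.read_excel could be used for .xlsx input.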
Usage
- Install the most recent package version:
python3 -m pip install --upgrade ConfMatrixCalc
- For an introduction to the input data format, you may want to study and run the included simulation script:
python3 run_sim.py
For an introduction to the possible analysis results, you can then analyse the simulated data set by running the template script:
python3 run_cm.py
- To analyse a real data set, copy the template script run_cm.py to your work directory, rename it, and edit the copy as guided by comments in the template, to specify
  - all stimuli and closed-set response categories, and your experimental design,
  - the top input data directory,
  - a directory where all output result files will be stored,
  - your desired set of analysis results.
- Run your edited script:
python3 run_my_cm.py
Requirements
This package requires Python 3.12 or later, together with NumPy, SciPy, and Matplotlib, as well as the support package samppy and the Openpyxl package for reading data from Excel workbook documents. The pip installer will check for these required packages and install them if needed.
Pandas can also read input files and write result tables in some other formats, but may then need other support packages that must be installed manually.
New in current version 0.8
- This version can analyse either complete confusion matrices (like Miller & Nicely, 1955), or results from tests using several subsets of stimuli, each with a closed response set (Fairbanks, 1958; Witte et al., 2024).
- Input and output files are handled by the Pandas package.
- The analysis can use either a point- or sample-estimated model of response probabilities in each population.
References
A. Leijon, G. E. Henter, and M. Dahlquist (2016). Bayesian analysis of phoneme confusion matrices. IEEE Trans Audio, Speech, and Language Proc 24(3):469–482. doi: 10.1109/TASLP.2015.2512039.
A. Leijon (2026). ConfMatrixCalc --- Bayesian Analysis of Phoneme Identification Data. Documentation: Theory, Validation, and Computational Details. Contact the author for a copy.
G. Fairbanks (1958). Test of phonemic differentiation: The rhyme test. J Acoust Soc Amer 30(7):596–600. doi: 10.1121/1.1909702.
H. Fletcher and J. Steinberg (1929). Articulation testing methods. Bell System Technical Journal 8:806–854. doi: 10.1002/j.1538-7305.1929.tb01246.x.
G. A. Miller and P. E. Nicely (1955). An analysis of perceptual confusions among some English consonants. J Acoust Soc Amer 27(2):338–352. doi: 10.1121/1.1907526.
E. Witte, J. Ekeroot, and S. Köbler (2024). The development of linguistic stimuli for the Swedish situated phoneme test. Nordic Journal of Linguistics 47(1):73–110. doi: 10.1017/S0332586521000275.
Acknowledgment
This Python package is a generalization of a similar MatLab package, developed by Arne Leijon for ORCA Europe, Widex A/S, Stockholm, Sweden. The MatLab development was financially supported by Widex A/S, Denmark.