Statistical Analysis of Phoneme Discrimination Data

Project description

Package ConfMatrixCalc implements probabilistic Bayesian analysis of closed-set identification / discrimination tests, in which the results may be recorded in the form of confusion matrices. The package estimates probabilities of correct response for each test item, and probabilities for all allowed incorrect response alternatives for each stimulus.

The package was developed for analyzing phoneme identification test results, but it can be used to analyze results from any type of closed-set classification, performed by either humans or machines. The analysis approach was presented and validated in (Leijon et al., 2016).

Phoneme identification tests are used, for example, to evaluate the detailed ("microscopic") speech-recognition ability of listeners using two or more different hearing aids or other sound-transmission instruments or algorithms. Phoneme identification performance can be tested with real words with minimal phonemic contrasts (e.g., Fairbanks, 1958; Witte et al., 2024), or using nonsense "words" with a fixed structure, e.g., CV (Miller & Nicely, 1955), or VCV, or CVCVC, where C is a consonant and V is a vowel.

Early speech research showed that the phoneme identification ability with nonsense syllables is correlated with general sentence understanding (Fletcher and Steinberg, 1929, Fig. 11).

Confusion Matrices

The results from a closed-set test may be recorded (e.g., Miller & Nicely, 1955) as a two-dimensional table of confusion counts. The table element with index (s, r) shows how many times the listener gave the r-th response category when the s-th stimulus was presented.
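As a minimal sketch of this layout, the following builds a confusion count matrix from a list of trials. The category labels and trials are invented for illustration, and the code is not taken from the package.

```python
import numpy as np

# Hypothetical closed set of three categories, for illustration only.
categories = ["p", "t", "k"]
index = {c: i for i, c in enumerate(categories)}

# Each trial is a (stimulus, response) pair; these trials are invented.
trials = [("p", "p"), ("p", "t"), ("t", "t"),
          ("t", "t"), ("k", "p"), ("k", "k")]

# counts[s, r] = number of times response r was given to stimulus s.
counts = np.zeros((len(categories), len(categories)), dtype=int)
for stim, resp in trials:
    counts[index[stim], index[resp]] += 1

print(counts)
```

Each row sums to the number of presentations of that stimulus, and correct responses lie on the main diagonal.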

If the test includes several subgroups of items with minimal contrast, and responses are allowed only among the alternatives in each such subgroup (e.g., "Rhyme Test", Fairbanks, 1958), a complete confusion matrix would be very sparse, with nonzero counts only in block matrices along the main diagonal. Then it is most convenient to record the test results in a "long-format" table, with a separate row for each stimulus-response pair.
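A long-format table of this kind might look as follows. This is only a sketch: the column names, subsets, and words are assumptions made for illustration, not the format required by the package. The pivot step shows how sparse the equivalent full matrix becomes.

```python
import pandas as pd

# Hypothetical long-format records: one row per stimulus-response pair.
# Column names and test words are invented for illustration.
long_table = pd.DataFrame({
    "subset":   ["A", "A", "A", "B", "B"],
    "stimulus": ["bat", "bat", "pat", "sin", "fin"],
    "response": ["bat", "pat", "pat", "sin", "sin"],
    "count":    [4, 1, 5, 5, 2],
})

# Expand to a full stimulus-by-response matrix.
# Cells for pairs that never occurred, including all cross-subset
# pairs, are filled with zeros.
full = long_table.pivot_table(index="stimulus", columns="response",
                              values="count", aggfunc="sum", fill_value=0)
print(full)
```

The long table stores only the five observed pairs, whereas the full matrix must also hold every zero cell.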

The statistical analysis of closed-set identification data is non-trivial, because the matrix is usually quite sparse for each listener. For example, in a consonant-identification test with 16 consonants, each stimulus type might be presented, say, five times. Then each matrix row will have 11 to 15 elements with a zero count, since five responses can occupy at most five of the 16 cells in the row. Furthermore, the response counts for each stimulus category are statistically dependent.
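The zero-count bounds in this example can be checked with a short simulation. The listener's response probabilities below are assumed for illustration only.

```python
import numpy as np

n_categories = 16     # consonants in the closed set
n_presentations = 5   # presentations per stimulus

# A row with 5 responses has between 1 and 5 nonzero cells.
min_zeros = n_categories - min(n_presentations, n_categories)
max_zeros = n_categories - 1

# Simulated listener: assumed 70% correct, errors spread evenly.
rng = np.random.default_rng(0)
p_correct = 0.7
probs = np.full((n_categories, n_categories),
                (1.0 - p_correct) / (n_categories - 1))
np.fill_diagonal(probs, p_correct)

# One multinomial draw per stimulus row.
counts = np.array([rng.multinomial(n_presentations, probs[s])
                   for s in range(n_categories)])
zeros_per_row = (counts == 0).sum(axis=1)
print(min_zeros, max_zeros, zeros_per_row)
```

The counts within a row are dependent because they must sum to the fixed number of presentations: one more response in one cell means one fewer elsewhere in the same row.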

This makes it difficult to estimate underlying response probabilities and to quantify the statistical reliability of observed test results. The Bayesian analysis method handles these problems in a coherent manner.
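To illustrate the general idea only, the sketch below applies a simple conjugate Dirichlet-multinomial update to one matrix row. This is a deliberately simplified, non-hierarchical stand-in, not the model implemented in the package (see Leijon et al., 2016, for that). The prior and the counts are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented counts for one stimulus among 4 response categories;
# index 0 is taken to be the correct response.
row_counts = np.array([3, 1, 1, 0])

# Assumed symmetric Dirichlet prior; the package's prior may differ.
prior = np.full(row_counts.shape, 0.5)

# Conjugate update: the posterior is Dirichlet(prior + counts).
posterior = prior + row_counts
samples = rng.dirichlet(posterior, size=10_000)

# Posterior mean and 95% credible interval for P(correct).
p_correct_mean = posterior[0] / posterior.sum()
ci_low, ci_high = np.quantile(samples[:, 0], [0.025, 0.975])
print(p_correct_mean, ci_low, ci_high)
```

Every sampled probability vector sums to one, so the dependence between the response categories in a row is carried through to any derived quantity. The zero count in the last cell still yields a nonzero posterior probability.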

Analysis Results

  1. Overall performance is indicated by two measures, each with a credible range to indicate the uncertainty of the estimate, and credible differences between Test Conditions:

    1. Probability of Correct identification (PC), across all presented test stimuli.

    2. The Mutual Information (MI) between stimulus and response (Miller & Nicely, 1955), sometimes called "transmitted information". This measure indicates the average amount of information about the stimulus category that the listener receives from each presented phoneme.

  2. Detailed performance is shown by a credible confusion pattern, i.e., a set of stimulus-response pairs for which listeners' response probabilities are jointly credibly different between test conditions.
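For orientation, the two overall measures can be computed directly from a confusion count matrix as simple plug-in point estimates. The sketch below does that. It is not the package's Bayesian estimator and gives no credible range, and the function names are hypothetical.

```python
import numpy as np

def proportion_correct(counts):
    """Plug-in estimate of PC: the diagonal share of all responses."""
    counts = np.asarray(counts, dtype=float)
    return float(np.trace(counts) / counts.sum())

def mutual_information_bits(counts):
    """Plug-in estimate of MI between stimulus and response, in bits."""
    counts = np.asarray(counts, dtype=float)
    joint = counts / counts.sum()             # P(s, r)
    p_s = joint.sum(axis=1, keepdims=True)    # P(s), column vector
    p_r = joint.sum(axis=0, keepdims=True)    # P(r), row vector
    indep = p_s @ p_r                         # P(s) P(r)
    nz = joint > 0                            # 0 * log(0) is taken as 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

perfect = np.eye(4) * 5        # four categories, always correct
guessing = np.ones((4, 4))     # responses unrelated to the stimulus
print(proportion_correct(perfect), mutual_information_bits(perfect))
print(proportion_correct(guessing), mutual_information_bits(guessing))
```

With four equally frequent categories, perfect identification transmits 2 bits per stimulus and pure guessing transmits none. On sparse counts the plug-in MI estimate tends to be biased upward.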

The Bayesian model is hierarchical. The package can estimate distributions of results for

  • an unseen random individual in the population from which test participants were recruited,
  • the mean of the population from which participants were recruited,
  • an overview of the participant group recruited from each population,
  • individual participants in each group.

Phoneme Identification Experiments

The package can analyze data from simple or rather complex experimental designs, including the following features:

  1. Phoneme identification data may be collected in one or more Test Conditions. Each test condition may be a combination of categories from several Test Factors. For example, the main test factor may be Hearing Aid, with categories A, B, or Unaided. Another test factor may be, e.g., Background, with categories Quiet, or Noisy. The analysis shows credible differences between categories in user-selected test factor(s).

  2. The study may involve one or more distinct Populations, from which separate groups of participants have been recruited.

  3. Populations are distinguished by a combination of categories from one or more Group Factors. For example, one dimension may be Age, with categories Young, Middle, or Old. Another dimension may be, e.g., Hearing Loss, with categories None, Mild, or Moderate. The analysis shows credible differences between categories in user-selected group factor(s).

  4. The analysis model imposes no requirements on the number of participants or the number of presentations for each stimulus category. The analysis estimates the statistical credibility of all results, given the limited amount of collected data. Of course, the reliability improves if there are many participants from each population, each tested with a large number of stimulus items.
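The way test conditions and populations arise as combinations of factor categories can be sketched with plain dictionaries. This representation and the helper name are assumptions for illustration. The package's own configuration is described in the template script.

```python
from itertools import product

# Factor names and categories taken from the examples above.
test_factors = {
    "Hearing Aid": ["A", "B", "Unaided"],
    "Background": ["Quiet", "Noisy"],
}
group_factors = {
    "Age": ["Young", "Middle", "Old"],
    "Hearing Loss": ["None", "Mild", "Moderate"],
}

def all_combinations(factors):
    """Return one dict per combination of categories (hypothetical helper)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

test_conditions = all_combinations(test_factors)
populations = all_combinations(group_factors)
print(len(test_conditions), len(populations))
```

Here the two test factors give 3 x 2 = 6 test conditions, and the two group factors give 3 x 3 = 9 possible populations. A real study need not include all of them.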

Package Documentation

General information is given in the package docstring, which may be accessed with the command help(ConfMatrixCalc). The template script run_cm.py includes comments explaining the required user input.

Input data can be accessed from files in several of the formats that package Pandas can handle, e.g., .csv, .xlsx. The simulation script run_sim.py generates data files that illustrate the preferred style of tabulated test results.
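As a rough illustration of tabulated input read with Pandas, the sketch below parses a small CSV text and summarizes it per test condition. The column names and data are invented. The files generated by run_sim.py show the layout the package actually expects.

```python
import io
import pandas as pd

# Invented example data; column names are assumptions.
csv_text = """participant,hearing_aid,stimulus,response
S1,A,p,p
S1,A,t,p
S1,B,p,p
S1,B,t,t
S2,A,p,p
S2,A,t,t
"""

# read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

# Raw proportion of correct responses per test condition.
data["correct"] = data["stimulus"] == data["response"]
prop_correct = data.groupby("hearing_aid")["correct"].mean()
print(prop_correct)
```

For Excel workbooks, pandas.read_excel plays the same role as read_csv does here.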

After running an analysis, the logging output briefly explains the analysis results presented in figures and tables.

Usage

  1. Install the most recent package version:

python3 -m pip install --upgrade ConfMatrixCalc

  2. For an introduction to the input data format, you may want to study and run the included simulation script: python3 run_sim.py. For an introduction to the possible analysis results, you can then analyse the simulated data set by running the template script: python3 run_cm.py.

  3. To analyse a real data set, copy the template script run_cm.py to your work directory, rename it, and edit the copy as guided by comments in the template, to specify

    • all stimuli and closed-set response categories, and your experimental design,
    • the top input data directory,
    • a directory where all output result files will be stored,
    • your desired set of analysis results.

  4. Run your edited script: python3 run_my_cm.py.

Requirements

This package requires Python 3.12 or later, using NumPy, SciPy, and Matplotlib, as well as a support package samppy, and the Openpyxl package for reading data from Excel workbook documents. The pip installer will check and install these required packages if needed.

Pandas can also read input files and write result tables in some other formats, but may then need other support packages that must be installed manually.

New in current version 0.8

  1. This version can analyse either complete confusion matrices (like Miller & Nicely, 1955), or results from tests using several subsets of stimuli, each with a closed response set (Fairbanks, 1958; Witte et al., 2024).
  2. Input and output files are handled by the Pandas package.
  3. The analysis can use either a point- or sample-estimated model of response probabilities in each population.

References

A. Leijon, G. E. Henter, and M. Dahlquist (2016). Bayesian analysis of phoneme confusion matrices. IEEE Trans Audio, Speech, and Language Proc 24(3):469–482. doi: 10.1109/TASLP.2015.2512039.

A. Leijon (2026). ConfMatrixCalc: Bayesian Analysis of Phoneme Identification Data. Documentation: Theory, Validation, and Computational Details. Contact the author for a copy.

G. Fairbanks (1958). Test of phonemic differentiation: The rhyme test. J Acoust Soc Amer 30(7):596–600. doi: 10.1121/1.1909702.

H. Fletcher and J. Steinberg (1929). Articulation testing methods. Bell System Technical Journal 8:806–854. doi: 10.1002/j.1538-7305.1929.tb01246.x.

G. A. Miller and P. E. Nicely (1955). An analysis of perceptual confusions among some English consonants. J Acoust Soc Amer 27(2):338–352. doi: 10.1121/1.1907526.

E. Witte, J. Ekeroot, and S. Köbler (2024). The development of linguistic stimuli for the Swedish situated phoneme test. Nordic Journal of Linguistics 47(1):73–110. doi: 10.1017/S0332586521000275.

Acknowledgment

This Python package is a generalization of a similar MATLAB package, developed by Arne Leijon for ORCA Europe, Widex A/S, Stockholm, Sweden. The MATLAB development was financially supported by Widex A/S, Denmark.
