The Phoneme Discovery Benchmark
[💾 Website] [📜 Paper] [📖 BibTex]
Introduction
The last several years have seen revolutionary improvements in both speech processing and textual natural language processing. In both cases, unsupervised or self-supervised pre-training has been key to models autonomously discovering representations that are highly useful for language tasks. Yet central to the study of human speech processing is the phoneme inventory, a small set of discrete units that abstract away from the massive pronunciation variability in the signal.
Discovering the correct set of phonemes for a language is crucial: encode the wrong categories, and contrasts between words are distorted or disappear; fail to categorize at all, and those contrasts stay hidden behind semantically irrelevant variation in the signal. While much attention has been paid to whether the (continuous or discrete) representations of unsupervised speech models are predictive of phonemes, this benchmark is the first to explicitly set the goal of learning a discrete set of categories in one-to-one correspondence with the phoneme inventory of a language.
Infants appear to learn the phoneme inventory of their language effortlessly, before they can speak. They benefit from millions of years of evolution of the human brain and body, giving them a learning architecture that allows them to thrive in the face of scarce and noisy language data, preparing them to learn the phoneme inventory of any human language.
The Phoneme Discovery benchmark is aimed at building models that discover phoneme inventories across various languages, using only small amounts of speech data, and without textual data during training.
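To make the "one-to-one correspondence" goal concrete, here is a toy sketch in plain Python. It is not the benchmark's scoring procedure and does not use the discophon package; the function names (majority_mapping, is_one_to_one) and the frame-level labels are hypothetical, chosen only for illustration. It maps each discovered unit to the phoneme it co-occurs with most often, then checks whether every phoneme of the inventory is covered by exactly one unit.

```python
from collections import Counter, defaultdict


def majority_mapping(units, phonemes):
    """Map each discovered unit to the phoneme it co-occurs with most often.

    `units` and `phonemes` are aligned, frame-level label sequences
    (hypothetical toy data, not the benchmark's input format).
    """
    if len(units) != len(phonemes):
        raise ValueError("units and phonemes must have the same length")
    counts = defaultdict(Counter)
    for unit, phoneme in zip(units, phonemes):
        counts[unit][phoneme] += 1
    return {unit: c.most_common(1)[0][0] for unit, c in counts.items()}


def is_one_to_one(mapping, inventory):
    """True when every phoneme in `inventory` is the target of exactly one unit."""
    targets = Counter(mapping.values())
    return set(targets) == set(inventory) and all(n == 1 for n in targets.values())


inventory = {"a", "t"}

# Two units, each aligned with a single phoneme: a one-to-one correspondence.
clean = majority_mapping([0, 0, 1, 1], ["a", "a", "t", "t"])

# Three units, two of which both land on "a": the category is over-split.
oversplit = majority_mapping([0, 0, 1, 1, 2], ["a", "a", "t", "t", "a"])

print(is_one_to_one(clean, inventory))
print(is_one_to_one(oversplit, inventory))
```

In the second case the units remain perfectly predictive of the phonemes, yet the discovered inventory is not the phoneme inventory, which is the distinction the benchmark draws.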
Installation
pip install discophon
To compute ABX discriminabilities, install the abx extra (the quotes keep shells such as zsh from expanding the brackets): pip install "discophon[abx]"
If you want to run baselines and have access to the utility scripts, clone this repository:
git clone https://github.com/bootphon/phoneme_discovery
cd phoneme_discovery
uv sync
# uv sync --all-extras --all-groups # If you want all dependencies
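After installing, a quick way to confirm that the package is visible to your interpreter is to query its metadata. This is a minimal sketch that relies only on the standard library (importlib.metadata); the helper name installed_version is hypothetical, and nothing here calls into discophon itself.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of `distribution`, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("discophon")
print(f"discophon: {found if found is not None else 'not installed'}")
```

If this reports "not installed" after a uv sync, the script is probably running outside the project's virtual environment; running it through uv run is one way to pick up the right interpreter.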
Usage
Check out the documentation.
Citation
Contact: benchmarks [at] cognitive-ml [dot] fr