The Phoneme Discovery Benchmark

DiscoPhon

Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

arXiv · GitHub · Website

DiscoPhon is a multilingual benchmark evaluating unsupervised phoneme discovery from discrete speech units. Given only 10 hours of speech in an unseen language, models must produce discrete units that map to a predefined phoneme inventory.

Getting started

References

@misc{poli2026discophon,
  title={{DiscoPhon}: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units},
  author={Maxime Poli and Manel Khentout and Angelo Ortiz Tandazo and Ewan Dunbar and Emmanuel Chemla and Emmanuel Dupoux},
  year={2026},
  eprint={2603.18612},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.18612},
}

Download files

Source Distribution

discophon-0.0.11.tar.gz (751.0 kB)

Built Distribution

discophon-0.0.11-py3-none-any.whl (39.2 kB)

File details

Details for the file discophon-0.0.11.tar.gz.

File metadata

  • Download URL: discophon-0.0.11.tar.gz
  • Upload date:
  • Size: 751.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.19 (subcommand: publish; distro: Ubuntu 24.04 "noble"; CI: true)

File hashes

Hashes for discophon-0.0.11.tar.gz
Algorithm    Hash digest
SHA256       75bb927ca75c8a4846a9fceb67633b7c2d78528af9650bae9fded9bdd35e2a0a
MD5          009c77e43402bcb95fcc64865a47556e
BLAKE2b-256  d77ed5f7d4b9b4f00d857cf6545e4fa2d1b1e27be4dc14244dbedca61890d7b3

File details

Details for the file discophon-0.0.11-py3-none-any.whl.

File metadata

  • Download URL: discophon-0.0.11-py3-none-any.whl
  • Upload date:
  • Size: 39.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.19 (subcommand: publish; distro: Ubuntu 24.04 "noble"; CI: true)

File hashes

Hashes for discophon-0.0.11-py3-none-any.whl
Algorithm    Hash digest
SHA256       5c38bd6b6b44049f8586006399e11827172bc90f7733a0259ce5931491f81588
MD5          2a1bf4edbcbaf8aeea9c3464215eb1d3
BLAKE2b-256  aaf525573368475283893b4faf0de18d3a0ec65935d82c93095281357356302d
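
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch rather than an official DiscoPhon or PyPI tool: it uses only Python's standard `hashlib`, and the helper names `sha256_of` and `verify_download` are hypothetical, introduced here for illustration. The expected values are the SHA256 digests listed for the two 0.0.11 files.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings for release 0.0.11.
EXPECTED_SHA256 = {
    "discophon-0.0.11.tar.gz":
        "75bb927ca75c8a4846a9fceb67633b7c2d78528af9650bae9fded9bdd35e2a0a",
    "discophon-0.0.11-py3-none-any.whl":
        "5c38bd6b6b44049f8586006399e11827172bc90f7733a0259ce5931491f81588",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Hypothetical helper: True if the file matches its published digest."""
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"no published digest recorded for {path.name}")
    return sha256_of(path) == expected
```

Calling `verify_download("discophon-0.0.11.tar.gz")` on a local copy returns `True` only when the file's digest matches the listed value; a file with any other name raises `KeyError`, since no digest is recorded for it.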
