The Phoneme Discovery Benchmark

DiscoPhon

Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

arXiv · GitHub · Website

DiscoPhon is a multilingual benchmark evaluating unsupervised phoneme discovery from discrete speech units. Given only 10 hours of speech in an unseen language, models must produce discrete units that map to a predefined phoneme inventory.
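To make the idea of units that "map to a phoneme inventory" concrete, here is a minimal sketch of one common way to relate discrete units to phonemes: a many-to-one mapping that assigns each unit to the phoneme it most often co-occurs with. This is an illustration only. It is not the DiscoPhon API and not necessarily the metric the benchmark uses. The function names, the toy data, and the assumption of frame-aligned gold phoneme labels are all hypothetical.

```python
from collections import Counter, defaultdict


def many_to_one_mapping(units, phonemes):
    """Assign each discrete unit to the phoneme it co-occurs with most often.

    Hypothetical helper: assumes `units` and `phonemes` are frame-aligned
    sequences of equal length (one gold phoneme label per unit frame).
    """
    if len(units) != len(phonemes):
        raise ValueError("units and phonemes must be frame-aligned")
    counts = defaultdict(Counter)
    for unit, phoneme in zip(units, phonemes):
        counts[unit][phoneme] += 1
    # For every unit, keep the single most frequent phoneme.
    return {unit: c.most_common(1)[0][0] for unit, c in counts.items()}


def frame_accuracy(units, phonemes, mapping):
    """Fraction of frames whose mapped unit equals the gold phoneme."""
    if not units:
        raise ValueError("empty input")
    hits = sum(mapping[u] == p for u, p in zip(units, phonemes))
    return hits / len(units)


# Toy data (made up for illustration): 8 frames, 3 units, 2 phonemes.
units = [3, 3, 7, 7, 7, 1, 1, 3]
phonemes = ["a", "a", "t", "t", "t", "a", "a", "t"]

mapping = many_to_one_mapping(units, phonemes)
accuracy = frame_accuracy(units, phonemes, mapping)
```

In this toy case units 3 and 1 both map to "a" and unit 7 maps to "t". A many-to-one mapping allows several units per phoneme, which is why the number of discovered units can exceed the size of the phoneme inventory.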

Getting started

References

@misc{poli2026discophon,
  title={{DiscoPhon}: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units},
  author={Maxime Poli and Manel Khentout and Angelo Ortiz Tandazo and Ewan Dunbar and Emmanuel Chemla and Emmanuel Dupoux},
  year={2026},
  eprint={2603.18612},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.18612},
}

Download files

Source Distribution

discophon-0.0.7.tar.gz (343.9 kB)

Uploaded Source

Built Distribution

discophon-0.0.7-py3-none-any.whl (29.1 kB)

Uploaded Python 3

File details

Details for the file discophon-0.0.7.tar.gz.

File metadata

  • Download URL: discophon-0.0.7.tar.gz
  • Upload date:
  • Size: 343.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.10.12 (uv publish, run in CI on Ubuntu 24.04)

File hashes

Hashes for discophon-0.0.7.tar.gz
Algorithm    Hash digest
SHA256       ca46ebd534fefce595ff16ea613e5dc076eee8daab1f9e495bb38b3b22106dcd
MD5          a9b55db8eeb3931d14b28e6d90cc814e
BLAKE2b-256  c82447aa8981ede39e2cde7401411c22ce76d1107905e41c7073ee5a1621e865
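The digests listed above can be used to check a downloaded archive before installing it. The following is a minimal sketch using only the Python standard library. The helper names and the assumption that the archive sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for discophon-0.0.7.tar.gz.
EXPECTED_SHA256 = "ca46ebd534fefce595ff16ea613e5dc076eee8daab1f9e495bb38b3b22106dcd"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest matches the expected value."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive was downloaded to the working directory.
    archive = Path("discophon-0.0.7.tar.gz")
    if archive.exists():
        print("OK" if verify(archive) else "MISMATCH")
```

The same check works for the wheel by swapping in its file name and its published SHA256 digest.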

File details

Details for the file discophon-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: discophon-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 29.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.10.12 (uv publish, run in CI on Ubuntu 24.04)

File hashes

Hashes for discophon-0.0.7-py3-none-any.whl
Algorithm    Hash digest
SHA256       38f6d3cb1249d3ace33d127340df7df2c3934b859f3bab0212fcd382e04af710
MD5          b897a371b3bcf816c31bf43e7b894271
BLAKE2b-256  440ccef738d8c50026aa8f1c36dba3461caa2cf761ef011615e7bc1458d56588
