The Phoneme Discovery Benchmark

DiscoPhon

Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

DiscoPhon is a multilingual benchmark for evaluating unsupervised phoneme discovery from discrete speech units. Given only 10 hours of speech in an unseen language, models must produce discrete units that map to a predefined phoneme inventory.
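
To make "discrete units that map to a predefined phoneme inventory" concrete, here is a minimal sketch of one possible many-to-one mapping: each unit is assigned to the phoneme it co-occurs with most often, and the mapped units are then scored frame by frame. This is an illustration only. It does not use the discophon package, the function names `majority_mapping` and `frame_accuracy` are hypothetical, the data is made up, and the benchmark's actual mapping and metrics may differ.

```python
from collections import Counter, defaultdict


def majority_mapping(units, phonemes):
    """Map each discrete unit to the phoneme it co-occurs with most often.

    Hypothetical helper: assumes `units` and `phonemes` are frame-aligned
    sequences of equal length.
    """
    if len(units) != len(phonemes):
        raise ValueError("units and phonemes must be frame-aligned")
    counts = defaultdict(Counter)
    for unit, phoneme in zip(units, phonemes):
        counts[unit][phoneme] += 1
    # Keep only the most frequent phoneme for each unit (many-to-one).
    return {unit: c.most_common(1)[0][0] for unit, c in counts.items()}


def frame_accuracy(units, phonemes, mapping):
    """Fraction of frames whose mapped unit equals the reference phoneme."""
    if not phonemes:
        raise ValueError("need at least one frame")
    hits = sum(mapping.get(u) == p for u, p in zip(units, phonemes))
    return hits / len(phonemes)


# Toy, made-up frame sequences (not benchmark data).
units = [3, 3, 3, 7, 7, 1, 1, 1, 3]
phonemes = ["a", "a", "a", "t", "t", "s", "s", "a", "t"]

mapping = majority_mapping(units, phonemes)
accuracy = frame_accuracy(units, phonemes, mapping)
print(mapping, accuracy)
```

In this toy case several units could map to the same phoneme, which is why the mapping is many-to-one: the number of discovered units does not have to equal the size of the phoneme inventory.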

Getting started

References

@misc{poli2026discophon,
  title={{DiscoPhon}: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units},
  author={Maxime Poli and Manel Khentout and Angelo Ortiz Tandazo and Ewan Dunbar and Emmanuel Chemla and Emmanuel Dupoux},
  year={2026},
  eprint={2603.18612},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.18612},
}
