The Phoneme Discovery Benchmark

Project description

DiscoPhon

Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

arXiv · GitHub · Website

DiscoPhon is a multilingual benchmark evaluating unsupervised phoneme discovery from discrete speech units. Given only 10 hours of speech in an unseen language, models must produce discrete units that map to a predefined phoneme inventory.
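
To make "discrete units that map to a phoneme inventory" concrete, here is a minimal sketch of a many-to-one mapping from units to phonemes. It assumes frame-aligned unit and phoneme labels and assigns each unit to the phoneme it most often co-occurs with. This is a generic illustration, not DiscoPhon's evaluation protocol or API; the function name `many_to_one_mapping` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def many_to_one_mapping(units, phonemes):
    """Map each discrete unit to the phoneme it co-occurs with most often.

    Hypothetical helper for illustration only; not part of the discophon package.
    """
    if len(units) != len(phonemes):
        raise ValueError("units and phonemes must be frame-aligned")
    counts = defaultdict(Counter)
    for unit, phoneme in zip(units, phonemes):
        counts[unit][phoneme] += 1
    # Keep the single most frequent phoneme for every unit.
    return {unit: c.most_common(1)[0][0] for unit, c in counts.items()}


# Toy frame-level labels (invented): unit IDs from a model, phonemes from a reference.
units = [3, 3, 7, 7, 7, 1, 1, 3]
phonemes = ["a", "a", "t", "t", "t", "a", "a", "t"]

mapping = many_to_one_mapping(units, phonemes)
# Fraction of frames whose unit maps to the reference phoneme.
accuracy = sum(mapping[u] == p for u, p in zip(units, phonemes)) / len(units)
print(mapping, accuracy)
```

Several units may map to the same phoneme (here units 3 and 1 both map to "a"), which is why the mapping is many-to-one rather than one-to-one.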

Getting started

References

@misc{poli2026discophon,
  title={{DiscoPhon}: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units},
  author={Maxime Poli and Manel Khentout and Angelo Ortiz Tandazo and Ewan Dunbar and Emmanuel Chemla and Emmanuel Dupoux},
  year={2026},
  eprint={2603.18612},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.18612},
}

Download files

Source Distribution

discophon-0.0.9.tar.gz (581.4 kB)

Built Distribution

discophon-0.0.9-py3-none-any.whl (33.4 kB)

File details

Details for the file discophon-0.0.9.tar.gz.

File metadata

  • Download URL: discophon-0.0.9.tar.gz
  • Size: 581.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 (uv publish, run from CI on Ubuntu 24.04)

File hashes

Hashes for discophon-0.0.9.tar.gz
Algorithm Hash digest
SHA256 db685ed3eb78cffd3d1de4f46e7b8abd8752e9b93faf88a1b030a99e8d1b463f
MD5 98a15690e387aaf5709aa71ded4576f7
BLAKE2b-256 e90ac911d7c878080c5ffd142d411278861f422ef1ca02e2967efde95dfff928

File details

Details for the file discophon-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: discophon-0.0.9-py3-none-any.whl
  • Size: 33.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 (uv publish, run from CI on Ubuntu 24.04)

File hashes

Hashes for discophon-0.0.9-py3-none-any.whl
Algorithm Hash digest
SHA256 0c2185b742ed8a63f3355d8078eb209cc3df718dee128b7642e36eb299dae14c
MD5 2be4dafaa7de5c1e2d4561758a080360
BLAKE2b-256 e278251c4849adfa7dd1d8d908749bc5ea5fa1fc7c1352842577fe4b9dbdad1d
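
The SHA256 digests listed above can be checked against a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of` and `verify` are illustrative and not part of discophon, and the script assumes the downloaded files keep their original file names.

```python
import hashlib
import sys
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "discophon-0.0.9.tar.gz":
        "db685ed3eb78cffd3d1de4f46e7b8abd8752e9b93faf88a1b030a99e8d1b463f",
    "discophon-0.0.9-py3-none-any.whl":
        "0c2185b742ed8a63f3355d8078eb209cc3df718dee128b7642e36eb299dae14c",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the one listed for its name."""
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"no known digest for {path.name}")
    return sha256_of(path) == expected


if __name__ == "__main__":
    # Usage: python verify_hashes.py discophon-0.0.9.tar.gz
    for name in sys.argv[1:]:
        print(name, "OK" if verify(name) else "MISMATCH")
```

A mismatch means the file differs from the one that was uploaded and should not be installed.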
