The Phoneme Discovery Benchmark

DiscoPhon

Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

arXiv · GitHub · Website

DiscoPhon is a multilingual benchmark evaluating unsupervised phoneme discovery from discrete speech units. Given only 10 hours of speech in an unseen language, models must produce discrete units that map to a predefined phoneme inventory.
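To make "units that map to a phoneme inventory" concrete, here is a minimal sketch of one common way to relate discrete units to phonemes: a many-to-one majority mapping over frame-aligned labels. It is an illustration only, not the DiscoPhon API and not necessarily the metric the benchmark reports. The function names (`majority_mapping`, `frame_accuracy`) and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_mapping(units, phonemes):
    """Map each discrete unit to the phoneme it co-occurs with most often.

    `units` and `phonemes` are frame-aligned sequences of equal length.
    """
    if len(units) != len(phonemes):
        raise ValueError("units and phonemes must be frame-aligned")
    counts = defaultdict(Counter)
    for unit, phoneme in zip(units, phonemes):
        counts[unit][phoneme] += 1
    return {unit: c.most_common(1)[0][0] for unit, c in counts.items()}


def frame_accuracy(units, phonemes, mapping):
    """Fraction of frames whose mapped unit equals the reference phoneme."""
    hits = sum(mapping.get(u) == p for u, p in zip(units, phonemes))
    return hits / len(phonemes)


# Toy, made-up data: 8 frames, 3 unit IDs, 2 phonemes.
units = [0, 0, 1, 1, 1, 2, 2, 0]
phonemes = ["a", "a", "t", "t", "a", "t", "t", "a"]
mapping = majority_mapping(units, phonemes)
accuracy = frame_accuracy(units, phonemes, mapping)
```

In this toy case several units may share one phoneme (units 1 and 2 both map to "t"), which is why the mapping is many-to-one. A real evaluation would fit the mapping and score it on separate data.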

Getting started

References

@misc{poli2026discophon,
  title={{DiscoPhon}: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units},
  author={Maxime Poli and Manel Khentout and Angelo Ortiz Tandazo and Ewan Dunbar and Emmanuel Chemla and Emmanuel Dupoux},
  year={2026},
  eprint={2603.18612},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.18612},
}

Download files

Source Distribution

discophon-0.0.10.tar.gz (581.4 kB)

Built Distribution

discophon-0.0.10-py3-none-any.whl (33.4 kB)

File details

Details for the file discophon-0.0.10.tar.gz.

File metadata

  • Download URL: discophon-0.0.10.tar.gz
  • Upload date:
  • Size: 581.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 (uv publish, run in CI on Ubuntu 24.04 "noble")

File hashes

Hashes for discophon-0.0.10.tar.gz
Algorithm    Hash digest
SHA256       f3dd2465286c3d2bf9f5c846d9a775114cdfd87c459bcd4a57cc842a577ed9f5
MD5          e6a01eaa2713ff38eb1fa2abe59055e8
BLAKE2b-256  102b0f2ea666d5928e5bd00edd33913a08a0ff33bb6d086a7eb939de1f55269a
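
The digests above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The local file path is an assumption, and the helper names (`sha256_of`, `matches_expected`) are hypothetical, not part of discophon.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for discophon-0.0.10.tar.gz.
EXPECTED_SHA256 = "f3dd2465286c3d2bf9f5c846d9a775114cdfd87c459bcd4a57cc842a577ed9f5"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("discophon-0.0.10.tar.gz")
    if archive.exists():
        print("OK" if matches_expected(archive) else "MISMATCH")
```

Reading in chunks keeps memory use flat regardless of file size. The same check works for the wheel by swapping in its SHA256 digest.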

File details

Details for the file discophon-0.0.10-py3-none-any.whl.

File metadata

  • Download URL: discophon-0.0.10-py3-none-any.whl
  • Upload date:
  • Size: 33.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 (uv publish, run in CI on Ubuntu 24.04 "noble")

File hashes

Hashes for discophon-0.0.10-py3-none-any.whl
Algorithm    Hash digest
SHA256       985f3280873c096d0d97486ea2b5fe6ed64f02b4861e9accfc8cd253fb2849dd
MD5          5b4a96d33a9363d3856538a8aea587e1
BLAKE2b-256  06a8050b6be4cb596642b342d94e0a1700b6c34e6aa979e9fb105f065402330b
