A package containing generic error-rate functions implemented in Rust

Universal Edit Distance

Universal Edit Distance (UED), sometimes called Universal Error Rate because I struggle to be consistent, is a project aimed at creating a simple Python evaluation library implemented in Rust.

The universal part of the name comes from the fact that the Rust implementation is generic and works on any data type that implements PartialEq, as opposed to most implementations, which are limited to strings.
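To illustrate what "generic" means here, the following is a minimal pure-Python sketch of Levenshtein distance that relies only on equality between items. It is not the library's Rust implementation, and the function name edit_distance is a hypothetical one chosen for this example.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def edit_distance(reference: Sequence[T], hypothesis: Sequence[T]) -> int:
    """Levenshtein distance between two sequences whose items support ==."""
    # previous[j] is the distance between the reference prefix processed
    # so far and hypothesis[:j]. With an empty prefix that is simply j.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        # Turning reference[:i] into an empty hypothesis takes i deletions.
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(
                min(
                    previous[j - 1] + cost,  # match or substitution
                    previous[j] + 1,         # deletion from the reference
                    current[j - 1] + 1,      # insertion into the reference
                )
            )
        previous = current
    return previous[-1]


# The same function handles characters, words, and integers.
print(edit_distance("kitten", "sitting"))
print(edit_distance(["the", "cat", "sat"], ["the", "cat", "sits", "down"]))
print(edit_distance([1, 2, 3, 4], [1, 3, 4]))
```

Because only == is used, any Python objects that define equality can be compared this way, which mirrors the PartialEq requirement on the Rust side.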

✨ Features

  • Much quicker than HuggingFace's evaluate library (see benchmarks below).
  • Word-error-rate and character-error-rate functions that are compatible and comparable with evaluate's wer and cer metrics.
  • Functions that return the WER or CER for every test as an array within a fraction of a second.
  • Functions that return the edit distance for every test as an array of integers within a fraction of a second.
  • Generic implementations of the mean-error-rate and error-rate metrics that can work with any* Python type.
  • Includes type hints to make development easier.

* I am pretty sure it works with any type, but it is still being tested.

⚡️ Quick start

You can now install the library from PyPI! All you need to do is install the package universal-edit-distance using your favourite package manager.

Note: Pre-compiled wheels exist for macOS and Linux for Python versions 3.9 to 3.13. Builds are not currently working for Windows, so on Windows the package has to be installed from source, which requires cargo to be installed in your environment.

Using pip

pip install universal-edit-distance

Using uv

uv add universal-edit-distance

You should now be able to import the module universal_edit_distance in your Python project.
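A quick way to confirm the installation is to ask Python whether it can locate the module. This sketch uses only the standard library and the module name given above; it does not call any function from the package itself.

```python
import importlib.util

# find_spec returns None when the module cannot be located
# in the active environment.
spec = importlib.util.find_spec("universal_edit_distance")
installed = spec is not None

if installed:
    print("universal_edit_distance is importable")
else:
    print("universal_edit_distance was not found; check the active environment")
```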

🎯 Motivation and why this project exists

I love statistics, and when I evaluate my speech-recognition models (and other models) I like to run t-tests etc. However, doing that with HuggingFace's evaluate library, while possible, is horrendously slow.

If you only require the mean CER or WER you could continue using evaluate and your life would be fine. If you want to be more rigorous in your testing and evaluation, you should consider using this library.
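As a sketch of why per-test error rates matter, a paired t-test between two models needs one error rate per utterance from each model. The values below are made-up placeholders, and the calculation uses only the standard library; in practice the two lists would come from a per-row error-rate function.

```python
import math
import statistics

# Hypothetical per-utterance word error rates for two models
# evaluated on the same five utterances.
model_a = [0.10, 0.25, 0.00, 0.40, 0.15]
model_b = [0.20, 0.30, 0.05, 0.50, 0.15]

# A paired t-test works on the per-utterance differences.
differences = [a - b for a, b in zip(model_a, model_b)]
mean_difference = statistics.mean(differences)

# Standard error of the mean difference (sample standard deviation).
standard_error = statistics.stdev(differences) / math.sqrt(len(differences))

t_statistic = mean_difference / standard_error
degrees_of_freedom = len(differences) - 1

print(f"t = {t_statistic:.3f} with {degrees_of_freedom} degrees of freedom")
```

A single corpus-level WER per model cannot feed this calculation, which is why the per-row functions are the ones that need to be fast.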

In addition, one thing that annoys me about a lot of Levenshtein implementations is that they only accept strings, even though the algorithm can literally work on any data type that supports equality comparison. I have tried to make the implementation found here as generic as possible.

Benchmarks

You can find the benchmarking script here: prebens-phd-adventures/ued-benchmarks

Note that the single floating-point result normally returned from evaluate is called the mean-error-rate in this library and in these results, since it is effectively the mean across all tests rather than the result of a single test. The functions that return a floating-point result for each row in the test set are simply called error-rate.
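The distinction can be sketched with plain arithmetic. The numbers below are hypothetical, and the text here does not state whether the summary figure pools edits over all rows or averages the per-row rates, so both are shown; the variable names are chosen for illustration only.

```python
# Hypothetical per-row results: edit distance and reference length per test.
edit_distances = [1, 0, 2, 3]
reference_lengths = [4, 5, 8, 3]

# error-rate: one floating-point value per row.
error_rates = [d / n for d, n in zip(edit_distances, reference_lengths)]

# One candidate for the single summary number: pool all edits over
# the total reference length (an assumption, not confirmed here).
pooled_rate = sum(edit_distances) / sum(reference_lengths)

# Another candidate: the plain average of the per-row rates.
average_rate = sum(error_rates) / len(error_rates)

print(error_rates)
print(pooled_rate, average_rate)
```

The two summaries differ whenever the rows have different reference lengths, so it is worth checking which one a given tool reports before comparing numbers across libraries.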

The tests in the table below were run using evaluate=0.4.3, jiwer=3.1.0, and universal-edit-distance=0.2.0 on a Polars DataFrame containing $n=12775$ entries. For the mean-error-rate results the tests were run 100 times each, and for the error-rate results they were only run once because evaluate was too slow.

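| Metric   | evaluate | jiwer | ued   | Speed-up vs evaluate | Speed-up vs jiwer |
|----------|----------|-------|-------|----------------------|-------------------|
| Mean WER | 0.31s    | 0.16s | 0.02s | 15.28x               | 7.75x             |
| Mean CER | 0.45s    | 0.24s | 0.09s | 5.01x                | 2.60x             |
| WER      | 24.77s   | 0.27s | 0.02s | 1137.30x             | 12.61x            |
| CER      | 25.34s   | 0.37s | 0.09s | 278.97x              | 4.03x             |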

As can be seen in the table, ued beats evaluate and jiwer in every metric measured. The goal of the project was to make WER and CER faster, but I'll take the win for the other two. You will also notice that the timings for the mean-error-rates and error-rates are the same for ued. That is expected and is due to the way it is implemented.

👩‍💻👨‍💻 Contribute to the project

This is my first ever Rust project, so while I have a vague idea about what I am doing, I am sure it can be improved. If you have any suggestions or requests, please feel free to open an issue!
