Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
JiWER
JiWER is a simple and fast Python package for evaluating automatic speech recognition systems. It supports the following measures:
- word error rate (WER)
- match error rate (MER)
- word information lost (WIL)
- word information preserved (WIP)
- character error rate (CER)
These measures are computed from the minimum edit distance between one or more reference and hypothesis sentences. The minimum edit distance is calculated using RapidFuzz, which is implemented in C++ and is therefore faster than a pure Python implementation.
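To make the relationship between the edit distance and the measures concrete, here is a minimal pure-Python sketch. It is not JiWER's implementation (which delegates to RapidFuzz); the helper names `edit_counts` and `measures` are hypothetical, tokenization is assumed to be a plain whitespace split, and the formulas follow the definitions in Morris et al. (2004), with H hits, S substitutions, D deletions and I insertions.

```python
from typing import Dict, List


def edit_counts(reference: List[str], hypothesis: List[str]) -> Dict[str, int]:
    """Count hits, substitutions, deletions and insertions along one
    minimum-edit alignment of two token sequences (illustrative helper)."""
    n, m = len(reference), len(hypothesis)
    # dist[i][j] = minimum number of edits turning reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # hit or substitution
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
            )

    # Walk back from the bottom-right corner to classify each step.
    counts = {"hits": 0, "substitutions": 0, "deletions": 0, "insertions": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and reference[i - 1] == hypothesis[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            counts["hits"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            counts["substitutions"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1
    return counts


def measures(reference: str, hypothesis: str) -> Dict[str, float]:
    """Word-level WER, MER, WIP and WIL for a single sentence pair."""
    c = edit_counts(reference.split(), hypothesis.split())
    h, s, d, ins = c["hits"], c["substitutions"], c["deletions"], c["insertions"]
    ref_len = h + s + d    # number of words in the reference
    hyp_len = h + s + ins  # number of words in the hypothesis
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    errors = s + d + ins
    wip = (h / ref_len) * (h / hyp_len) if hyp_len else 0.0
    return {
        "wer": errors / ref_len,
        "mer": errors / (errors + h),
        "wip": wip,
        "wil": 1.0 - wip,
    }


result = measures("hello world", "hello duck")
```

The character error rate follows the same pattern: passing `list(reference)` and `list(hypothesis)` to `edit_counts` aligns characters instead of words. When several minimum-edit alignments exist, the individual counts can depend on which one the backtrace picks, although the total number of edits does not.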
Documentation
For further info, see the documentation at jitsi.github.io/jiwer.
Installation
You should be able to install this package using poetry:
$ poetry add jiwer
Or, if you prefer old-fashioned pip and you're using Python >= 3.7:
$ pip install jiwer
Usage
The simplest use case is computing the word error rate between two strings:
from jiwer import wer
reference = "hello world"
hypothesis = "hello duck"
error = wer(reference, hypothesis)
Licence
The jiwer package is released under the Apache License, Version 2.0, by 8x8.
For further information, see LICENCE.
Reference
For a comparison between WER, MER and WIL, see:
Morris, Andrew & Maier, Viktoria & Green, Phil. (2004). From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition.