Evaluating ASR (automatic speech recognition) hypotheses, i.e. computing word error rate.

asr_evaluation

Python module for evaluating ASR hypotheses (i.e. word error rate and word recognition rate).

This module depends on the editdistance project, for computing edit distances between arbitrary sequences.

The output format is loosely modeled on that of the align.c program commonly used within the Sphinx ASR community. The program may run a bit faster if neither instances nor confusions are printed.

Please let me know if you have any comments, questions, or problems.

Output

The program outputs three standard measurements:

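Word error rate (WER) (the number of word errors in the alignment, i.e. substitutions, deletions and insertions, divided by the number of words in the reference).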
Word recognition rate (the number of matched words in the alignment divided by the number of words in the reference).
Sentence error rate (SER) (the number of incorrect sentences divided by the total number of sentences).
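
As a rough illustration of how these three numbers relate, here is a minimal pure-Python sketch. It is not this module's implementation (which relies on the editdistance project); the function names `align_counts` and `evaluate` are hypothetical, and it assumes whitespace tokenization, the standard WER definition (edit distance over reference length), and that ties between equally cheap alignments are broken in favor of more matched words.

```python
def align_counts(ref, hyp):
    """Return (errors, matches) for a minimum-edit-distance alignment.

    Each DP cell holds (errors, -matches), so min() prefers fewer errors
    first and, among those, more matched words (an assumption of this sketch).
    """
    prev = [(j, 0) for j in range(len(hyp) + 1)]  # empty ref: j insertions
    for i, r in enumerate(ref, start=1):
        curr = [(i, 0)]  # empty hyp: i deletions
        for j, h in enumerate(hyp, start=1):
            diag_errors, diag_neg_matches = prev[j - 1]
            if r == h:
                diag = (diag_errors, diag_neg_matches - 1)   # match
            else:
                diag = (diag_errors + 1, diag_neg_matches)   # substitution
            up = (prev[j][0] + 1, prev[j][1])                # deletion
            left = (curr[j - 1][0] + 1, curr[j - 1][1])      # insertion
            curr.append(min(diag, up, left))
        prev = curr
    errors, neg_matches = prev[-1]
    return errors, -neg_matches


def evaluate(refs, hyps):
    """Compute WER, word recognition rate and SER over parallel sentence lists.

    Assumes the references contain at least one word in total.
    """
    total_errors = total_matches = total_ref_words = bad_sentences = 0
    for ref_line, hyp_line in zip(refs, hyps):
        ref, hyp = ref_line.split(), hyp_line.split()
        errors, matches = align_counts(ref, hyp)
        total_errors += errors
        total_matches += matches
        total_ref_words += len(ref)
        bad_sentences += errors > 0  # a sentence with any error counts as incorrect
    return {
        "wer": total_errors / total_ref_words,
        "wrr": total_matches / total_ref_words,
        "ser": bad_sentences / len(refs),
    }


scores = evaluate(
    ["the cat sat on the mat", "hello world"],
    ["the cat sat on mat", "hello world"],
)
print(scores)  # {'wer': 0.125, 'wrr': 0.875, 'ser': 0.5}
```

In this toy example one of the eight reference words is deleted, so WER is 1/8, seven words match, and one of the two sentences contains an error. Because insertions count as errors but add no reference words, WER and word recognition rate do not necessarily sum to one.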

Installing & uninstalling

The easiest way to install is using pip:

pip install asr-evaluation

Alternatively you can clone this git repo and install using distutils:

git clone git@github.com:belambert/asr-evaluation.git
cd asr-evaluation
python setup.py install

To uninstall with pip:

pip uninstall asr-evaluation

Command line usage

For command line usage, see:

    wer --help

It should display something like this:

usage: wer [-h] [-i | -r] [--head-ids] [-id] [-c] [-p] [-m count] [-a] [-e]
           ref hyp

Evaluate an ASR transcript against a reference transcript.

positional arguments:
  ref                   Reference transcript filename
  hyp                   ASR hypothesis filename

optional arguments:
  -h, --help            show this help message and exit
  -i, --print-instances
                        Print all individual sentences and their errors.
  -r, --print-errors    Print all individual sentences that contain errors.
  --head-ids            Hypothesis and reference files have ids in the first
                        token? (Kaldi format)
  -id, --tail-ids, --has-ids
                        Hypothesis and reference files have ids in the last
                        token? (Sphinx format)
  -c, --confusions      Print tables of which words were confused.
  -p, --print-wer-vs-length
                        Print table of average WER grouped by reference
                        sentence length.
  -m count, --min-word-count count
                        Minimum word count to show a word in confusions.
  -a, --case-insensitive
                        Down-case the text before running the evaluation.
  -e, --remove-empty-refs
                        Skip over any examples where the reference is empty.
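
To make the id and normalization options more concrete, the following sketch shows one way the described preprocessing could look. It is an illustration based only on the help text above, not the tool's actual code: the function names `split_id` and `prepare` and their keyword arguments are hypothetical stand-ins for `--head-ids`, `--tail-ids`, `-a` and `-e`, and the sample lines are made up.

```python
def split_id(line, head_ids=False, tail_ids=False):
    """Return (utterance_id, tokens) for one transcript line.

    head_ids: the id is the first token (Kaldi style).
    tail_ids: the id is the last token (Sphinx style).
    """
    tokens = line.split()
    utt_id = None
    if head_ids and tokens:
        utt_id, tokens = tokens[0], tokens[1:]
    elif tail_ids and tokens:
        utt_id, tokens = tokens[-1], tokens[:-1]
    return utt_id, tokens


def prepare(ref_lines, hyp_lines, head_ids=False, tail_ids=False,
            case_insensitive=False, remove_empty_refs=False):
    """Pair up reference and hypothesis token lists, line by line."""
    pairs = []
    for ref_line, hyp_line in zip(ref_lines, hyp_lines):
        if case_insensitive:
            # Down-case both sides before comparing.
            ref_line, hyp_line = ref_line.lower(), hyp_line.lower()
        _, ref = split_id(ref_line, head_ids, tail_ids)
        _, hyp = split_id(hyp_line, head_ids, tail_ids)
        if remove_empty_refs and not ref:
            continue  # skip examples whose reference has no words
        pairs.append((ref, hyp))
    return pairs


# Kaldi-style lines: the id comes first.
pairs = prepare(
    ["utt1 Hello World", "utt2"],
    ["utt1 hello word", "utt2 noise"],
    head_ids=True, case_insensitive=True, remove_empty_refs=True,
)
print(pairs)  # [(['hello', 'world'], ['hello', 'word'])]

# Sphinx-style line: the id comes last.
print(split_id("hello world (utt1)", tail_ids=True))  # ('(utt1)', ['hello', 'world'])
```

This sketch pairs references and hypotheses purely by line order; whether the real tool also uses the ids for matching is not stated in the help text.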

Contributing and code of conduct

For contributions, it's best to use GitHub issues and pull requests. Proper testing and documentation are suggested.

Conduct is expected to be reasonable, especially as specified by the Contributor Covenant.

Release history

2.0.4 (this version): Oct 31, 2018
2.0.3: Oct 22, 2018
2.0.2: Oct 8, 2017
2.0.1: Jul 30, 2017
2.0.0: Apr 18, 2017
1.2.2: Apr 18, 2017
1.2.1: Apr 17, 2017
1.2.0: Apr 17, 2017
1.1.0: Mar 4, 2017
1.0.1: Feb 21, 2017
1.0.0: Jan 29, 2017
0.2.5: Jan 26, 2017
0.2.4: Jan 8, 2017
0.2.3: Dec 31, 2016
0.2.2: Dec 31, 2016
0.2.1: Dec 31, 2016
0.2.0: Dec 31, 2016
0.1.1: Dec 30, 2016
0.1.0: Jul 10, 2016

Download files

Source distribution: asr_evaluation-2.0.4.tar.gz (8.2 kB), uploaded Oct 31, 2018
Built distribution: asr_evaluation-2.0.4-py3-none-any.whl (9.1 kB), uploaded Oct 31, 2018, Python 3

File details

Details for the file asr_evaluation-2.0.4.tar.gz.

File metadata

Download URL: asr_evaluation-2.0.4.tar.gz
Upload date: Oct 31, 2018
Size: 8.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.0

File hashes

Hashes for asr_evaluation-2.0.4.tar.gz
Algorithm	Hash digest
SHA256	`9b7ae0d1a267d25a13acf25e8de13535aedddb292b7aeaa8ed2c22bf51f27aed`
MD5	`0446376c35347e705fba8592655a43cf`
BLAKE2b-256	`ebbf4e8a1e34edc3ace9cc090466963c6baadc734d39cbfb5e84ead1788474b8`

File details

Details for the file asr_evaluation-2.0.4-py3-none-any.whl.

File metadata

Download URL: asr_evaluation-2.0.4-py3-none-any.whl
Upload date: Oct 31, 2018
Size: 9.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.0

File hashes

Hashes for asr_evaluation-2.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`612a07ee81290bef7f910ffc4c70547f9fd071dfa10ec0dd75102b4ad6b2174f`
MD5	`e691680412af9877a5a0021662b39ae0`
BLAKE2b-256	`36233ab0b79dc4cec58412583bb9477ff37b655ec1f50390dc27400db95a7a14`
