A small Python implementation of common ASR corrections
CC - CommonCorrections
A simple repo for correcting common ASR (automatic speech recognition) outputs. The aim is not to fix mistakes but to reconcile different ways of transcribing the same thing, favouring how something sounds when spoken over its shortened written form. The primary use case is to align the ground truth with the ASR output just before the WER (word error rate) is calculated.
Static Examples
there's -> there is
google.com -> google dot com
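A static correction can be pictured as a plain lookup from one written form to its spoken form. The sketch below is only an illustration of that idea, not the package's own code: `STATIC_CORRECTIONS` and `apply_static` are hypothetical names, and the table holds just the two examples listed above.

```python
from typing import Mapping

# Hypothetical lookup table; the entries are the two static examples above.
STATIC_CORRECTIONS = {
    "there's": "there is",
    "google.com": "google dot com",
}


def apply_static(text: str, table: Mapping[str, str] = STATIC_CORRECTIONS) -> str:
    """Replace every whitespace-delimited token that has an entry in the table."""
    # Lookup is case-insensitive; tokens without an entry are kept unchanged.
    return " ".join(table.get(token.lower(), token) for token in text.split())


print(apply_static("there's a link to google.com"))
# prints: there is a link to google dot com
```

Splitting on whitespace keeps the sketch short; a fuller version would also need to deal with punctuation attached to a token.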
Dynamic Examples
1 2 3 -> one two three
53.4 -> fifty three point four
23:59 -> twenty three fifty nine
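Dynamic corrections cannot be listed in a table because the spoken form has to be computed from the token. The following is a minimal sketch of how the three examples above could be produced, assuming integers in the range 0 to 99 only. The names `spell_int`, `spell_token` and `apply_dynamic` are hypothetical and are not taken from the package.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def spell_int(n: int) -> str:
    """Spell an integer in the range 0..99 as space-separated words."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0..99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def spell_token(token: str) -> str:
    """Spell out a time, decimal or small integer; return other tokens unchanged."""
    if re.fullmatch(r"\d{1,2}:\d{2}", token):      # e.g. 23:59
        hours, minutes = token.split(":")
        return f"{spell_int(int(hours))} {spell_int(int(minutes))}"
    if re.fullmatch(r"\d{1,2}\.\d+", token):       # e.g. 53.4
        whole, frac = token.split(".")
        digits = " ".join(ONES[int(d)] for d in frac)  # fraction read digit by digit
        return f"{spell_int(int(whole))} point {digits}"
    if re.fullmatch(r"\d{1,2}", token):            # e.g. 3
        return spell_int(int(token))
    return token


def apply_dynamic(text: str) -> str:
    """Apply spell_token to every whitespace-delimited token."""
    return " ".join(spell_token(token) for token in text.split())


print(apply_dynamic("1 2 3"))   # prints: one two three
print(apply_dynamic("53.4"))    # prints: fifty three point four
print(apply_dynamic("23:59"))   # prints: twenty three fifty nine
```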
Features
- Designed to be easy to use and fast (ish) with pandas DataFrames
- Lots of built-in corrections for free
- Ability to easily extend with private corrections
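One way to picture the "private corrections" feature is as an extra lookup table merged on top of the built-in one. The sketch below shows that idea under stated assumptions: `BUILT_IN`, `PRIVATE` and `make_corrector` are hypothetical names, the `acme.io` entry is invented for illustration, and none of this is the package's actual extension mechanism.

```python
from typing import Callable, Mapping

BUILT_IN = {"there's": "there is"}      # stands in for the bundled corrections
PRIVATE = {"acme.io": "acme dot io"}    # hypothetical project-specific entry


def make_corrector(*tables: Mapping[str, str]) -> Callable[[str], str]:
    """Merge tables left to right (later tables win) and return a corrector."""
    merged: dict = {}
    for table in tables:
        merged.update(table)

    def correct(text: str) -> str:
        # Token-level, case-insensitive lookup; unknown tokens pass through.
        return " ".join(merged.get(tok.lower(), tok) for tok in text.split())

    return correct


correct = make_corrector(BUILT_IN, PRIVATE)
rows = ["there's a demo on acme.io", "nothing to change"]
print([correct(row) for row in rows])
# prints: ['there is a demo on acme dot io', 'nothing to change']
```

Because `correct` is an ordinary function of one string, it could be applied to a pandas column with `Series.map`, for example `df["hypothesis"].map(correct)`, where the column name is an assumption.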
Getting Started
- Install with:
pip install commoncorrections
- Import with:
import commoncorrections
- Use with:
>>> wer("the cat sat on the mat", "the mat sat on the cat")
0.3333333333333333
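To make the link between corrections and WER concrete, here is a minimal sketch of a word-level WER based on Levenshtein distance. It is an assumption about what a function like `wer` computes, not the implementation used in the snippet above, and it does not call the package.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1] / len(ref)


print(wer("the cat sat on the mat", "the mat sat on the cat"))  # prints: 0.3333333333333333
# Same words, different written form: counted as errors until both sides agree.
print(wer("there is a cat", "there's a cat"))    # prints: 0.5
print(wer("there is a cat", "there is a cat"))   # prints: 0.0
```

The last two calls show why the alignment step matters: expanding "there's" to "there is" before scoring removes an error that reflects transcription style and not recognition quality.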
mypy Type Checks
I installed mypy and ran it to check that the type annotations are consistent:
(py) rob@rob-T480s:~/projects/CommonCorrections/commoncorrections (master)$ mypy commoncorrections.py
Success: no issues found in 1 source file
Change Log
- v1.0.0 - First release