🏆 Run benchmarks against the most common ASR tools on the market.
Rate That ASR (RTASR)
Early Results
DER
As a first evaluation, we ran the der metric on the voxconverse dataset using only the test split.
AssemblyAI, Deepgram, and Wordcab were evaluated on almost the whole split (224 files), while RevAI and Speechmatics were evaluated on only 29 and 7 files respectively, because of API limitations and, for Speechmatics, API issues.
Here are the results:
WER
Work in progress...
Installation
Last stable version
pip install rtasr
From source
git clone https://github.com/Wordcab/rtasr
cd rtasr
pip install .
Commands
The CLI is available through the rtasr command.
rtasr --help
List datasets, metrics and providers
# List everything
rtasr list
# List only datasets
rtasr list -t datasets
# List only metrics
rtasr list -t metrics
# List only providers
rtasr list -t providers
Datasets download
Available datasets are:
- ami: AMI Corpus
- voxconverse: VoxConverse
rtasr download -d <dataset>
ASR Transcription
Providers
Implemented ASR providers are:
- assemblyai: AssemblyAI
- aws: AWS Transcribe
- azure: Azure Speech
- deepgram: Deepgram
- google: Google Cloud Speech-to-Text
- revai: RevAI
- speechmatics: Speechmatics
- wordcab: Wordcab
Run transcription
Run ASR transcription on a given dataset with a given provider.
rtasr transcription -d <dataset> -p <provider>
Multiple providers
You can specify as many providers as you want:
rtasr transcription -d <dataset> -p <provider1> <provider2> <provider3> ...
Choose dataset split
You can specify the dataset split to use:
rtasr transcription -d <dataset> -p <provider> -s <split>
If not specified, all the available splits will be used.
Caching
By default, the transcription results are cached in the ~/.cache/rtasr/transcription directory for each provider.
If you don't want to use the cache, use the --no-cache flag.
rtasr transcription -d <dataset> -p <provider> --no-cache
Note: the cache is used to avoid running the same file twice. By removing the cache, you will run the transcription on the whole dataset again. We aren't responsible for any extra costs.
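The idea behind the cache can be sketched in a few lines of Python. This is an illustration only, not the rtasr implementation: the cache root comes from the text above, but the per-provider sub-directory layout, the .json file naming, and the files_to_transcribe helper are assumptions made for the example.

```python
from pathlib import Path

# Cache root as documented above; everything below it is an assumed layout.
CACHE_ROOT = Path.home() / ".cache" / "rtasr" / "transcription"


def files_to_transcribe(audio_files, provider, cache_root=CACHE_ROOT, use_cache=True):
    """Return the audio files that still need a transcription request."""
    if not use_cache:
        # Same spirit as --no-cache: every file is sent to the provider again.
        return list(audio_files)
    provider_dir = Path(cache_root) / provider
    pending = []
    for audio in audio_files:
        # Hypothetical naming: one JSON result per audio file stem.
        cached = provider_dir / f"{Path(audio).stem}.json"
        if not cached.exists():
            pending.append(audio)
    return pending
```

With the cache enabled, only files without a stored result are sent to the provider, which is why disabling it can lead to extra API costs.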
Debug mode
Use the --debug flag to run only one file per split for each provider.
rtasr transcription -d <dataset> -p <provider> --debug
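To script several runs, the options above can be assembled programmatically. The sketch below uses only the flags documented in this section; the build_transcription_command helper is hypothetical, and it assumes the flags can be combined freely in one invocation, which this README does not state explicitly.

```python
import shlex


def build_transcription_command(dataset, providers, split=None, no_cache=False, debug=False):
    """Assemble an `rtasr transcription` argument list from the documented flags."""
    if not providers:
        raise ValueError("at least one provider is required")
    # -p accepts one or more provider names, as shown in "Multiple providers".
    cmd = ["rtasr", "transcription", "-d", dataset, "-p", *providers]
    if split is not None:
        cmd += ["-s", split]
    if no_cache:
        cmd.append("--no-cache")
    if debug:
        cmd.append("--debug")
    return cmd


cmd = build_transcription_command("voxconverse", ["deepgram", "wordcab"], split="test", debug=True)
print(shlex.join(cmd))
# prints: rtasr transcription -d voxconverse -p deepgram wordcab -s test --debug
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the provider credentials are configured.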
Evaluation (🚧 WIP)
The evaluation command allows you to run an evaluation on the transcription results.
Run DER evaluation
Specify the dataset to use:
rtasr evaluation -m der -d <dataset> -s <split>
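For readers unfamiliar with the metric, DER (diarization error rate) is conventionally defined as the sum of false alarm, missed detection, and speaker confusion durations divided by the total duration of reference speech. The snippet below is a minimal sketch of that standard definition, not the code rtasr runs internally; the function name and the example durations are made up for illustration.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total_reference):
    """Compute DER from error durations and total reference speech (all in seconds)."""
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    # Standard definition: all error time over total reference speech time.
    return (false_alarm + missed_detection + confusion) / total_reference


# Illustrative durations only: 2 s false alarm, 3 s missed, 5 s confused, 100 s of speech.
der = diarization_error_rate(2.0, 3.0, 5.0, 100.0)
print(f"DER = {der:.1%}")
```

Because false alarms are counted against the reference duration, DER can exceed 100% on very noisy outputs.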
Contributing
Be sure to have hatch installed.
Quality
- Run quality checks:
hatch run quality:check
- Run quality formatting:
hatch run quality:format
Testing
- Run tests:
hatch run tests:run