Rate That ASR (RTASR)
🏆 Run benchmarks against the most common ASR tools on the market.
Results
[!IMPORTANT]
Deepgram benchmark results have been updated with the latest Nova 2 model.
WER & WRR
wer = Word Error Rate, mer = Match Error Rate, wil = Word Information Lost, wrr = Word Recognition Rate
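All four word-level metrics can be derived from the same alignment counts. The following is a minimal pure-Python sketch for illustration, not RTASR's implementation: it aligns the reference and hypothesis with a Levenshtein alignment and applies the usual definitions. The function names `align_counts` and `word_metrics` are hypothetical, and `wrr = 1 - wer` is an assumed definition (some tools report hits divided by reference length instead).

```python
def align_counts(reference: str, hypothesis: str) -> tuple[int, int, int, int]:
    """Return (hits, substitutions, deletions, insertions) of a minimum-edit alignment."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = (cost, hits, subs, dels, ins) for ref[:i] aligned with hyp[:j]
    dp = [[(0, 0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, 0, i, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, 0, j)  # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            c, h, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (c, h + 1, s, d, n)      # hit
            else:
                diagonal = (c + 1, h, s + 1, d, n)  # substitution
            c, h, s, d, n = dp[i - 1][j]
            deletion = (c + 1, h, s, d + 1, n)
            c, h, s, d, n = dp[i][j - 1]
            insertion = (c + 1, h, s, d, n + 1)
            dp[i][j] = min(diagonal, deletion, insertion, key=lambda t: t[0])
    return dp[-1][-1][1:]


def word_metrics(reference: str, hypothesis: str) -> dict[str, float]:
    """Compute wer, mer, wil and wrr (assumed here to be 1 - wer)."""
    hits, subs, dels, ins = align_counts(reference, hypothesis)
    n_ref = hits + subs + dels  # number of reference words
    n_hyp = hits + subs + ins   # number of hypothesis words
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    errors = subs + dels + ins
    wer = errors / n_ref
    mer = errors / (hits + errors)
    # Word Information Preserved; wil is its complement
    wip = (hits / n_ref) * (hits / n_hyp) if n_hyp else 0.0
    return {"wer": wer, "mer": mer, "wil": 1 - wip, "wrr": 1 - wer}


if __name__ == "__main__":
    print(word_metrics("the cat sat on the mat", "the cat sit on mat"))
```

In this example there is one substitution and one deletion against six reference words, so wer and mer are both 2/6. Real benchmarks usually normalize text (casing, punctuation) before alignment, which this sketch leaves out.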
- Dataset: Fleurs
DER
der = Diarization Error Rate, miss = missed detection, confusion = incorrect detection, fa = false alarm
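The three components combine into a single rate. The sketch below uses the standard definition of DER, with durations in seconds; it is an illustration only. The function name `diarization_error_rate` and the example numbers are hypothetical, and details such as collars or overlap handling in RTASR's own evaluation are not covered here.

```python
def diarization_error_rate(
    miss: float, confusion: float, false_alarm: float, total: float
) -> float:
    """DER = (miss + confusion + false alarm) / total reference speech duration."""
    if total <= 0:
        raise ValueError("total reference speech duration must be positive")
    return (miss + confusion + false_alarm) / total


if __name__ == "__main__":
    # Illustrative numbers only: 600 s of reference speech,
    # 30 s missed, 18 s attributed to the wrong speaker, 12 s of false alarm.
    der = diarization_error_rate(miss=30.0, confusion=18.0, false_alarm=12.0, total=600.0)
    print(f"der = {der:.3f}")
```

Because false alarms are counted in the numerator but not in the denominator, DER can exceed 1 for a system that produces a lot of spurious speech.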
- Dataset: VoxConverse
[!NOTE]
Click on the images to get a bigger display.
- Dataset: AMI Corpus
Installation
Last stable version
pip install rtasr
From source
git clone https://github.com/Wordcab/rtasr
cd rtasr
pip install .
Commands
The CLI is available through the rtasr command.
rtasr --help
List datasets, metrics and providers
# List everything
rtasr list
# List only datasets
rtasr list -t datasets
# List only metrics
rtasr list -t metrics
# List only providers
rtasr list -t providers
Datasets download
Available datasets are:
- ami: AMI Corpus
- voxconverse: VoxConverse
rtasr download -d <dataset>
ASR Transcription
Providers
Implemented ASR providers are:
- assemblyai: AssemblyAI
- aws: AWS Transcribe
- azure: Azure Speech
- deepgram: Deepgram
- google: Google Cloud Speech-to-Text
- revai: RevAI
- speechmatics: Speechmatics
- wordcab: Wordcab
Run transcription
Run ASR transcription on a given dataset with a given provider.
rtasr transcription -d <dataset> -p <provider>
Multiple providers
You can specify as many providers as you want:
rtasr transcription -d <dataset> -p <provider1> <provider2> <provider3> ...
Choose dataset split
You can specify the dataset split to use:
rtasr transcription -d <dataset> -p <provider> -s <split>
If not specified, all the available splits will be used.
Caching
By default, the transcription results are cached in the ~/.cache/rtasr/transcription directory for each provider.
If you don't want to use the cache, use the --no-cache flag.
rtasr transcription -d <dataset> -p <provider> --no-cache
Note: the cache prevents the same file from being transcribed twice. With --no-cache, the transcription runs again on the whole dataset, which may lead to additional provider charges. We aren't responsible for any extra costs.
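Before re-running a provider, it can help to see what is already cached. The sketch below counts files under the cache directory mentioned above. It assumes one subdirectory per provider, which is a guess about the on-disk layout rather than something documented here, and the function name `cached_files_per_provider` is hypothetical.

```python
from pathlib import Path


def cached_files_per_provider(cache_dir: Path) -> dict[str, int]:
    """Count files under each direct subdirectory of cache_dir.

    Assumption: each subdirectory corresponds to one provider.
    Returns an empty dict if the cache directory does not exist.
    """
    if not cache_dir.is_dir():
        return {}
    return {
        sub.name: sum(1 for f in sub.rglob("*") if f.is_file())
        for sub in sorted(cache_dir.iterdir())
        if sub.is_dir()
    }


if __name__ == "__main__":
    default_cache = Path.home() / ".cache" / "rtasr" / "transcription"
    for provider, count in cached_files_per_provider(default_cache).items():
        print(f"{provider}: {count} cached file(s)")
```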
Debug mode
Use the --debug flag to run only one file per split for each provider.
rtasr transcription -d <dataset> -p <provider> --debug
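When scripting benchmark runs, the transcription options above can be assembled programmatically. The sketch below only builds the argument list from the flags documented in this section (-d, -p, -s, --no-cache, --debug); the helper name `transcription_command` is hypothetical, and it assumes the flags can be freely combined.

```python
from typing import Optional, Sequence


def transcription_command(
    dataset: str,
    providers: Sequence[str],
    split: Optional[str] = None,
    use_cache: bool = True,
    debug: bool = False,
) -> list[str]:
    """Build the argv list for `rtasr transcription` from the documented flags."""
    if not providers:
        raise ValueError("at least one provider is required")
    cmd = ["rtasr", "transcription", "-d", dataset, "-p", *providers]
    if split is not None:
        cmd += ["-s", split]       # omit to use all available splits
    if not use_cache:
        cmd.append("--no-cache")   # re-runs every file; may incur provider costs
    if debug:
        cmd.append("--debug")      # one file per split for each provider
    return cmd


if __name__ == "__main__":
    cmd = transcription_command("voxconverse", ["deepgram", "wordcab"], debug=True)
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the provider credentials are configured. Starting with `debug=True` keeps a first run cheap.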
Evaluation
The evaluation command allows you to run an evaluation on the transcription results.
If you don't specify the split, the evaluation will be run on the whole dataset.
Run DER evaluation
rtasr evaluation -m der -d <dataset> -s <split>
Run WER evaluation
rtasr evaluation -m wer -d <dataset> -s <split>
Plot results
To get the plots of the evaluation results, use the plot command.
If you don't specify the split, the plots will be generated for all the available splits.
Plot DER results
rtasr plot -m der -d <dataset> -s <split>
Plot WER results
rtasr plot -m wer -d <dataset> -s <split>
Dataset length
To get the total length of a dataset, use the audio-length command.
This command returns the number of minutes of audio for each split of a dataset.
If you don't specify the split, the length is returned for all the available splits.
rtasr audio-length -d <dataset> -s <split>
Contributing
Be sure to have hatch installed.
Quality
- Run quality checks:
hatch run quality:check
- Run quality formatting:
hatch run quality:format
Testing
- Run tests:
hatch run tests:run