🏆 Run benchmarks against the most common ASR tools on the market.
Rate That ASR (RTASR)
Results
Important: Deepgram benchmark results have been updated with the latest Nova 2 model.
WER & WRR
wer = Word Error Rate, mer = Match Error Rate, wil = Word Information Lost, wrr = Word Recognition Rate
- Dataset: Fleurs
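As a rough illustration of how the four word-level metrics relate, here is a minimal sketch that aligns a reference and a hypothesis with word-level edit distance and derives each score from the hit, substitution, deletion and insertion counts. The function names are hypothetical and this is not RTASR's own implementation. The formulas for wer, mer and wil follow the commonly used definitions; treating wrr as `1 - wer` is an assumption.

```python
def word_error_counts(reference: str, hypothesis: str) -> dict[str, int]:
    """Count hits, substitutions, deletions and insertions on a minimal-edit alignment."""
    ref, hyp = reference.split(), hypothesis.split()
    # table[i][j] = (edits, hits, subs, dels, ins) for ref[:i] aligned with hyp[:j]
    table = [[(0, 0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        table[i][0] = (i, 0, 0, i, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        table[0][j] = (j, 0, 0, 0, j)  # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            e, h, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (e, h + 1, s, d, n)      # hit
            else:
                diagonal = (e + 1, h, s + 1, d, n)  # substitution
            e, h, s, d, n = table[i - 1][j]
            deletion = (e + 1, h, s, d + 1, n)
            e, h, s, d, n = table[i][j - 1]
            insertion = (e + 1, h, s, d, n + 1)
            # Fewest edits first; among ties, prefer the alignment with more hits.
            table[i][j] = min(diagonal, deletion, insertion, key=lambda c: (c[0], -c[1]))
    _, hits, subs, dels, ins = table[-1][-1]
    return {"hits": hits, "subs": subs, "dels": dels, "ins": ins}


def word_metrics(reference: str, hypothesis: str) -> dict[str, float]:
    """Derive wer, mer, wil and wrr; assumes both strings contain at least one word."""
    c = word_error_counts(reference, hypothesis)
    errors = c["subs"] + c["dels"] + c["ins"]
    n_ref = c["hits"] + c["subs"] + c["dels"]  # words in the reference
    n_hyp = c["hits"] + c["subs"] + c["ins"]   # words in the hypothesis
    wer = errors / n_ref
    return {
        "wer": wer,
        "mer": errors / (c["hits"] + errors),
        "wil": 1 - (c["hits"] / n_ref) * (c["hits"] / n_hyp),
        "wrr": 1 - wer,  # assumption: wrr taken as the complement of wer
    }


if __name__ == "__main__":
    print(word_metrics("the cat sat on the mat", "the cat sit on mat"))
```

Real evaluations usually normalize text (casing, punctuation) before scoring; that step is left out here.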
DER
der = Diarization Error Rate, miss = missed detection, confusion = incorrect detection, fa = false alarm
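The three components combine into the DER in the standard way: the sum of missed detection, false alarm and confusion durations divided by the total duration of reference speech. The sketch below only illustrates that arithmetic; the function name and the use of seconds as the unit are assumptions, and it is not taken from RTASR's code.

```python
def diarization_error_rate(
    miss: float, false_alarm: float, confusion: float, total: float
) -> float:
    """Return DER from component durations (same unit for all four arguments).

    `total` is the duration of speech in the reference annotation.
    """
    if total <= 0:
        raise ValueError("total reference speech duration must be positive")
    return (miss + false_alarm + confusion) / total


if __name__ == "__main__":
    # Hypothetical durations in seconds, for illustration only.
    print(diarization_error_rate(miss=5.0, false_alarm=3.0, confusion=2.0, total=100.0))
```

Because false alarms are counted against the reference duration, the DER can exceed 1 when a system produces much more speech than the reference contains.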
- Dataset: VoxConverse
- Dataset: AMI Corpus
Installation
Last stable version
pip install rtasr
From source
git clone https://github.com/Wordcab/rtasr
cd rtasr
pip install .
Commands
The CLI is available through the rtasr command.
rtasr --help
List datasets, metrics and providers
# List everything
rtasr list
# List only datasets
rtasr list -t datasets
# List only metrics
rtasr list -t metrics
# List only providers
rtasr list -t providers
Datasets download
Available datasets are:
- ami: AMI Corpus
- voxconverse: VoxConverse
rtasr download -d <dataset>
ASR Transcription
Providers
Implemented ASR providers are:
- assemblyai: AssemblyAI
- aws: AWS Transcribe
- azure: Azure Speech
- deepgram: Deepgram
- google: Google Cloud Speech-to-Text
- revai: RevAI
- speechmatics: Speechmatics
- wordcab: Wordcab
Run transcription
Run ASR transcription on a given dataset with a given provider.
rtasr transcription -d <dataset> -p <provider>
Multiple providers
You can specify as many providers as you want:
rtasr transcription -d <dataset> -p <provider1> <provider2> <provider3> ...
Choose dataset split
You can specify the dataset split to use:
rtasr transcription -d <dataset> -p <provider> -s <split>
If not specified, all the available splits will be used.
Caching
By default, the transcription results are cached in the ~/.cache/rtasr/transcription directory for each provider.
If you don't want to use the cache, use the --no-cache flag.
rtasr transcription -d <dataset> -p <provider> --no-cache
Note: the cache is used to avoid running the same file twice. By removing the cache, you will run the transcription on the whole dataset again. We aren't responsible for any extra costs.
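Before passing --no-cache, it can help to see how much is already cached. The sketch below counts files under the cache directory mentioned above. It assumes one subdirectory per provider, which is a guess at the layout rather than something documented here, and the function name is hypothetical.

```python
from pathlib import Path

# Default location stated in this README.
DEFAULT_CACHE = Path.home() / ".cache" / "rtasr" / "transcription"


def cached_files_per_provider(cache_root: Path = DEFAULT_CACHE) -> dict[str, int]:
    """Count cached files under each first-level subdirectory of the cache.

    Assumption: each first-level subdirectory corresponds to one provider.
    """
    if not cache_root.is_dir():
        return {}
    counts: dict[str, int] = {}
    for provider_dir in sorted(cache_root.iterdir()):
        if provider_dir.is_dir():
            counts[provider_dir.name] = sum(
                1 for path in provider_dir.rglob("*") if path.is_file()
            )
    return counts


if __name__ == "__main__":
    for provider, count in cached_files_per_provider().items():
        print(f"{provider}: {count} cached file(s)")
```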
Debug mode
Use the --debug flag to run only one file per split for each provider.
rtasr transcription -d <dataset> -p <provider> --debug
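When scripting runs from Python, the transcription options above can be assembled into an argument list. This is a minimal sketch using only the flags shown in this section. The helper name is hypothetical, and whether every flag combination is accepted by the CLI is an assumption.

```python
from typing import Optional, Sequence


def build_transcription_command(
    dataset: str,
    providers: Sequence[str],
    split: Optional[str] = None,
    no_cache: bool = False,
    debug: bool = False,
) -> list[str]:
    """Build the argv list for `rtasr transcription` from the documented flags."""
    if not providers:
        raise ValueError("at least one provider is required")
    command = ["rtasr", "transcription", "-d", dataset, "-p", *providers]
    if split is not None:
        command += ["-s", split]
    if no_cache:
        command.append("--no-cache")  # re-runs files that are already cached
    if debug:
        command.append("--debug")     # one file per split for each provider
    return command


if __name__ == "__main__":
    print(" ".join(build_transcription_command("ami", ["deepgram", "wordcab"], debug=True)))
```

The resulting list can be handed to `subprocess.run(command, check=True)`. Starting with `debug=True` keeps a first run small before any provider is billed for a full dataset.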
Evaluation
The evaluation command allows you to run an evaluation on the transcription results.
If you don't specify the split, the evaluation will be run on the whole dataset.
Run DER evaluation
rtasr evaluation -m der -d <dataset> -s <split>
Run WER evaluation
rtasr evaluation -m wer -d <dataset> -s <split>
Plot results
To get the plots of the evaluation results, use the plot command.
If you don't specify the split, the plots will be generated for all the available splits.
Plot DER results
rtasr plot -m der -d <dataset> -s <split>
Plot WER results
rtasr plot -m wer -d <dataset> -s <split>
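Taken together, the commands form a pipeline: download, transcription, evaluation, plot. The sketch below lists them in that order for one dataset and one metric. The function name is hypothetical, and the ordering is inferred from this README (evaluation works on transcription results, plots on evaluation results) rather than enforced by the tool.

```python
from typing import Optional, Sequence

SUPPORTED_METRICS = ("der", "wer")  # the two metrics shown in this README


def pipeline_commands(
    dataset: str,
    providers: Sequence[str],
    metric: str,
    split: Optional[str] = None,
) -> list[list[str]]:
    """Return the rtasr commands for a full benchmark run, in execution order."""
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"metric must be one of {SUPPORTED_METRICS}")
    split_args = ["-s", split] if split else []  # omitted: all available splits
    return [
        ["rtasr", "download", "-d", dataset],
        ["rtasr", "transcription", "-d", dataset, "-p", *providers, *split_args],
        ["rtasr", "evaluation", "-m", metric, "-d", dataset, *split_args],
        ["rtasr", "plot", "-m", metric, "-d", dataset, *split_args],
    ]


if __name__ == "__main__":
    for command in pipeline_commands("voxconverse", ["deepgram"], "der"):
        print(" ".join(command))
```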
Dataset length
To get the total length of a dataset, use the audio-length command.
This command returns the number of minutes of audio for each split of a dataset.
If you don't specify the split, the total length of the dataset will be returned for all the available splits.
rtasr audio-length -d <dataset> -s <split>
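For intuition about what this figure represents, here is a small sketch that sums the duration of WAV files in a directory using only the standard library. It assumes uncompressed WAV files stored under one directory per split. That layout and the function names are hypothetical, and the actual audio-length command may compute the value differently.

```python
import wave
from pathlib import Path


def wav_minutes(path: Path) -> float:
    """Duration of one WAV file in minutes (frames / frame rate / 60)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate() / 60.0


def split_minutes(split_dir: Path) -> float:
    """Total minutes of audio across all .wav files under a split directory."""
    return sum(wav_minutes(path) for path in sorted(split_dir.rglob("*.wav")))
```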
Contributing
Be sure to have hatch installed.
Quality
- Run quality checks:
hatch run quality:check
- Run quality formatting:
hatch run quality:format
Testing
- Run tests:
hatch run tests:run
Download files
File details
Details for the file rtasr-0.0.7.tar.gz.
File metadata
- Download URL: rtasr-0.0.7.tar.gz
- Upload date:
- Size: 858.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: python-httpx/0.25.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 73baebf745ba88ff2f1f254d06860f48fe7017f858dae1c190b78ea07193202f |
| MD5 | 3a89addf5e52887c54d37d84f3c93f85 |
| BLAKE2b-256 | 7d85a4071ec7f962d9c487cdb0c2c7e4632f1e807aa206746efda24573d2fde5 |
File details
Details for the file rtasr-0.0.7-py3-none-any.whl.
File metadata
- Download URL: rtasr-0.0.7-py3-none-any.whl
- Upload date:
- Size: 56.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: python-httpx/0.25.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c4f73d45178b1bd0de4c37c1dfe83790fe689355f84ecb2b9b84a418248ae376 |
| MD5 | ef40a98309a1a69972a883278239a1ea |
| BLAKE2b-256 | a6735bb332c2e4ee4d53645a3deca5a08c9cb4bfedcb6afa717a08efd55da4a3 |
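The digests listed above can be used to check a downloaded file. The sketch below computes a file's SHA256 with Python's hashlib and compares it to a published value. The helper names are hypothetical, and the local path passed in is whatever location the file was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with the digest copied from the File hashes table."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest(Path("rtasr-0.0.7.tar.gz"), "73baebf745ba88ff2f1f254d06860f48fe7017f858dae1c190b78ea07193202f")` is expected to return True for an intact copy of the source distribution.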