
A study to benchmark Whisper-based ASR models in Malayalam

Project description

malayalam_asr_benchmarking

Install

pip install malayalam_asr_benchmarking

Or install it locally, from a clone of the repository:

pip install -e .

Setting up your development environment

I am developing this project with nbdev. Please take some time to read up on nbdev (how it works, its directives, and so on) by checking out the walkthroughs and tutorials on the nbdev website.

Step 1: Install Quarto

nbdev_install_quarto

Other installation options are described in Quarto's Get Started guide.

Step 2: Install hooks

nbdev_install_hooks

Step 3: Install our library

pip install -e '.[dev]'

How to use

To evaluate a Whisper model on the Malayalam subset of Common Voice 11.0, pass its Hugging Face model ID to evaluate_whisper_model_common_voice:

from malayalam_asr_benchmarking.commonvoice import evaluate_whisper_model_common_voice
evaluate_whisper_model_common_voice("parambharat/whisper-tiny-ml")
Found cached dataset common_voice_11_0 (/home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0)
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-374585c2877047e3.arrow
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-22670505c562e0d4.arrow
/opt/conda/lib/python3.8/site-packages/transformers/generation_utils.py:1359: UserWarning: Neither `max_length` nor `max_new_tokens` has been set, `max_length` will default to 448 (`self.config.max_length`). Controlling `max_length` via the config is deprecated and `max_length` will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(

Total time taken: 117.81971025466919
The WER of model: 38.31
The CER of model: 21.93
The model size is: 37.76M
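
For readers unfamiliar with the two scores reported above, here is a minimal sketch of how WER (word error rate) and CER (character error rate) are commonly computed from an edit distance. It is an illustration only, not this package's implementation: the function names edit_distance, wer and cer are hypothetical, the scores are assumed to be percentages, and characters are counted as Unicode code points, which may differ from how a given metric library segments Malayalam text.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Word error rate in percent, pooled over all reference/hypothesis pairs."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    total = sum(len(r.split()) for r in references)
    if total == 0:
        raise ValueError("references contain no words")
    return 100.0 * errors / total


def cer(references, hypotheses):
    """Character error rate in percent, counting Unicode code points."""
    errors = sum(edit_distance(list(r), list(h))
                 for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    if total == 0:
        raise ValueError("references contain no characters")
    return 100.0 * errors / total


# Toy example: one substituted word and one deleted word.
refs = ["the cat sat on the mat"]
hyps = ["the cat sit on mat"]
print(f"WER: {wer(refs, hyps):.2f}")
print(f"CER: {cer(refs, hyps):.2f}")
```

Both functions pool errors over the whole test set before dividing, so long utterances weigh more than short ones. Averaging per-utterance rates instead would give a different number.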

