
Project description

AiDA Whisper Evaluation Framework (Serbian)

An evaluation framework for Serbian Whisper models.

Whisper Evaluator 🎤

A simple, modular framework to evaluate fine-tuned Whisper models in Python notebooks.

This library allows you to easily run evaluations on any dataset from the Hugging Face Hub using a simple configuration dictionary. It calculates a comprehensive set of metrics, including WER, CER, BLEU, and ROUGE, and automatically logs all results to a file.
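To make the WER metric concrete, here is a minimal, self-contained sketch (not part of this library's API) of how word error rate is typically computed: the word-level Levenshtein edit distance between reference and prediction, divided by the reference length.

```python
# Illustrative sketch only: WER as word-level Levenshtein distance
# divided by the number of reference words.
def wer(reference: str, prediction: str) -> float:
    ref, hyp = reference.split(), prediction.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substituted word out of three reference words -> WER = 1/3
print(wer("dobar dan svima", "dobar dan svim"))
```

CER follows the same recipe at the character level; in practice the library computes these metrics for you, so this is only to show what the numbers mean.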

Installation

You can install the library directly from GitHub to get the latest updates and features. Make sure you have git installed on your system.

pip install git+https://github.com/datatab/whisper-evaluator.git

Quickstart

Using the library in a Google Colab or Jupyter Notebook is straightforward.

from whisper_evaluator import Evaluator
import json

# 1. Define your evaluation configuration
config = {
    "model_args": {
        "name_or_path": "openai/whisper-large-v2", # Your fine-tuned model ID
        "device": "cuda"
    },
    "task_args": {
        "dataset_name": "mozilla-foundation/common_voice_11_0",
        "dataset_subset": "sr", # Serbian language
        "dataset_split": "test[:20]", # Use the first 20 samples for a quick demo
        "audio_column": "audio",
        "text_column": "sentence"
    }
}

# 2. Initialize the evaluator
evaluator = Evaluator(config=config)

# 3. Run the evaluation (logs to 'evaluation_log.txt' by default)
detailed_results, metrics = evaluator.run()

# 4. Analyze the results
print("\n--- Final Metrics ---")
# Pretty print the metrics dictionary
print(json.dumps(metrics, indent=2))

print("\n--- Sample of evaluation details ---")
# Print the first 3 results from the list
for i, result in enumerate(detailed_results[:3]):
    print(f"\n--- Example {i+1} ---")
    print(f"Reference:  {result['reference']}")
    print(f"Prediction: {result['prediction']}")
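Beyond printing a few samples, you may want to surface the worst predictions first. The sketch below is hypothetical post-processing, not a library feature: only the `reference`/`prediction` keys are confirmed by the quickstart above; the scoring helper and the stand-in data are illustrative assumptions.

```python
# Hypothetical post-processing sketch: rank examples by a crude error score
# (fraction of reference words absent from the prediction). Only the
# 'reference'/'prediction' keys mirror the quickstart output.
def word_miss_rate(reference: str, prediction: str) -> float:
    ref_words = reference.lower().split()
    pred_words = set(prediction.lower().split())
    if not ref_words:
        return 0.0
    missing = sum(1 for w in ref_words if w not in pred_words)
    return missing / len(ref_words)

# Stand-in for detailed_results returned by evaluator.run()
detailed_results = [
    {"reference": "dobar dan svima", "prediction": "dobar dan svima"},
    {"reference": "kako ste danas", "prediction": "kako ste"},
]

worst = sorted(
    detailed_results,
    key=lambda r: word_miss_rate(r["reference"], r["prediction"]),
    reverse=True,
)
for r in worst[:3]:
    score = word_miss_rate(r["reference"], r["prediction"])
    print(f"{score:.2f}  REF: {r['reference']}  PRED: {r['prediction']}")
```

Sorting the real `detailed_results` this way makes it easy to eyeball systematic failure modes before digging into the aggregate metrics.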

Project Setup

Follow these steps to set up the AiDA-Whisper-Eval project.


Using Conda

1. Create a new Conda environment

conda create --name aida python=3.12 -y
conda activate aida

2. Install Poetry

pip install poetry

3. Install project dependencies

Navigate to the project's root directory and run:

poetry install

Using Plain Python

1. Create and activate a virtual environment

python -m venv venv

# On Linux/macOS
source venv/bin/activate

# On Windows
.\venv\Scripts\activate

2. Upgrade pip and install Poetry

pip install --upgrade pip
pip install poetry

3. Install project dependencies

From the project's root directory, run:

poetry install
pip install pre-commit

4. Set up pre-commit hooks

poetry run pre-commit install

Verifying Installation

Verify the installation by running the test suite:

# On Linux/macOS
make test

# On Windows
poetry run pytest

Your setup is complete!

Project details


Download files

Download the file for your platform.

Source Distribution

whisper_eval_serbian-0.0.31.tar.gz (13.4 kB)


Built Distribution


whisper_eval_serbian-0.0.31-py3-none-any.whl (13.9 kB)


File details

Details for the file whisper_eval_serbian-0.0.31.tar.gz.

File metadata

  • Download URL: whisper_eval_serbian-0.0.31.tar.gz
  • Size: 13.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.12.11 Linux/6.14.0-27-generic

File hashes

Hashes for whisper_eval_serbian-0.0.31.tar.gz

  • SHA256: f11c59840628a0193797bc3d5225ed7c2a95a5c964407fc440a0f88efd25a38c
  • MD5: 5e1e446248c6887e9b1f9c661a9051d5
  • BLAKE2b-256: 034f3fe8701dd2e98f032f303c011e8f5fa8995ff572c52e61fdb070922a81f6


File details

Details for the file whisper_eval_serbian-0.0.31-py3-none-any.whl.

File metadata

  • Download URL: whisper_eval_serbian-0.0.31-py3-none-any.whl
  • Size: 13.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.12.11 Linux/6.14.0-27-generic

File hashes

Hashes for whisper_eval_serbian-0.0.31-py3-none-any.whl

  • SHA256: 38135c3dab758c8cc955976ddc8e8e6b858241f1a4c7ef0af846e9bdb62eca65
  • MD5: 2a54fd84090a9dd0dbb796dea1a2a632
  • BLAKE2b-256: 9a6b2f283a62a924374b8a06a2ac37f3009d940baf7ad6f4e95968006f659e47

