An evaluation framework for Serbian Whisper models.
AiDA Whisper Evaluation Framework (Serbian)
Whisper Evaluator 🎤
A simple, modular framework to evaluate fine-tuned Whisper models in Python notebooks.
This library runs evaluations on any dataset from the Hugging Face Hub, driven by a single configuration dictionary. It computes word error rate (WER), character error rate (CER), BLEU, and ROUGE, and automatically logs all results to a file ('evaluation_log.txt' by default).
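To make the two error-rate metrics concrete, here is a minimal sketch of how WER and CER are conventionally defined: the edit distance between reference and prediction, divided by the reference length in words or characters. It is an illustration of the definitions only, not this library's implementation; the function names `edit_distance`, `wer`, and `cer` are hypothetical, and no text normalization is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete an item from the reference
                curr[j - 1] + 1,     # insert an item from the hypothesis
                prev[j - 1] + cost,  # substitute, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def wer(reference, prediction):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, prediction.split()) / len(ref_words)


def cer(reference, prediction):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, prediction) / len(reference)


print(wer("dobar dan svima", "dobar dan svim"))
print(cer("dobar dan svima", "dobar dan svim"))
```

In this example one of three words is wrong, so the WER is 1/3, while only one of fifteen characters differs, so the CER is much lower. This gap is why both metrics are usually reported together.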
Installation
You can install the library directly from GitHub:
pip install git+https://github.com/your-username/whisper-evaluator.git
Quickstart
Using the library in a Google Colab or Jupyter Notebook is straightforward.
from whisper_evaluator import Evaluator
import json
# 1. Define your evaluation configuration
config = {
    "model_args": {
        "name_or_path": "openai/whisper-large-v2",  # Your fine-tuned model ID
        "device": "cuda"
    },
    "task_args": {
        "dataset_name": "mozilla-foundation/common_voice_11_0",
        "dataset_subset": "sr",  # Serbian language
        "dataset_split": "test[:20]",  # Use the first 20 samples for a quick demo
        "audio_column": "audio",
        "text_column": "sentence"
    }
}
# 2. Initialize the evaluator
evaluator = Evaluator(config=config)
# 3. Run the evaluation (logs to 'evaluation_log.txt' by default)
detailed_results, metrics = evaluator.run()
# 4. Analyze the results
print("\n--- Final Metrics ---")
# Pretty print the metrics dictionary
print(json.dumps(metrics, indent=2))
print("\n--- Sample of evaluation details ---")
# Print the first 3 results from the list
for i, result in enumerate(detailed_results[:3]):
    print(f"\n--- Example {i+1} ---")
    print(f"Reference: {result['reference']}")
    print(f"Prediction: {result['prediction']}")
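Beyond printing a few examples, it can help to isolate the utterances the model got wrong. The sketch below assumes only what the Quickstart shows, namely that each entry of detailed_results is a dictionary with 'reference' and 'prediction' keys. The helpers `normalize`, `summarize`, and `save_mismatches` are hypothetical and not part of the library, and the sample data stands in for a real evaluator.run() result.

```python
import json
from pathlib import Path


def normalize(text):
    """Lowercase and collapse whitespace; a deliberately light normalization."""
    return " ".join(text.lower().split())


def summarize(detailed_results):
    """Return the exact-match rate and the list of mismatched examples."""
    mismatches = [
        r for r in detailed_results
        if normalize(r["reference"]) != normalize(r["prediction"])
    ]
    total = len(detailed_results)
    exact_match_rate = (total - len(mismatches)) / total if total else 0.0
    return exact_match_rate, mismatches


def save_mismatches(mismatches, path):
    """Write mismatches as JSON; ensure_ascii=False keeps Serbian letters readable."""
    Path(path).write_text(
        json.dumps(mismatches, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )


# Stand-in data shaped like the Quickstart output (an assumption).
sample_results = [
    {"reference": "Dobar dan.", "prediction": "dobar dan."},
    {"reference": "Kako ste?", "prediction": "Kako si?"},
]

rate, mismatches = summarize(sample_results)
print(f"Exact match rate: {rate:.2f}")
for m in mismatches:
    print(f"Reference: {m['reference']} | Prediction: {m['prediction']}")
```

With real output, replace sample_results with the detailed_results list returned by evaluator.run(), and call save_mismatches(mismatches, "mismatches.json") to keep the failing examples for later review.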
Project Setup
Follow these steps to set up the AiDA-Whisper-Eval project.
Using Conda
1. Create a new Conda environment
conda create --name aida python=3.12 -y
conda activate aida
2. Install Poetry
pip install poetry
3. Install project dependencies
Navigate to the project's root directory and run:
poetry install
Using Plain Python
1. Create and activate a virtual environment
python -m venv venv
# On Linux/macOS
source venv/bin/activate
# On Windows
.\venv\Scripts\activate
2. Upgrade pip and install Poetry
pip install --upgrade pip
pip install poetry
3. Install project dependencies
From the project's root directory, run:
poetry install
4. Set up pre-commit hooks
pip install pre-commit
poetry run pre-commit install
Verifying Installation
Verify the installation by running the test suite:
# On Linux/macOS
make test
# On Windows
poetry run pytest
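As a quick complement to the test suite, a short import check from inside the activated environment can confirm that the package is visible to Python. This is a minimal sketch: the module name whisper_evaluator is taken from the Quickstart import and is an assumption about the installed package layout, and `check_modules` is a hypothetical helper.

```python
import importlib.util


def check_modules(names):
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# "whisper_evaluator" is assumed from the Quickstart import statement.
status = check_modules(["whisper_evaluator"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'NOT found'}")
```

If the module is reported as not found, the environment that ran poetry install is probably not the one currently active.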
Your setup is complete!
File details
Details for the file whisper_eval_serbian-0.0.16.tar.gz.
File metadata
- Download URL: whisper_eval_serbian-0.0.16.tar.gz
- Upload date:
- Size: 14.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.12.11 Linux/6.14.0-27-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a18d3af7fa63f94121acb98278a35c5a9ff29706d3551bd2c32af6ae77abf601 |
| MD5 | 35638e1ba8a7fb03e59272f21fe7b3e1 |
| BLAKE2b-256 | 1d080a1f85b06f68132076c03e8bcb12f317987bfe9465dc7c299b9f9a999c62 |
File details
Details for the file whisper_eval_serbian-0.0.16-py3-none-any.whl.
File metadata
- Download URL: whisper_eval_serbian-0.0.16-py3-none-any.whl
- Upload date:
- Size: 14.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.12.11 Linux/6.14.0-27-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 442cc12cb715e8ba584f2fd63480738bb5a3ed32b3de90e499f83d5add5c38fb |
| MD5 | 754096fda82c36f8fed8c42ae2bb26c7 |
| BLAKE2b-256 | 6acf89922b317f62adc21212eff652c7a15c81ae96e248a9de860ebe2b160fbb |