Evaluation toolkit for neural language generation.
Jury
A simple tool/toolkit for evaluating NLG (Natural Language Generation) models, offering various automated metrics. Jury provides a smooth and easy-to-use interface. It uses the huggingface/datasets package for the underlying metric computation, so adding a custom metric is as easy as extending datasets.Metric.
Installation
Through pip,
pip install jury
or build from source,
git clone https://github.com/obss/jury.git
cd jury
python setup.py install
Usage
API Usage
Evaluating generated outputs takes only two lines of code.
from jury import Jury
jury = Jury()
# Microsoft translator translation for "Yurtta sulh, cihanda sulh." (16.07.2021)
predictions = ["Peace in the dormitory, peace in the world."]
references = ["Peace at home, peace in the world."]
scores = jury.evaluate(predictions, references)
Specify the metrics you want to use at instantiation.
jury = Jury(metrics=["bleu", "meteor"])
scores = jury.evaluate(predictions, references)
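Since predictions and references are parallel lists, larger batches follow the same pattern. A minimal sketch (the second sample and the print line are made up for illustration; the exact format of scores is assumed):
predictions = [
    "Peace in the dormitory, peace in the world.",
    "There is a cat on the mat.",
]
references = [
    "Peace at home, peace in the world.",
    "The cat is on the mat.",
]
scores = jury.evaluate(predictions, references)
print(scores)  # a mapping from metric names to scores (format assumed)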
Custom Metrics
You can use custom metrics by inheriting from datasets.Metric; the currently available metrics can be seen under datasets/metrics. The code snippet below gives a brief outline.
import datasets

class CustomMetric(datasets.Metric):
    def _info(self):
        # Describe the metric (features, description, citation, etc.)
        pass

    def _compute(self, predictions, references, *args, **kwargs):
        # Compute and return the metric scores
        pass
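As a concrete illustration, here is a minimal sketch of such a metric. The class name, features, and score key are hypothetical; it assumes the datasets.Metric contract outlined above (_info returning a datasets.MetricInfo, _compute returning a dict of scores).
import datasets

class ExactMatch(datasets.Metric):
    # Hypothetical metric: the fraction of predictions that match
    # their references exactly.
    def _info(self):
        return datasets.MetricInfo(
            description="Exact match ratio between predictions and references.",
            citation="",
            features=datasets.Features(
                {
                    "predictions": datasets.Value("string"),
                    "references": datasets.Value("string"),
                }
            ),
        )

    def _compute(self, predictions, references):
        matches = sum(p == r for p, r in zip(predictions, references))
        return {"exact_match": matches / len(predictions)}

# A metric built this way can be computed directly:
# ExactMatch().compute(predictions=predictions, references=references)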
Contributing
PRs are always welcome :)
Installation
git clone https://github.com/obss/jury.git
cd jury
python setup.py develop
pip install -r requirements-dev.txt
Tests
To run the tests, simply run:
python tests/run_tests.py
Code Style
To check code style,
python tests/run_code_style.py check
To format the codebase,
python tests/run_code_style.py format
License
Licensed under the MIT License.