# TiGEr-Eval: Text Generation Evaluation Toolkit
## Overview

TiGEr is a toolkit for evaluating text generation.
## Installation

From PyPI (recommended):

```shell
pip install tiger-eval
```

From source (unstable version):

```shell
pip install .
```
## Done

- Cross-lingual consistency
- Multiple-choice question evaluation
- Open generation evaluation (currently with llama-2-7b-chat only)
- BLEU score
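To make the BLEU item above concrete, here is a minimal, self-contained sketch of sentence-level BLEU (this is not the toolkit's implementation; the `ngrams` and `bleu` names are illustrative). BLEU is the brevity-penalized geometric mean of clipped n-gram precisions; add-one smoothing is used here so short candidates with no 4-gram matches still get a nonzero score.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with uniform weights and a brevity penalty.

    `candidate` and `reference` are pre-tokenized lists of strings.
    """
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        overlap = sum((cand & ref).values())   # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        # Add-one smoothing keeps a single zero precision from zeroing the score
        precisions.append((overlap + 1) / (total + 1))
    # Geometric mean of the modified n-gram precisions
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty discourages overly short candidates
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_avg)
```

An identical candidate and reference score 1.0; shorter or partially matching candidates score strictly between 0 and 1.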
## TODO

The toolkit should support more metrics:

- ROUGE, BLEU
- Model based: BERTScore, BLEURT
- Open generation evaluation needs further improvement
- Multiple-choice questions: explore other matching techniques, e.g. using a model to reformat the answer
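One simple matching technique for multiple-choice answers, sketched below, is pattern-based option extraction: look for an option letter at the start of the model's reply or after a cue phrase like "answer is". This is a generic illustration, not part of tiger-eval; `OPTION_PATTERNS` and `extract_option` are hypothetical names, and real answers may need more patterns (or, as the TODO suggests, a model to reformat the answer).

```python
import re

# Hypothetical patterns for pulling an option letter (A-D) out of free-form text
OPTION_PATTERNS = [
    re.compile(r"^\s*\(?([A-D])\)?[.:)\s]"),                     # "B. ..." or "(C) ..."
    re.compile(r"answer\s+is\s*:?\s*\(?([A-D])\)?", re.IGNORECASE),  # "the answer is (B)"
]

def extract_option(answer):
    """Return the first option letter found in the answer, or None."""
    for pattern in OPTION_PATTERNS:
        match = pattern.search(answer)
        if match:
            return match.group(1).upper()
    return None
```

The extracted letter can then be compared to the gold option with exact match; returning `None` makes unparseable answers easy to count separately.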