TiGEr-Eval: Text Generation Evaluation Toolkit
Overview
TiGEr is a toolkit for text generation evaluation.
Installation (from PyPI, recommended)
pip install tiger-eval
Installation (from source, unstable version)
pip install .
Done
- Cross-lingual Consistency
- Multiple-choice question evaluation
- Open generation evaluation (currently with llama-2-7b-chat only)
- BLEU score
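Of the finished features, BLEU is the most self-contained. The toolkit's own API is not documented here, so the following is only a minimal sentence-level BLEU sketch (modified n-gram precision with a brevity penalty, no smoothing), not TiGEr-Eval's actual implementation:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU: geometric mean of modified n-gram
    precisions (n = 1..max_n) times a brevity penalty.
    Returns 0.0 if any precision is zero (no smoothing)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not hyp:
        return 0.0
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each hypothesis n-gram count by its reference count.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(log_mean)
```

An identical hypothesis and reference score 1.0; a hypothesis sharing no unigrams with the reference scores 0.0.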
TODO
The toolkit should support various metrics:
- N-gram based: ROUGE, BLEU
- Model based: BERTScore, BLEURT
- Open generation evaluation needs further improvement
- For multiple-choice questions: other matching techniques, e.g. using a model to reformat the answer
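The matching problem for multiple-choice questions is that models rarely emit a bare option letter. A simple heuristic baseline (before resorting to a model-based reformatting step) is regex extraction plus exact match; the function names below are made up for illustration and are not part of TiGEr-Eval:

```python
import re

def extract_choice(model_output, options=("A", "B", "C", "D")):
    """Heuristically pull an option letter out of free-form model output.
    Tries a leading letter first (e.g. 'B) because ...'), then phrases
    like 'the answer is C'. Returns None when nothing matches -- the
    case a model-based answer reformatter could handle instead."""
    letters = "".join(options)
    m = re.match(rf"\s*\(?([{letters}])\)?[\s.):,]", model_output + " ")
    if m:
        return m.group(1)
    m = re.search(rf"answer\s*(?:is|:)?\s*\(?([{letters}])\b",
                  model_output, re.IGNORECASE)
    if m:
        return m.group(1).upper()
    return None

def accuracy(predictions, golds):
    """Exact-match accuracy over extracted option letters."""
    correct = sum(p == g for p, g in zip(predictions, golds))
    return correct / max(len(golds), 1)
```

This baseline is deliberately strict: outputs it cannot parse count as wrong, which is exactly the gap a model-based reformatting step would aim to close.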