Evaluation as a Service for Natural Language Processing
EaaS_API
Documentation
Documentation is available at https://expressai.github.io/autoeval/. The following references are useful when writing docs:
- https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#rst-primer
- https://sphinx-tutorial.readthedocs.io/step-1/
- https://sphinx-themes.org/sample-sites/furo/
Usage
To install the API, simply run
pip install eaas
To use the API, go through the following two steps.
- Step 1: Load the default configuration and modify it to suit your needs.
from eaas import Config
config = Config()
# To see the metrics we support, run
print(config.metrics())
# dict_keys(['bart_score_summ', 'bart_score_mt', 'bart_score_cnn_hypo_ref', 'bert_score', 'bleu', 'chrf', 'comet', 'comet_qe', 'mover_score', 'prism', 'prism_qe', 'rouge1', 'rouge2', 'rougeL'])
# To see the default configuration of a metric, run
print(config.bleu.to_dict())
# {'smooth_method': 'exp', 'smooth_value': None, 'force': False, 'lowercase': False, 'use_effective_order': False}
# To modify the config, run
config.bleu.set_property("smooth_method", "floor")
print(config.bleu.to_dict())
# {'smooth_method': 'floor', 'smooth_value': None, 'force': False, 'lowercase': False, 'use_effective_order': False}
- Step 2: Initialize the client and send your inputs.
from eaas import Client
client = Client()
client.load_config(config) # The config you have created above
# To use this API for scoring, format your input as a list of dictionaries.
# Each dictionary consists of `source` (string, optional), `references` (list of strings, optional)
# and `hypothesis` (string, required). Whether `source` and `references` are needed depends on the
# metrics you want to use. Do not preprocess `source`, `references` or `hypothesis`:
# we expect normal-cased, detokenized text. All preprocessing is handled by the metrics.
# Below is a simple example.
inputs = [{"source": "This is the source.",
"references": ["This is the reference one.", "This is the reference two."],
"hypothesis": "This is the generated hypothesis."}]
metrics = ["bleu", "chrf"] # Pass None to use all supported metrics
score_dic = client.score(inputs, task="sum", metrics=metrics, lang="en")
# inputs is a list of dicts, task is the task name, metrics is a list of metric names, lang is the two-letter language code
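Because a request with malformed inputs costs a round trip to the service, it can help to check the input format locally first. The helper below is a minimal sketch and not part of the `eaas` package: the name `validate_inputs` is hypothetical, and it only encodes the rules stated in the comments above (`hypothesis` required, `source` and `references` optional).

```python
from typing import Any, Dict, List


def validate_inputs(inputs: List[Dict[str, Any]]) -> None:
    """Raise ValueError if `inputs` does not match the documented format.

    Hypothetical helper, not part of eaas.
    """
    if not isinstance(inputs, list):
        raise ValueError("inputs must be a list of dictionaries")
    for i, sample in enumerate(inputs):
        if not isinstance(sample, dict):
            raise ValueError(f"sample {i} is not a dictionary")
        # `hypothesis` is the only required field.
        if not isinstance(sample.get("hypothesis"), str):
            raise ValueError(f"sample {i}: 'hypothesis' must be a string")
        # `source` is optional, but must be a string when present.
        if "source" in sample and not isinstance(sample["source"], str):
            raise ValueError(f"sample {i}: 'source' must be a string")
        # `references` is optional, but must be a list of strings when present.
        if "references" in sample:
            refs = sample["references"]
            if not isinstance(refs, list) or not all(
                isinstance(r, str) for r in refs
            ):
                raise ValueError(
                    f"sample {i}: 'references' must be a list of strings"
                )
```

Calling `validate_inputs(inputs)` before `client.score(...)` returns silently for the example above and raises `ValueError` for a sample without a `hypothesis`. Note that individual metrics may still require `source` or `references`; this sketch does not check that.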
The returned `score_dic` has the following structure:
# sample_level is a list of dicts (one per input), corpus_level is a dict
{
'sample_level': [
{'bleu': 32.46679154750991,
'attr_compression': 0.8333333333333334,
'attr_copy_len': 2.0,
'attr_coverage': 0.6666666666666666,
'attr_density': 1.6666666666666667,
'attr_hypothesis_len': 6,
'attr_novelty': 0.6,
'attr_repetition': 0.0,
'attr_source_len': 5,
'chrf': 38.56890099861521}
],
'corpus_level': {
'corpus_bleu': 32.46679154750991,
'corpus_attr_compression': 0.8333333333333334,
'corpus_attr_copy_len': 2.0,
'corpus_attr_coverage': 0.6666666666666666,
'corpus_attr_density': 1.6666666666666667,
'corpus_attr_hypothesis_len': 6.0,
'corpus_attr_novelty': 0.6,
'corpus_attr_repetition': 0.0,
'corpus_attr_source_len': 5.0,
'corpus_chrf': 38.56890099861521
}
}
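The nested result can be read with ordinary dictionary access. The sketch below uses an abbreviated copy of the sample output above rather than a live call; the helper names `sample_scores` and `corpus_score` are hypothetical, and the `corpus_` key prefix is an assumption drawn from this one sample output.

```python
from typing import Any, Dict, List

# Abbreviated copy of the sample output shown above.
score_dic: Dict[str, Any] = {
    "sample_level": [
        {"bleu": 32.46679154750991, "chrf": 38.56890099861521},
    ],
    "corpus_level": {
        "corpus_bleu": 32.46679154750991,
        "corpus_chrf": 38.56890099861521,
    },
}


def sample_scores(result: Dict[str, Any], metric: str) -> List[float]:
    """Collect one metric across all samples, in input order."""
    return [sample[metric] for sample in result["sample_level"]]


def corpus_score(result: Dict[str, Any], metric: str) -> float:
    """Look up the corpus-level value of a metric.

    Assumes corpus-level keys are the metric name prefixed with
    'corpus_', as in the sample output.
    """
    return result["corpus_level"][f"corpus_{metric}"]


print(sample_scores(score_dic, "bleu"))
print(corpus_score(score_dic, "chrf"))
```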
Supported Metrics
Currently, EaaS supports the following metrics:
- bart_score_cnn_hypo_ref
- bart_score_summ
- bart_score_mt
- bert_score_p
- bert_score_r
- bert_score_f
- bleu
- chrf
- comet
- comet_qe
- mover_score
- prism
- prism_qe
- rouge1
- rouge2
- rougeL
Support for Common Metrics
We support quick calculation of BLEU and ROUGE (1, 2, L). Usage is as follows.
from eaas import Config, Client
config = Config()
client = Client()
client.load_config(config)
references = [["This is the reference one for sample one.", "This is the reference two for sample one."],
["This is the reference one for sample two.", "This is the reference two for sample two."]]
hypothesis = ["This is the generated hypothesis for sample one.",
"This is the generated hypothesis for sample two."]
# Calculate BLEU
client.bleu(references, hypothesis, lang="en")
# Calculate ROUGEs
client.rouge1(references, hypothesis, lang="en")
client.rouge2(references, hypothesis, lang="en")
client.rougeL(references, hypothesis, lang="en")
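The quick-calculation methods take two parallel lists, whereas `client.score` takes a list of dictionaries. If your data is already in the dictionary format, it can be converted with a few lines of plain Python. This is a minimal sketch; `to_parallel_lists` is a hypothetical helper, not part of `eaas`, and it assumes every sample carries a `references` list.

```python
from typing import Any, Dict, List, Tuple


def to_parallel_lists(
    inputs: List[Dict[str, Any]],
) -> Tuple[List[List[str]], List[str]]:
    """Split score()-style inputs into (references, hypothesis) lists.

    The i-th entry of each returned list belongs to the i-th sample.
    """
    references = [sample["references"] for sample in inputs]
    hypothesis = [sample["hypothesis"] for sample in inputs]
    return references, hypothesis


inputs = [
    {
        "references": ["This is the reference one for sample one.",
                       "This is the reference two for sample one."],
        "hypothesis": "This is the generated hypothesis for sample one.",
    },
    {
        "references": ["This is the reference one for sample two.",
                       "This is the reference two for sample two."],
        "hypothesis": "This is the generated hypothesis for sample two.",
    },
]
references, hypothesis = to_parallel_lists(inputs)
```

The resulting `references` and `hypothesis` have the same shape as the lists in the example above and could be passed to `client.bleu(references, hypothesis, lang="en")`.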