Evaluation as a Service for Natural Language Processing

EaaS_API

Documentation

Documentation is available at https://expressai.github.io/autoeval/. Some references for writing docs can refer to

Usage

To install the API, run:

pip install eaas

To use the API, follow these two steps.

  • Step 1: Load the default configuration and modify it to suit your needs.
from eaas import Config
config = Config()
# To see the metrics we support, run
print(config.metrics())
# dict_keys(['bart_score_summ', 'bart_score_mt', 'bart_score_cnn_hypo_ref', 'bert_score', 'bleu', 'chrf', 'comet', 'comet_qe', 'mover_score', 'prism', 'prism_qe', 'rouge1', 'rouge2', 'rougeL'])

# To see the default configuration of a metric, run
print(config.bleu.to_dict())
# {'smooth_method': 'exp', 'smooth_value': None, 'force': False, 'lowercase': False, 'use_effective_order': False}

# To modify the config, run
config.bleu.set_property("smooth_method", "floor")
print(config.bleu.to_dict())
# {'smooth_method': 'floor', 'smooth_value': None, 'force': False, 'lowercase': False, 'use_effective_order': False}
  • Step 2: Initialize the client and send your inputs.
from eaas import Client
client = Client()
client.load_config(config)  # The config you have created above

# To use this API for scoring, format your input as a list of dictionaries.
# Each dictionary consists of `source` (string, optional), `references` (list of strings, optional)
# and `hypothesis` (string, required). Whether `source` and `references` are needed depends on the
# metrics you want to use. Do not preprocess `source`, `references` or `hypothesis`;
# we expect normal-cased, detokenized text. The metrics handle all preprocessing themselves.
# Below is a simple example.

inputs = [{"source": "This is the source.", 
           "references": ["This is the reference one.", "This is the reference two."],
           "hypothesis": "This is the generated hypothesis."}]
metrics = ["bleu", "chrf"]  # Can be None if you want to use all metrics

score_dic = client.score(inputs, task="sum", metrics=metrics, lang="en") 
# inputs is a list of dicts, task is the task name, metrics is a list of metric names, lang is the two-letter language code

The output looks like this:

# sample_level is a list of dicts, corpus_level is a dict
{
    'sample_level': [
        {'bleu': 32.46679154750991,
         'attr_compression': 1.2,
         'attr_copy_len': 2.0,
         'attr_coverage': 0.8,
         'attr_density': 2.0,
         'attr_hypothesis_len': 5,
         'attr_novelty': 0.5,
         'attr_repetition': 0.0,
         'attr_source_len': 6,
         'chrf': 38.56890099861521}
    ],
    'corpus_level': {
        'corpus_bleu': 32.46679154750991,
        'corpus_attr_compression': 1.2,
        'corpus_attr_copy_len': 2.0,
        'corpus_attr_coverage': 0.8,
        'corpus_attr_density': 2.0,
        'corpus_attr_hypothesis_len': 5.0,
        'corpus_attr_novelty': 0.5,
        'corpus_attr_repetition': 0.0,
        'corpus_attr_source_len': 6.0,
        'corpus_chrf': 38.56890099861521
    }
}
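
As a rough sketch of how the returned dictionary might be consumed, the snippet below pulls one metric out of `sample_level` and separates metric scores from the `attr_*` fields. It runs on a hand-copied subset of the sample output above rather than on a live response, and `metric_column` and `split_metrics_and_attributes` are hypothetical helper names, not part of eaas.

```python
# A hand-copied subset of the sample output shown above (not a live response).
score_dic = {
    "sample_level": [
        {
            "bleu": 32.46679154750991,
            "attr_hypothesis_len": 5,
            "attr_source_len": 6,
            "chrf": 38.56890099861521,
        }
    ],
    "corpus_level": {
        "corpus_bleu": 32.46679154750991,
        "corpus_chrf": 38.56890099861521,
    },
}


def metric_column(score_dic, metric):
    """Collect one metric's per-sample values, in input order (hypothetical helper)."""
    return [sample[metric] for sample in score_dic["sample_level"]]


def split_metrics_and_attributes(sample):
    """Separate metric scores from the `attr_*` fields of one sample (hypothetical helper)."""
    metrics = {k: v for k, v in sample.items() if not k.startswith("attr_")}
    attributes = {k: v for k, v in sample.items() if k.startswith("attr_")}
    return metrics, attributes


bleu_scores = metric_column(score_dic, "bleu")
metrics, attributes = split_metrics_and_attributes(score_dic["sample_level"][0])

print(bleu_scores)                                # per-sample BLEU values
print(sorted(metrics))                            # metric names for the first sample
print(score_dic["corpus_level"]["corpus_bleu"])   # corpus-level keys carry a `corpus_` prefix
```

The split relies only on the `attr_` prefix visible in the sample output; if the service returns other non-metric keys, the filter would need adjusting.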

Support for common metrics

We support quick calculation of BLEU and ROUGE (1, 2, L). See the following example for usage.

from eaas import Config, Client
config = Config()
client = Client()
client.load_config(config) 

references = [["This is the reference one for sample one.", "This is the reference two for sample one."],
              ["This is the reference one for sample two.", "This is the reference two for sample two."]]
hypothesis = ["This is the generated hypothesis for sample one.", 
              "This is the generated hypothesis for sample two."]

# Calculate BLEU
client.bleu(references, hypothesis, lang="en")

# Calculate ROUGEs
client.rouge1(references, hypothesis, lang="en")
client.rouge2(references, hypothesis, lang="en")
client.rougeL(references, hypothesis, lang="en")
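
The quick-calculation methods above take parallel lists, while `client.score` takes a list of dictionaries. The sketch below shows one way to convert between the two shapes, with a few basic checks. `to_score_inputs` is a hypothetical helper written for illustration, not part of eaas, and the checks only reflect the input format described in Step 2.

```python
def to_score_inputs(references, hypotheses, sources=None):
    """Zip parallel lists into the list-of-dicts format described in Step 2.

    Hypothetical helper, not part of eaas.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if sources is not None and len(sources) != len(hypotheses):
        raise ValueError("sources and hypotheses must have the same length")

    inputs = []
    for i, (refs, hyp) in enumerate(zip(references, hypotheses)):
        if not isinstance(hyp, str):
            raise TypeError(f"hypothesis {i} must be a string")
        if isinstance(refs, str):
            # A bare string would silently be treated as a sequence of characters.
            raise TypeError(f"references {i} must be a list of strings")
        item = {"references": list(refs), "hypothesis": hyp}
        if sources is not None:
            item["source"] = sources[i]
        inputs.append(item)
    return inputs


references = [["This is the reference one for sample one.", "This is the reference two for sample one."],
              ["This is the reference one for sample two.", "This is the reference two for sample two."]]
hypothesis = ["This is the generated hypothesis for sample one.",
              "This is the generated hypothesis for sample two."]

inputs = to_score_inputs(references, hypothesis)
print(len(inputs))
print(inputs[0]["hypothesis"])
```

The resulting `inputs` list has the same shape as the one passed to `client.score` in Step 2; `source` is left out unless a `sources` list is supplied.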
