LARES: vaLidation, evAluation and REliability Solutions
LARES (vaLidation, evAluation and REliability Solutions) is a Python package designed to assist with the evaluation and validation of models on tasks such as translation, summarization, and rephrasing.
This package leverages a suite of existing tools and resources to evaluate and validate model output for the prompted task. The Natural Language Toolkit (NLTK), BERT, and ROUGE are employed for evaluation, while Microsoft's Fairlearn, Facebook's BART, and RoBERTa are used to assess and address the toxicity and fairness of a given model.
In addition, LARES uses datasets from HuggingFace, where the choice of datasets was informed by benchmark setters such as the General Language Understanding Evaluation (GLUE) benchmark.
Features
- Quantitative and Qualitative Evaluation: Provides both qualitative and quantitative approaches to evaluating models. Quantitative metrics include METEOR scores for translation, normalized ROUGE scores for summarization, and BERT scores for rephrasing tasks. Qualitative metrics are computed both from binary user judgments and from sentiment analysis performed on user feedback.
- Fairness and Toxicity Validation: Provides a quantitative measure of the toxicity and fairness of a given model for specific tasks by leveraging Fairlearn and RoBERTa.
- Iterative Reconstruction: Uses BART to iteratively rephrase model responses until they fall below a specified toxicity threshold and above a specified quality threshold.
Workflow
Installation
Requires Python 3.6 or later. Install via pip:
pip install lares
Usage
Here is a basic usage example for a translation task:
import openai
from datasets import load_dataset
from lares import *
openai.api_key = '' # replace with your OpenAI API key
dataset = load_dataset("opus100", "en-fr")
for data in dataset["validation"]["translation"][100:110]:
    prompt = data["en"]
    reference = data["fr"]
    input_prompt = "Translate the following to French: " + prompt
    print(input_prompt)
    result = generate(input_prompt, reference, task_type='Translation')
    print(f"Prompt: {prompt}")
    print(f"Reference: {reference}")
    print(f"Generated Response: {result}\n")
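Since the loop above prints each result individually, you may also want a corpus-level summary over the evaluated slice. Below is a minimal, stdlib-only sketch; `score_fn` is a hypothetical per-example metric (for instance, a wrapper around METEOR for translation or normalized ROUGE for summarization), not part of LARES itself.

```python
from statistics import mean

def corpus_summary(examples, score_fn):
    """Score each (reference, hypothesis) pair and return the mean score.

    `score_fn` is a hypothetical stand-in for any per-example quantitative
    metric, e.g. METEOR for translation or normalized ROUGE for summarization.
    """
    scores = [score_fn(ref, hyp) for ref, hyp in examples]
    return mean(scores)
```

For example, with an exact-match metric, `corpus_summary([("bonjour", "bonjour"), ("bonjour", "salut")], exact_match)` averages the two per-pair scores.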
Dependencies
- openai==0.27.8
- nltk==3.7
- torch==2.0.1
- transformers==4.31.0
- rouge==1.0.1
- bert_score==0.3.12
- datasets==1.11.0
To pin the exact versions, you can install via:
pip install openai==0.27.8 nltk==3.7 torch==2.0.1 transformers==4.31.0 rouge==1.0.1 bert_score==0.3.12 datasets==1.11.0