Robustness Gym

Robustness Gym is a Python evaluation toolkit for machine learning models.

Getting started

pip install robustnessgym

Note: some parts of Robustness Gym rely on optional dependencies. If you know which ones you need, install them as extras, for example `pip install robustnessgym[dev,text]`. See setup.py for the full list of optional dependencies.

What is Robustness Gym?

Robustness Gym is being developed to address the challenges of evaluating machine learning models today. It provides tools to evaluate and visualize model quality.

Along with Meerkat, we make it easy for you to load in any kind of data (text, images, videos, time-series) and quickly evaluate how well your models are performing.

Load data into a Meerkat `DataPanel`

from robustnessgym import DataPanel

# Any Huggingface dataset
dp = DataPanel.load_huggingface('boolq')

# Custom datasets
dp = DataPanel.from_csv(...)
dp = DataPanel.from_pandas(...)
dp = DataPanel.from_jsonl(...)
dp = DataPanel.from_feather(...)

# Coming soon: any WILDS dataset
# from meerkat.contrib.wilds import get_wilds_datapanel
# dp = get_wilds_datapanel("fmow", root_dir="/datasets/", split="test")

Run common workflows

Spacy

from robustnessgym import DataPanel, lookup
from robustnessgym.ops import SpacyOp

dp = DataPanel.load_huggingface('boolq')

# Run the Spacy pipeline on the 'question' column of the dataset
spacy = SpacyOp()
dp = spacy(dp=dp, columns=['question'])
# This adds a new column that is auto-named
# "SpacyOp(lang=en_core_web_sm, neuralcoref=False, columns=['question'])"

# Grab the Spacy column from the DataPanel using the lookup
spacy_column = lookup(dp, spacy, ['question'])

Stanza

from robustnessgym import DataPanel, lookup
from robustnessgym.ops import StanzaOp

dp = DataPanel.load_huggingface('boolq')

# Run the Stanza pipeline on the 'question' column of the dataset
stanza = StanzaOp()
dp = stanza(dp=dp, columns=['question'])
# adds a new column that is auto-named "StanzaOp(columns=['question'])"

# Grab the Stanza column from the DataPanel using the lookup
stanza_column = lookup(dp, stanza, ['question'])

Custom Operation (Single Output)

# Or, create your own Operation
from robustnessgym import DataPanel, Operation, Id, lookup

dp = DataPanel.load_huggingface('boolq')

# A function that capitalizes text
def capitalize(batch: DataPanel, columns: list):
    return [text.capitalize() for text in batch[columns[0]]]

# Wrap in an Operation: `process_batch_fn` accepts functions that take
# exactly 2 arguments, batch and columns, and return a list of outputs
# (or a tuple of lists when there are multiple outputs)
op = Operation(
    identifier=Id('CapitalizeOp'),
    process_batch_fn=capitalize,
)

# Apply to a DataPanel
dp = op(dp=dp, columns=['question'])

# Look it up when you need it
capitalized_text = lookup(dp, op, ['question'])
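
Because `capitalize` only indexes the batch by column name, it can be sanity-checked without downloading a dataset. The sketch below uses a plain dict of lists as a stand-in for a `DataPanel` batch. That stand-in (`fake_batch`) is an assumption made for illustration and is not part of the Robustness Gym API.

```python
def capitalize(batch, columns: list):
    # Same body as the batch function above
    return [text.capitalize() for text in batch[columns[0]]]

# Hypothetical stand-in for a batch: column name -> list of values
fake_batch = {"question": ["is the Sky blue", "does water boil"]}

result = capitalize(fake_batch, ["question"])
print(result)
```

Note that Python's `str.capitalize` upper-cases the first character and lower-cases the rest, so "Sky" becomes "sky" in the output.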

Custom Operation (Multiple Outputs)

from robustnessgym import DataPanel, Operation, Id, lookup

dp = DataPanel.load_huggingface('boolq')

# A function that capitalizes and upper-cases text: this will
# be used to add two columns to the DataPanel
def capitalize_and_upper(batch: DataPanel, columns: list):
    return [text.capitalize() for text in batch[columns[0]]], \
           [text.upper() for text in batch[columns[0]]]

# Wrap in an Operation: `process_batch_fn` accepts functions that take
# exactly 2 arguments, batch and columns, and return a tuple of outputs
op = Operation(
    identifier=Id('ProcessingOp'),
    output_names=['capitalize', 'upper'],  # tell the Operation the name of the two outputs
    process_batch_fn=capitalize_and_upper,
)

# Apply to a DataPanel
dp = op(dp=dp, columns=['question'])

# Look them up when you need them
capitalized_text = lookup(dp, op, ['question'], 'capitalize')
upper_text = lookup(dp, op, ['question'], 'upper')
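
The names in `output_names` correspond positionally to the elements of the tuple returned by the batch function. A minimal plain-Python sketch of that correspondence follows. It does not show how Robustness Gym stores the outputs internally; `fake_batch` and `named` are hypothetical names used only for illustration.

```python
def capitalize_and_upper(batch, columns: list):
    texts = batch[columns[0]]
    # Two outputs, returned as a tuple of equal-length lists
    return [t.capitalize() for t in texts], [t.upper() for t in texts]

output_names = ["capitalize", "upper"]

# Hypothetical stand-in for a batch: column name -> list of values
fake_batch = {"question": ["is the sky blue"]}

outputs = capitalize_and_upper(fake_batch, ["question"])

# Pair each name with the output at the same position
named = dict(zip(output_names, outputs))
print(named["capitalize"], named["upper"])
```

If the number of names and the number of returned lists differ, `zip` silently truncates, so it is worth checking `len(outputs) == len(output_names)` when writing a multi-output function.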

Create Evaluations

Out-of-the-box Subpopulations

from robustnessgym import DataPanel
from robustnessgym import LexicalOverlapSubpopulation

dp = DataPanel.load_huggingface('boolq')

# Create a subpopulation that buckets examples based on lexical overlap
lexo_sp = LexicalOverlapSubpopulation(intervals=[(0., 0.1), (0.1, 0.2)])

slices, membership = lexo_sp(dp=dp, columns=['question'])
# `slices` is a list of 2 DataPanel objects
# `membership` is a matrix of shape (n x 2)
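
To give a feel for the kind of score such intervals apply to, here is a rough sketch of one possible lexical overlap measure: Jaccard overlap between the whitespace-token sets of two texts. This definition is an assumption chosen for illustration. The score that `LexicalOverlapSubpopulation` actually computes is not specified here and may differ, and `token_overlap` is a hypothetical helper.

```python
def token_overlap(text_a: str, text_b: str) -> float:
    """Jaccard overlap between lowercase whitespace-token sets (assumed definition)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty texts: define the overlap as 0 to avoid dividing by zero
        return 0.0
    return len(tokens_a & tokens_b) / len(union)

pairs = [
    ("is the sky blue", "the sky is blue"),           # same tokens, different order
    ("is the sky blue", "the sky"),                   # partial overlap
    ("is the sky blue", "water boils at sea level"),  # no shared tokens
]
scores = [token_overlap(a, b) for a, b in pairs]
print(scores)
```

Under this definition, scores lie in [0, 1], so intervals such as `(0., 0.1)` and `(0.1, 0.2)` would select pairs with very little shared vocabulary.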

Custom Subpopulation

from robustnessgym import DataPanel, ScoreSubpopulation, lookup
from robustnessgym.ops import SpacyOp

dp = DataPanel.load_huggingface('boolq')

def length(batch: DataPanel, columns: list):
    try:
        # Take advantage of previously stored Spacy information
        return [len(doc) for doc in lookup(batch, SpacyOp, columns)]
    except AttributeError:
        # If unavailable, fall back to splitting text
        return [len(text.split()) for text in batch[columns[0]]]

# Create a subpopulation that buckets examples based on length
length_sp = ScoreSubpopulation(intervals=[(0, 10), (10, 20)], score_fn=length)

slices, membership = length_sp(dp=dp, columns=['question'])
# `slices` is a list of 2 DataPanel objects
# `membership` is a matrix of shape (n x 2)
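
The relationship between scores, intervals, `slices`, and the `(n x 2)` membership matrix can be sketched in plain Python. This is not the library's implementation: `membership_matrix` and `fake_batch` are hypothetical, plain lists stand in for `DataPanel` objects, and treating both interval endpoints as inclusive is an assumption.

```python
from typing import List, Sequence, Tuple

def length(batch: dict, columns: list) -> List[int]:
    # Fallback scoring from the example above: whitespace token count
    return [len(text.split()) for text in batch[columns[0]]]

def membership_matrix(
    scores: Sequence[float], intervals: Sequence[Tuple[float, float]]
) -> List[List[int]]:
    # Entry [i][j] is 1 when score i falls inside interval j.
    # Assumption: both endpoints are inclusive.
    return [[int(low <= s <= high) for low, high in intervals] for s in scores]

# Hypothetical stand-in for a batch with 4-, 12- and 25-token questions
fake_batch = {
    "question": [
        "is the sky blue",
        " ".join(["word"] * 12),
        " ".join(["word"] * 25),
    ]
}
intervals = [(0, 10), (10, 20)]

scores = length(fake_batch, ["question"])
membership = membership_matrix(scores, intervals)

# One slice per interval: the examples whose membership entry is 1
slices = [
    [text for text, row in zip(fake_batch["question"], membership) if row[j]]
    for j in range(len(intervals))
]
print(scores, membership)
```

The 25-token question falls outside both intervals, so its row is all zeros and it appears in neither slice. Under the inclusive-endpoint assumption, a score of exactly 10 would belong to both slices.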

About

You can read more about the ideas underlying Robustness Gym in our paper on arXiv.

The Robustness Gym project began as a collaboration between Stanford Hazy Research, Salesforce Research, and UNC Chapel Hill. We also have a website.

If you use Robustness Gym in your work, please cite it using the following BibTeX entry:

@inproceedings{goel-etal-2021-robustness,
    title = "Robustness Gym: Unifying the {NLP} Evaluation Landscape",
    author = "Goel, Karan  and
      Rajani, Nazneen Fatema  and
      Vig, Jesse  and
      Taschdjian, Zachary  and
      Bansal, Mohit  and
      R{\'e}, Christopher",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-demos.6",
    pages = "42--55",
}
