Explainable Leaderboards for Natural Language Processing
ExplainaBoard: An Explainable Leaderboard for NLP
Introduction
ExplainaBoard is an interpretable, interactive, and reliable leaderboard with seven (so far) new features (F) compared with generic leaderboards.
- F1: Single-system Analysis: What is a system good or bad at?
- F2: Pairwise Analysis: Where is one system better (worse) than another?
- F3: Data Bias Analysis: What are the characteristics of different evaluated datasets?
- F5: Common errors: What mistakes do the top-5 systems commonly make?
- F6: Fine-grained errors: Where will errors occur?
- F7: System Combination: Is there potential complementarity between different systems?
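To make F1 and F2 concrete, here is a minimal sketch of bucketed analysis in plain Python. It does not use the ExplainaBoard API. The function names `bucket_accuracy` and `pairwise_gap`, the sentence-length attribute, the sample format, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def bucket_accuracy(samples, attribute, boundaries):
    """Compute accuracy per bucket of an attribute (illustrative, not the ExplainaBoard API).

    samples: list of dicts with keys "text", "gold", "pred" (hypothetical format).
    attribute: function mapping a sample to a number, e.g. sentence length.
    boundaries: sorted upper bounds; a value v goes to the first bound b with v <= b.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample in samples:
        value = attribute(sample)
        # Values above every bound fall into an overflow bucket keyed by infinity.
        bucket = next((b for b in boundaries if value <= b), float("inf"))
        total[bucket] += 1
        correct[bucket] += int(sample["gold"] == sample["pred"])
    return {b: correct[b] / total[b] for b in total}


def pairwise_gap(acc_a, acc_b):
    """Per-bucket accuracy difference (system A minus system B) on shared buckets."""
    return {b: acc_a[b] - acc_b[b] for b in acc_a if b in acc_b}


def sentence_length(sample):
    """Attribute: number of whitespace-separated tokens in the input text."""
    return len(sample["text"].split())


# Toy sentiment data with predictions from two hypothetical systems.
texts = [
    "good movie",
    "bad",
    "a long and winding but ultimately rewarding film",
    "not good at all really",
]
gold = ["pos", "neg", "pos", "neg"]
pred_a = ["pos", "neg", "neg", "neg"]
pred_b = ["pos", "pos", "pos", "neg"]

system_a = [{"text": t, "gold": g, "pred": p} for t, g, p in zip(texts, gold, pred_a)]
system_b = [{"text": t, "gold": g, "pred": p} for t, g, p in zip(texts, gold, pred_b)]

# F1: single-system analysis, accuracy on short (<= 3 tokens) vs. longer inputs.
acc_a = bucket_accuracy(system_a, sentence_length, boundaries=[3, 10])
acc_b = bucket_accuracy(system_b, sentence_length, boundaries=[3, 10])

# F2: pairwise analysis, where system A is better (positive) or worse (negative).
gap = pairwise_gap(acc_a, acc_b)
```

In this toy setup the two systems have the same overall accuracy, but the per-bucket view shows that one is stronger on short inputs and the other on longer ones, which is the kind of difference a single aggregate score hides.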
Usage
We provide a Web-based interactive toolkit and also release an API that lets users flexibly evaluate their systems offline. This means you can use ExplainaBoard at the following levels:
- U1: Just playing with it: You can browse around, track NLP progress, and understand the relative merits of different top-performing systems.
- U2: We help you analyze your model: You submit your model outputs, and they are deployed to the online ExplainaBoard.
- U3: Do it yourself: You can process your model outputs yourself using our API.
API-based Toolkit: Quick Installation
Method 1: Simple installation from PyPI (Python 3 only)
pip install explainaboard
Method 2: Install from the source and develop locally (Python 3 only)
# Clone current repo
git clone https://github.com/neulab/ExplainaBoard.git
cd ExplainaBoard
# Requirements
pip install -r requirements.txt
# Install the package
python setup.py install
Then you can run the following examples from bash.
Example for CLI
- text-classification:
explainaboard --task text-classification --system_outputs ./data/system_outputs/sst2/sst2-lstm.tsv
- named-entity-recognition:
explainaboard --task named-entity-recognition --system_outputs ./data/system_outputs/conll2003/conll2003.elmo
- extractive-qa:
explainaboard --task extractive-qa --system_outputs ./data/system_outputs/squad/testset-en.json
- summarization:
explainaboard --task summarization --system_outputs ./data/system_outputs/cnndm/cnndm_mini.bart
- text-pair-classification:
explainaboard --task text-pair-classification --system_outputs ./data/system_outputs/snli/snli.bert
- hellaswag:
explainaboard --task hellaswag --system_outputs ./data/system_outputs/hellaswag/hellaswag.random
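The commands above all share the same shape, so they can be scripted. The following is a minimal sketch that assembles them from Python and, optionally, runs them with `subprocess`. The `--task` and `--system_outputs` flags and the file paths are taken from the examples above; the names `EXAMPLES`, `build_command`, `run_all`, and the `dry_run` switch are hypothetical helpers, not part of ExplainaBoard. The relative paths assume you run the script from the root of the cloned repository.

```python
import shlex
import subprocess

# Task name -> system output file, copied from the CLI examples above.
EXAMPLES = {
    "text-classification": "./data/system_outputs/sst2/sst2-lstm.tsv",
    "named-entity-recognition": "./data/system_outputs/conll2003/conll2003.elmo",
    "extractive-qa": "./data/system_outputs/squad/testset-en.json",
    "summarization": "./data/system_outputs/cnndm/cnndm_mini.bart",
    "text-pair-classification": "./data/system_outputs/snli/snli.bert",
    "hellaswag": "./data/system_outputs/hellaswag/hellaswag.random",
}


def build_command(task, system_outputs):
    """Return the argument list for one explainaboard CLI call."""
    return ["explainaboard", "--task", task, "--system_outputs", system_outputs]


def run_all(examples, dry_run=True):
    """Build one command per task; print them (dry run) or execute them.

    With dry_run=False, the explainaboard executable must be on PATH and
    a failing command raises subprocess.CalledProcessError.
    """
    commands = [build_command(task, path) for task, path in examples.items()]
    for command in commands:
        if dry_run:
            print(shlex.join(command))
        else:
            subprocess.run(command, check=True)
    return commands
```

Calling `run_all(EXAMPLES)` only prints the commands, which is a cheap way to check the paths before calling `run_all(EXAMPLES, dry_run=False)` to actually run the analyses.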
Example for Python SDK
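from explainaboard import TaskType, get_loader, get_processor
# Path to a sample summarization system output shipped with the repository
path_data = "./explainaboard/tests/artifacts/test-summ.tsv"
# Load the system output for the summarization task
loader = get_loader(TaskType.summarization, data=path_data)
data = loader.load()
# Run the analysis on the loaded data
processor = get_processor(TaskType.summarization, data=data)
analysis = processor.process()
# Write the analysis report to the current directory
analysis.write_to_directory("./")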
Web-based Toolkit: Quick Learning
We deploy ExplainaBoard as a Web toolkit, which includes 9 NLP tasks, 40 datasets and 300 systems. Detailed information is as follows.
So far, ExplainaBoard covers the following tasks:
Task | Sub-task | Datasets | Models | Attributes |
---|---|---|---|---|
Text Classification | Sentiment | 8 | 40 | 2 |
Text Classification | Topics | 4 | 18 | 2 |
Text Classification | Intention | 1 | 3 | 2 |
Text-Span Classification | Aspect Sentiment | 4 | 20 | 4 |
Text Pair Classification | NLI | 2 | 6 | 7 |
Sequence Labeling | NER | 3 | 74 | 9 |
Sequence Labeling | POS | 3 | 14 | 4 |
Sequence Labeling | Chunking | 3 | 14 | 9 |
Sequence Labeling | CWS | 7 | 64 | 7 |
Structure Prediction | Semantic Parsing | 4 | 12 | 4 |
Text Generation | Summarization | 2 | 36 | 7 |
Submit Your Results
You can submit your system's output via this form, following the format description.
Download System Outputs
We haven't released datasets or the corresponding system outputs that require licenses. If you have the licenses, please fill in this form and we will send them to you privately. (A description of the output format can be found here.) If these system outputs are useful to you, please cite our work.
Currently Covered Systems
So far, ExplainaBoard supports more than 10 NLP tasks, including sequence classification, labeling, extraction and generation. Click here to see more.
Acknowledgement
We thank all authors who shared their system outputs with us: Ikuya Yamada, Stefan Schweter, Colin Raffel, Yang Liu, Li Dong. We also thank Vijay Viswanathan, Yiran Chen, and Hiroaki Hayashi for useful discussion and feedback about ExplainaBoard.