Installation
To install the package, simply run the following command in your favorite terminal:
$ pip install scandeval[all]
This will install all the model frameworks currently supported (pytorch, tensorflow, jax and spacy). If you know you only need one of these, you can install a slimmer package like so:
$ pip install scandeval[pytorch]
Lastly, if you are not interested in benchmarking models, but just want to use the package to download datasets, then the following command will do the trick:
$ pip install scandeval
Quickstart
Benchmarking from the Command Line
The easiest way to benchmark models is via the command line interface. After having installed the package, you can benchmark your favorite model like so:
$ scandeval --model_id <model_id>
Here model_id is the HuggingFace model ID, which can be found on the HuggingFace Hub. By default this will benchmark the model on all eligible datasets. If you want to benchmark on a specific dataset, this can be done via the --dataset flag. This will, for instance, evaluate the model on the AngryTweets dataset:
$ scandeval --model_id <model_id> --dataset angry-tweets
We can also filter by language. To benchmark all Danish models, say, this can be done using the --language flag, like so:
$ scandeval --language da
Multiple models, datasets and/or languages can be specified simply by repeating the relevant flags. Here is an example with two models:
$ scandeval --model_id <model_id1> --model_id <model_id2> --dataset angry-tweets
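The same goes for languages; assuming the --language flag can likewise be repeated (the Swedish language code below is purely illustrative), benchmarking both Danish and Swedish models could look like this:
$ scandeval --language da --language sv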
See all the arguments and options available for the scandeval command by typing
$ scandeval --help
Benchmarking from a Script
In a script, the syntax is similar to the command line interface. You simply initialise an object of the Benchmark class and call this benchmark object with your favorite models and/or datasets:
>>> from scandeval import Benchmark
>>> benchmark = Benchmark()
>>> benchmark('<model_id>')
To benchmark on a specific dataset, you simply specify the second argument, shown here with the AngryTweets dataset again:
>>> benchmark('<model_id>', 'angry-tweets')
This would benchmark all Danish models:
>>> benchmark(language='da')
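Since the benchmark object is just a callable, looping over several models is plain Python. A minimal sketch, with placeholder model IDs:
>>> for model_id in ['<model_id1>', '<model_id2>']:
...     benchmark(model_id, 'angry-tweets')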
See the documentation for a more in-depth description.
Downloading Datasets
If you are just interested in downloading a dataset rather than benchmarking, this can be done as follows:
>>> from scandeval.datasets import load_angry_tweets
>>> X_train, X_test, y_train, y_test = load_angry_tweets()
Here X_train and X_test will be lists containing the relevant texts, and y_train and y_test will be lists containing the associated labels.
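As the splits are plain Python lists, they can be dropped into any standard text-classification pipeline. Below is a minimal sketch of a bag-of-words baseline using scikit-learn (not part of ScandEval, purely for illustration):
>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.metrics import accuracy_score
>>> # Fit a TF-IDF + logistic regression baseline on the training split
>>> baseline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
>>> baseline.fit(X_train, y_train)
>>> # Evaluate the baseline on the held-out test split
>>> accuracy_score(y_test, baseline.predict(X_test))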
See the documentation for a list of all the datasets that can be loaded.
Documentation
The full documentation can be found on ReadTheDocs.
Citing ScandEval
If you want to cite the framework then feel free to use this:
@article{nielsen2021scandeval,
  title={ScandEval: Evaluation of language models on mono- or multilingual Scandinavian language tasks.},
  author={Nielsen, Dan Saattrup},
  journal={GitHub. Note: https://github.com/saattrupdan/ScandEval},
  year={2021}
}
Remarks
The image used in the logo has been created by the amazing Scandinavia and the World team. Go check them out!