Evaluation of language models on mono- or multilingual Scandinavian language tasks.


Installation

To install the package, run the following command in your terminal:

$ pip install scandeval[all]

This will install all the model frameworks currently supported (PyTorch, TensorFlow, JAX and spaCy). If you know you only need one of these, you can install a slimmer package like so:

$ pip install scandeval[pytorch]

Lastly, if you are not interested in benchmarking models, but just want to use the package to download datasets, then the following command will do the trick:

$ pip install scandeval
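If you are unsure which of the frameworks mentioned above are already present in your environment, a quick check like the following can help you pick the right install option. This is only a sketch using the standard library; the helper name installed_frameworks is made up for illustration, and the module names are the usual import names of the four frameworks (PyTorch is imported as torch).

```python
from importlib.util import find_spec

# Usual import names of the frameworks listed above (assumption: PyTorch is
# imported as "torch"; the other three are imported under their own names).
FRAMEWORK_MODULES = {
    "pytorch": "torch",
    "tensorflow": "tensorflow",
    "jax": "jax",
    "spacy": "spacy",
}


def installed_frameworks():
    """Return a dict mapping each framework name to True if it can be imported."""
    return {
        name: find_spec(module) is not None
        for name, module in FRAMEWORK_MODULES.items()
    }


if __name__ == "__main__":
    for name, available in installed_frameworks().items():
        print(f"{name}: {'installed' if available else 'missing'}")
```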

Quickstart

Benchmarking from the Command Line

The easiest way to benchmark models is via the command line interface. After having installed the package, you can benchmark your favorite model like so:

$ scandeval --model_id <model_id>

Here model_id is the HuggingFace model ID, which can be found on the HuggingFace Hub. By default this will benchmark the model on all eligible datasets. If you want to benchmark on a specific dataset, use the --dataset flag. For instance, this will evaluate the model on the AngryTweets dataset:

$ scandeval --model_id <model_id> --dataset angry-tweets

We can also select by language. To benchmark all Danish models, for instance, use the --language flag, like so:

$ scandeval --language da

Multiple models, datasets and/or languages can be specified by repeating the corresponding flag. Here is an example with two models:

$ scandeval --model_id <model_id1> --model_id <model_id2> --dataset angry-tweets

To see all the arguments and options available for the scandeval command, type:

$ scandeval --help
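When the list of models or datasets gets long, it can be convenient to assemble the command in Python rather than typing it by hand. The sketch below only uses the three flags shown above (--model_id, --dataset and --language) and repeats each flag once per value; the helper build_scandeval_command is hypothetical and not part of the package.

```python
import shlex


def build_scandeval_command(model_ids=(), datasets=(), languages=()):
    """Build the argument list for a scandeval call, repeating each flag per value."""
    command = ["scandeval"]
    for model_id in model_ids:
        command += ["--model_id", model_id]
    for dataset in datasets:
        command += ["--dataset", dataset]
    for language in languages:
        command += ["--language", language]
    return command


# The model IDs are placeholders, exactly as in the examples above.
command = build_scandeval_command(
    model_ids=["<model_id1>", "<model_id2>"],
    datasets=["angry-tweets"],
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(command))
```

To actually run the benchmark, the resulting list can be passed to subprocess.run(command, check=True) once the placeholders have been replaced with real model IDs.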

Benchmarking from a Script

In a script, the syntax is similar to the command line interface. You simply initialise an object of the Benchmark class, and call this benchmark object with your favorite models and/or datasets:

>>> from scandeval import Benchmark
>>> benchmark = Benchmark()
>>> benchmark('<model_id>')

To benchmark on a specific dataset, pass it as the second argument, shown here with the AngryTweets dataset again:

>>> benchmark('<model_id>', 'angry-tweets')

This would benchmark all Danish models:

>>> benchmark(language='da')
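To benchmark several models from a script, one option is to loop over the model IDs and call the benchmark object once per model, using only the call forms shown above. The following is a sketch under assumptions: benchmark_many is a hypothetical helper, and since this page does not say what a benchmark call returns, the helper simply stores whatever comes back. A stand-in function is used so the example runs without the package installed; in practice you would pass Benchmark() instead.

```python
def benchmark_many(benchmark, model_ids, dataset=None):
    """Call `benchmark` once per model ID and collect whatever it returns."""
    results = {}
    for model_id in model_ids:
        if dataset is None:
            # Same form as benchmark('<model_id>')
            results[model_id] = benchmark(model_id)
        else:
            # Same form as benchmark('<model_id>', 'angry-tweets')
            results[model_id] = benchmark(model_id, dataset)
    return results


# Stand-in for illustration only; replace with `Benchmark()` from scandeval.
def fake_benchmark(model_id, dataset="all"):
    return f"benchmarked {model_id} on {dataset}"


results = benchmark_many(
    fake_benchmark, ["<model_id1>", "<model_id2>"], "angry-tweets"
)
for model_id, result in results.items():
    print(model_id, "->", result)
```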

See the documentation for a more in-depth description.

Downloading Datasets

If you are only interested in downloading a dataset rather than benchmarking, you can do so as follows:

>>> from scandeval.datasets import load_angry_tweets
>>> X_train, X_test, y_train, y_test = load_angry_tweets()

Here X_train and X_test will be lists containing the relevant texts, and y_train and y_test will be lists containing the associated labels.
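Since the labels come back as plain lists, they can be inspected with standard Python tools. As a small sketch, the following computes the relative frequency of each label. The label values and the tiny y_train list are made-up stand-ins so the example runs on its own; with the package installed you would use the y_train returned by load_angry_tweets() instead.

```python
from collections import Counter


def label_distribution(labels):
    """Return the relative frequency of each label in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical stand-in data; the real label names may differ.
y_train = ["positive", "negative", "negative", "neutral"]
print(label_distribution(y_train))
```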

See the documentation for a list of all the datasets that can be loaded.

Documentation

The full documentation can be found on ReadTheDocs.

Citing ScandEval

If you want to cite the framework then feel free to use this:

@article{nielsen2021scandeval,
  title={ScandEval: Evaluation of language models on mono- or multilingual Scandinavian language tasks.},
  author={Nielsen, Dan Saattrup},
  journal={GitHub. Note: https://github.com/saattrupdan/ScandEval},
  year={2021}
}

Remarks

The image used in the logo has been created by the amazing Scandinavia and the World team. Go check them out!
