Rank transformer models for NLP tasks using transferability measures

Project description

TransformerRanker

A lightweight library to efficiently rank transformer language models for classification tasks.

Many pre-trained language models are available, and fine-tuning each one to find the best performer on your classification dataset is expensive in both time and compute. TransformerRanker supports this model selection step: choose any dataset from the HuggingFace datasets collection, pick candidate models from the model hub, and let the tool rank them using transferability estimation metrics.
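To make the idea of transferability estimation concrete, here is a minimal sketch that scores frozen embeddings without any fine-tuning. It is an illustration only: the metrics the library actually uses are not described in this section, so leave-one-out nearest-neighbour accuracy is swapped in as a stand-in. The function name `knn_transferability` and the toy 2-d "embeddings" are hypothetical.

```python
import math


def knn_transferability(embeddings, labels, k=1):
    """Leave-one-out k-NN accuracy on frozen embeddings (toy proxy score)."""
    correct = 0
    for i, (query, gold) in enumerate(zip(embeddings, labels)):
        # Distances from the held-out point to every other point
        neighbours = sorted(
            (math.dist(query, other), labels[j])
            for j, other in enumerate(embeddings)
            if j != i
        )[:k]
        votes = [label for _, label in neighbours]
        predicted = max(set(votes), key=votes.count)
        if predicted == gold:
            correct += 1
    return correct / len(labels)


# Hypothetical 2-d "embeddings" produced by two imaginary frozen models
labels = [0, 0, 1, 1]
model_a = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]  # classes well separated
model_b = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.1, 1.0)]  # classes interleaved

scores = {
    "model_a": knn_transferability(model_a, labels),
    "model_b": knn_transferability(model_b, labels),
}

# Rank the candidates by score, highest first
ranking = sorted(scores, key=scores.get, reverse=True)
```

A model whose embeddings already separate the classes gets a higher score, so it is ranked first without being fine-tuned.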

Installation

You can install the tool using pip:

pip install transformer-ranker

Three-step interface

Step 1. Load your dataset

Choose any dataset from the datasets library:

from datasets import load_dataset

# Load your dataset with the HuggingFace loader
dataset = load_dataset('conll2003')

To load your own data instead, see the HuggingFace datasets documentation on loading custom datasets.
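As a rough sketch of the custom-data route, the example below writes a tiny CSV file and loads it with the `csv` builder of `load_dataset`. The column names `text` and `label`, the file name, and both helper functions are assumptions made for illustration; whether TransformerRanker expects particular column names or splits is not stated here.

```python
import csv
import os
import tempfile


def write_toy_csv(path):
    """Write a tiny text-classification dataset with 'text' and 'label' columns."""
    rows = [
        {"text": "great movie", "label": "positive"},
        {"text": "terrible plot", "label": "negative"},
    ]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["text", "label"])
        writer.writeheader()
        writer.writerows(rows)
    return rows


def load_custom_dataset(path):
    """Load the CSV as a DatasetDict with a single 'train' split."""
    # Imported here so the CSV helper above works without the datasets package
    from datasets import load_dataset

    return load_dataset("csv", data_files={"train": path})


csv_path = os.path.join(tempfile.mkdtemp(), "my_dataset.csv")
written_rows = write_toy_csv(csv_path)

# Uncomment once the datasets package is installed:
# dataset = load_custom_dataset(csv_path)
```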

Step 2. Prepare a list of language models

Choose any model names from the model hub:

# Prepare a list of model handles
language_models = [
    "sentence-transformers/all-mpnet-base-v2",
    "xlm-roberta-large",
    "google/electra-large-discriminator",
    "microsoft/deberta-v3-large",
    "nghuyong/ernie-2.0-large-en",
    # ...
]

...or use our recommended list of models to try out:

from transformer_ranker import prepare_popular_models

language_models = prepare_popular_models('base')

Step 3. Rank Models

Initialize the ranker with your dataset and run it on your models:

from transformer_ranker import TransformerRanker

# Initialize the ranker with your dataset
ranker = TransformerRanker(dataset, dataset_downsample=0.2)

# Run it with selected transformer models
results = ranker.run(language_models, batch_size=64)
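The `dataset_downsample=0.2` argument is not explained further in this section. Assuming it means that roughly 20% of the examples are kept to speed up ranking, the idea can be sketched in plain Python as follows. The `downsample` function is hypothetical and is not the library's implementation.

```python
import random


def downsample(examples, ratio, seed=0):
    """Return a random subset containing about `ratio` of the examples."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    keep = max(1, round(len(examples) * ratio))
    # A seeded generator keeps the subset reproducible across runs
    return random.Random(seed).sample(examples, keep)


examples = [f"sentence {i}" for i in range(100)]
subset = downsample(examples, ratio=0.2)
```

With 100 examples and a ratio of 0.2, the subset holds 20 distinct examples, so each candidate model has far less text to embed.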

Review ranked models:

print(results)

The output lists the models sorted by their transferability scores, highest first:

Rank 1. microsoft/deberta-v3-large: 2.7962
Rank 2. nghuyong/ernie-2.0-large-en: 2.7788
Rank 3. google/electra-large-discriminator: 2.7486
Rank 4. xlm-roberta-large: 2.6695
Rank 5. sentence-transformers/all-mpnet-base-v2: 2.5709
...

Use these results to drop the lower-ranked models and focus further exploration, such as fine-tuning, on the top-ranked ones.
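One way to build such a shortlist is sketched below. Because the attributes of the `results` object are not described here, the sketch parses the printed ranking text instead of relying on any library API. The helpers `parse_ranking` and `top_k` are hypothetical names.

```python
import re

# Printed ranking copied from the output above
ranking_text = """\
Rank 1. microsoft/deberta-v3-large: 2.7962
Rank 2. nghuyong/ernie-2.0-large-en: 2.7788
Rank 3. google/electra-large-discriminator: 2.7486
Rank 4. xlm-roberta-large: 2.6695
Rank 5. sentence-transformers/all-mpnet-base-v2: 2.5709
"""

LINE_PATTERN = re.compile(r"^Rank (\d+)\. (\S+): ([-+]?\d+(?:\.\d+)?)$")


def parse_ranking(text):
    """Turn 'Rank N. model: score' lines into (model, score) pairs."""
    pairs = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            pairs.append((match.group(2), float(match.group(3))))
    # Sort by score, highest first, in case the lines were not ordered
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def top_k(text, k):
    """Return the names of the k highest-scoring models."""
    return [name for name, _ in parse_ranking(text)[:k]]


shortlist = top_k(ranking_text, k=2)
```

The resulting `shortlist` can then be passed to whatever fine-tuning setup you already use.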

License

MIT

Project details

Release history

0.2.0: May 1, 2025
0.1.3: Apr 21, 2025
0.1.2: Dec 3, 2024
0.1.1: Oct 28, 2024
0.1.0 (this version): Aug 6, 2024
