Seamless integration of tasks with huggingface models
tasknet
tasknet is an interface between Hugging Face datasets and the Hugging Face Trainer.
Task templates
tasknet relies on task templates to avoid boilerplate code. The task templates correspond to Transformers AutoClasses:
SequenceClassification
TokenClassification
MultipleChoice
The task templates follow the same interface. They implement preprocess_function, a data collator and compute_metrics.
Look at tasks.py and use existing templates as a starting point to implement a custom task template.
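As a rough illustration of that shared interface, the following self-contained sketch shows the three pieces a template provides. It does not use tasknet's actual base classes: `ToyClassification` and `toy_tokenize` are hypothetical stand-ins, and the method signatures are simplified assumptions. The real templates in tasks.py work with a Hugging Face tokenizer and collator instead.

```python
from dataclasses import dataclass


def toy_tokenize(text, vocab):
    """Hypothetical stand-in for a real tokenizer: whitespace split + vocab lookup."""
    return [vocab.setdefault(tok, len(vocab)) for tok in text.lower().split()]


@dataclass
class ToyClassification:
    """Illustrative template only, NOT tasknet's implementation."""
    s1: str = "s1"  # dataset column holding the first text
    s2: str = "s2"  # dataset column holding the second text
    y: str = "y"    # dataset column holding the label

    def preprocess_function(self, example, vocab):
        # Encode both text fields and keep the label.
        ids = toy_tokenize(example[self.s1], vocab) + toy_tokenize(example[self.s2], vocab)
        return {"input_ids": ids, "label": example[self.y]}

    def data_collator(self, features, pad_id=0):
        # Pad every sequence to the length of the longest one in the batch.
        width = max(len(f["input_ids"]) for f in features)
        return {
            "input_ids": [f["input_ids"] + [pad_id] * (width - len(f["input_ids"])) for f in features],
            "labels": [f["label"] for f in features],
        }

    def compute_metrics(self, predictions, labels):
        # Plain accuracy over predicted class ids.
        correct = sum(int(p == l) for p, l in zip(predictions, labels))
        return {"accuracy": correct / len(labels)}


vocab = {"<pad>": 0}
task = ToyClassification(s1="sentence1", s2="sentence2", y="label")
rows = [  # made-up rows for illustration
    {"sentence1": "A dog runs", "sentence2": "An animal moves", "label": 1},
    {"sentence1": "It rains", "sentence2": "It is sunny", "label": 0},
]
batch = task.data_collator([task.preprocess_function(r, vocab) for r in rows])
metrics = task.compute_metrics(predictions=[1, 1], labels=batch["labels"])
```

A custom template would keep the same three entry points and change only what each one does for its task.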
Instantiate a task
Each task template is associated with specific fields. Classification has two text fields, s1 and s2, and a label y. Pass a dataset to a template and fill in the mapping between the dataset fields and the template fields to instantiate a task.
import tasknet as tn
from datasets import load_dataset

# Map the RTE columns onto the Classification template fields.
rte = tn.Classification(
    dataset=load_dataset("glue", "rte"),
    s1="sentence1", s2="sentence2", y="label"
)

# Training settings, passed as attributes of a plain class.
class args:
    model_name = 'roberta-base'
    learning_rate = 3e-5  # see https://huggingface.co/docs/transformers/v4.24.0/en/main_classes/trainer#transformers.TrainingArguments

tasks = [rte]
model = tn.Model(tasks, args)
trainer = tn.Trainer(model, tasks, args)
trainer.train()
As you can see, tasknet is multitask by design. It works with a list of tasks, and the model creates a task_models_list attribute.
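The field mapping is what lets a list of tasks be handled uniformly: once each dataset's columns are renamed to the template fields, downstream code no longer needs to know the original column names. The sketch below mimics that idea in plain Python. `to_template_fields` and the two toy datasets are hypothetical and only illustrate the renaming, not how tasknet performs it internally.

```python
def to_template_fields(rows, **mapping):
    """Rename dataset columns to template fields, e.g. s1="sentence1".

    `mapping` maps template field -> dataset column (hypothetical helper).
    """
    return [{field: row[column] for field, column in mapping.items()} for row in rows]


# Two toy datasets with different column names (made-up rows for illustration).
rte_like = [{"sentence1": "A dog runs", "sentence2": "An animal moves", "label": 1}]
nli_like = [{"premise": "It rains", "hypothesis": "It is sunny", "gold": 0}]

tasks = [
    to_template_fields(rte_like, s1="sentence1", s2="sentence2", y="label"),
    to_template_fields(nli_like, s1="premise", s2="hypothesis", y="gold"),
]

# Every task now exposes the same fields, so one loop covers them all.
for task_rows in tasks:
    for row in task_rows:
        assert set(row) == {"s1", "s2", "y"}
```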
Installation
pip install tasknet
Additional examples:
Colab:
https://colab.research.google.com/drive/15Xf4Bgs3itUmok7XlAK6EEquNbvjD9BD?usp=sharing
tasknet vs jiant
jiant is another library comparable to tasknet. tasknet is a minimal extension of Trainer centered on task templates, while jiant builds a custom analog of Trainer from scratch called runner.
tasknet is leaner and easier to extend. jiant is config-based, while tasknet is designed for interactive use and scripting.
Credit
This code uses parts of the examples from the transformers library and some code from multitask-learning-transformers.
Contact
You can request features on GitHub or reach me at damien.sileo@inria.fr
@misc{sileod21-tasknet,
  author = {Sileo, Damien},
  doi = {10.5281/zenodo.561225781},
  month = {11},
  title = {{tasknet, multitask interface between Trainer and datasets}},
  url = {https://github.com/sileod/tasknet},
  version = {1.5.0},
  year = {2022}
}