The Transfer Learning in Dialogue Benchmarking Toolkit
Overview
TLiDB is a tool for benchmarking transfer learning methods in conversational AI. It handles domain adaptation, task transfer, multitasking, continual learning, and other transfer learning settings. TLiDB maintains a unified JSON format for all datasets and tasks, which reduces the amount of new code needed to add a dataset or task. We highly encourage community contributions to the project.
The main features of TLiDB are:
- Dataset class to easily load a dataset for use across models
- Unified metrics to standardize evaluation across datasets
- Extensible Model and Algorithm classes to support fast prototyping
Installation
Requirements
- python>=3.6
- torch>=1.10
- nltk>=3.6.5
- scikit-learn>=1.0
- transformers>=4.11.3
- sentencepiece>=0.1.96
- bert-score==0.3.11
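The version pins above can be checked against the current environment before installing. The following is a minimal sketch and not part of TLiDB: `check_requirements`, `parse_version`, and `_pad` are hypothetical helpers, the comparison is a simplified numeric one that ignores pre-release suffixes, and `importlib.metadata` is assumed to be available (Python 3.8 or newer).

```python
from importlib import metadata

# Minimum versions copied from the requirements list above.
# bert-score is pinned exactly (==0.3.11), so it is tracked separately.
MINIMUM = {
    "torch": "1.10",
    "nltk": "3.6.5",
    "scikit-learn": "1.0",
    "transformers": "4.11.3",
    "sentencepiece": "0.1.96",
}
EXACT = {"bert-score": "0.3.11"}


def parse_version(text):
    """Turn '1.10.2+cu113' into (1, 10, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def _pad(a, b):
    """Right-pad two version tuples with zeros so they have equal length."""
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)), b + (0,) * (n - len(b))


def check_requirements(get_version=metadata.version):
    """Return {package: status}, where status is 'ok', 'missing', or a mismatch note."""
    report = {}
    for name, wanted in {**MINIMUM, **EXACT}.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        have, need = _pad(parse_version(found), parse_version(wanted))
        exact = name in EXACT
        ok = have == need if exact else have >= need
        op = "==" if exact else ">="
        report[name] = "ok" if ok else f"found {found}, need {op}{wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

The lookup function is passed in as a parameter so the check can be exercised without the packages installed; for anything beyond a quick sanity check, a full version parser is the safer choice.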
To use TLiDB, you can simply install it via pip:
pip install tlidb
OR, you can install TLiDB from source. This is recommended if you want to edit or contribute:
git clone git@github.com:alon-albalak/TLiDB.git
cd TLiDB
pip install -e .
How to use TLiDB
TLiDB can be used from the command line or as a Python command. If you have installed the package from source, we highly recommend running commands from inside the tlidb/examples/ directory.
Quick Start
For a very simple setup, you can use the following commands.
- From command line:
tlidb --source_datasets Friends --source_tasks emory_emotion_recognition --target_datasets Friends --target_tasks reading_comprehension --do_train --do_finetune --do_eval --model_config=bert
- As a Python command:
python3 run_experiment.py --source_datasets Friends --source_tasks emory_emotion_recognition --target_datasets Friends --target_tasks reading_comprehension --do_train --do_finetune --do_eval --model_config=bert
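When launching several experiments from a script, it can help to assemble these flags programmatically. The sketch below is only an illustration: `build_tlidb_args` is a hypothetical helper, not a TLiDB function, and it uses only the flags that appear in the two commands above.

```python
import shlex


def build_tlidb_args(source_dataset, source_task, target_dataset, target_task,
                     model_config="bert", stages=("train", "finetune", "eval")):
    """Assemble the Quick Start flags as an argument list (hypothetical helper)."""
    args = [
        "--source_datasets", source_dataset,
        "--source_tasks", source_task,
        "--target_datasets", target_dataset,
        "--target_tasks", target_task,
    ]
    # Each stage maps to one of the --do_* switches used in the commands above.
    args += [f"--do_{stage}" for stage in stages]
    args.append(f"--model_config={model_config}")
    return args


args = build_tlidb_args(
    "Friends", "emory_emotion_recognition", "Friends", "reading_comprehension"
)

# The same flags work for both entry points shown in the Quick Start.
cli_command = shlex.join(["tlidb", *args])
script_command = shlex.join(["python3", "run_experiment.py", *args])

print(cli_command)
print(script_command)
```

To actually run an experiment, the same list could be passed to `subprocess.run(["tlidb", *args], check=True)`; keeping the arguments as a list avoids shell quoting problems.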
Detailed Usage
TLiDB has 2 main folders of interest:
- tlidb/examples
- tlidb/TLiDB
tlidb/examples/ is recommended if you would like to use our training scripts. It contains sample code for models, learning algorithms, and sample training scripts. For detailed examples, see the Examples README.
tlidb/TLiDB/ holds the code related to data (datasets, dataloaders, metrics, etc.). If you are interested in using our datasets and metrics but would like to train models with your own training scripts, take a look at the example usage in the TLiDB README.
Folder descriptions:
- tlidb/TLiDB is the folder holding the code for data handling
  - tlidb/TLiDB/data_loaders contains code for data_loaders
  - tlidb/TLiDB/data is the destination folder for downloaded datasets (if installed from source, otherwise data is in .cache/tlidb/data)
  - tlidb/TLiDB/datasets contains code for dataset loading and preprocessing
  - tlidb/TLiDB/metrics contains code for loss and evaluation metrics
  - tlidb/TLiDB/utils contains utility files
- tlidb/examples contains sample code for training and evaluating models
  - tlidb/examples/algorithms contains code which trains and evaluates a model
  - tlidb/examples/models contains code to define a model
  - tlidb/examples/configs contains code for model configurations
- /dataset_preprocessing is for reproducibility purposes. It contains scripts used to preprocess the TLiDB datasets from their original form into the standardized TLiDB format
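The two data locations mentioned in the list above can be resolved with a few lines of `pathlib`. This is a rough sketch under stated assumptions: `resolve_data_dir` is a hypothetical helper, the source path assumes `tlidb/TLiDB/data` sits directly under the cloned repository root, and `.cache/tlidb/data` is assumed to be relative to the user's home directory, which the description above does not state explicitly.

```python
from pathlib import Path


def resolve_data_dir(source_checkout=None, home=None):
    """Guess where TLiDB keeps downloaded datasets (hypothetical helper).

    source_checkout: path to a cloned TLiDB repository, or None for a pip install.
    home: base directory for the cache; ASSUMED to be the user's home directory.
    """
    if source_checkout is not None:
        # Source install: data lives inside the repository tree.
        return Path(source_checkout) / "tlidb" / "TLiDB" / "data"
    # Pip install: data lives under .cache/tlidb/data (assumed relative to home).
    base = Path(home) if home is not None else Path.home()
    return base / ".cache" / "tlidb" / "data"


data_dir = resolve_data_dir()
if data_dir.is_dir():
    # List whatever has already been downloaded.
    print(sorted(p.name for p in data_dir.iterdir()))
else:
    print(f"No datasets found at {data_dir}")
```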
Comments, Questions, and Feedback
If you find issues, please open an issue on the GitHub repository (https://github.com/alon-albalak/TLiDB).
If you have dataset or model requests, please start a new discussion on the GitHub repository.
We encourage outside contributions to the project!
Citation
If you use TLiDB in your work, please cite the repository:
@software{Albalak_The_Transfer_Learning_2022,
author = {Albalak, Alon},
doi = {10.5281/zenodo.6374360},
month = {3},
title = {{The Transfer Learning in Dialogue Benchmarking Toolkit}},
url = {https://github.com/alon-albalak/TLiDB},
version = {1.0.0},
year = {2022}
}
Acknowledgements
The design of TLiDB was based on the WILDS project and the Open Graph Benchmark.