An Open-source Dialog System Toolkit
ConvLab-3
ConvLab-3 is a flexible dialog system platform based on a unified data format for task-oriented dialog (TOD) datasets. The unified format serves as the adapter between TOD datasets and models: datasets are first transformed to the unified format and then loaded by models. In this way, the cost of adapting $M$ models to $N$ datasets is reduced from $M\times N$ to $M+N$. While retaining all features of ConvLab-2, ConvLab-3 greatly expands the set of supported datasets and models thanks to the unified format, and enhances the utility of the reinforcement learning (RL) toolkit for the dialog policy module. For typical usage, see our paper. Datasets and trained models are also available on Hugging Face Hub.
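As a rough illustration of the $M\times N$ versus $M+N$ argument, the sketch below counts the adapters needed with and without an intermediate format. The helper functions are hypothetical and are not part of the ConvLab-3 API; the model and dataset names are only used as labels.

```python
from itertools import product


def adapters_without_unified_format(models, datasets):
    """Every (model, dataset) pair needs its own loader: M x N adapters."""
    return [f"{dataset}->{model}" for model, dataset in product(models, datasets)]


def adapters_with_unified_format(models, datasets):
    """Each dataset gets one converter into the shared format and each model
    gets one loader out of it: N + M adapters."""
    to_unified = [f"{dataset}->unified" for dataset in datasets]
    from_unified = [f"unified->{model}" for model in models]
    return to_unified + from_unified


models = ["T5", "BERTNLU", "SetSUMBT"]                             # M = 3
datasets = ["MultiWOZ 2.1", "Schema-Guided", "Camrest", "KVRET"]   # N = 4

pairwise = adapters_without_unified_format(models, datasets)
shared = adapters_with_unified_format(models, datasets)
print(len(pairwise), len(shared))  # 12 7
```

The gap widens as either list grows, since adding one dataset costs a single converter rather than one loader per model.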
Updates
- 2022.11.30: ConvLab-3 release.
Installation
You can install ConvLab-3 in one of the following ways, depending on your needs. Higher versions of torch and transformers may also work.
Git clone and pip install in development mode (Recommended)
For the latest and most configurable version, we recommend installing ConvLab-3 in development mode.
Clone the newest repository:
git clone --depth 1 https://github.com/ConvLab/ConvLab-3.git
Install ConvLab-3 via pip:
cd ConvLab-3
pip install -e .
Pip install from PyPI
To use ConvLab-3 as an off-the-shelf tool, you can install via:
pip install convlab
Note that the data directory will not be included due to the package size limitation.
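To check which kind of installation you ended up with, a small script along these lines may help. It is only a sketch: it assumes the package is importable under the name convlab and that a development checkout has a data directory in the current working directory; the function name is made up for this example.

```python
import importlib.util
from pathlib import Path


def convlab_install_status(data_dir: str = "data") -> dict:
    """Report whether the package can be found and whether a data directory exists.

    `data_dir` is an assumed relative path; adjust it to your checkout.
    """
    spec = importlib.util.find_spec("convlab")  # None if the package is not installed
    return {
        "package_importable": spec is not None,
        "data_dir_present": Path(data_dir).is_dir(),
    }


status = convlab_install_status()
print(status)
```

With a PyPI install you would expect the package to be found but no data directory, whereas a development-mode install run from the repository root should report both.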
Using Docker
We also provide a Dockerfile for building a Docker image. It uses requirement.txt and then installs ConvLab-3 in development mode.
# create image
docker build -t convlab .
# run container
docker run -dit convlab
# open bash in container
docker exec -it CONTAINER_ID bash
Tutorials
- Getting Started (Have a try on Colab!)
- Introduction to Unified Data Format
- Utility functions for unified datasets
- RL Toolkit
- Interactive Tool [demo video]
Unified Datasets
Current datasets in the unified data format (DA-U/DA-S stand for user/system dialog acts):
Dataset | Dialogs | Goal | DA-U | DA-S | State | API result | DataBase |
---|---|---|---|---|---|---|---|
Camrest | 676 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
WOZ 2.0 | 1200 | :white_check_mark: | :white_check_mark: | ||||
KVRET | 3030 | :white_check_mark: | :white_check_mark: | :white_check_mark: | |||
DailyDialog | 13118 | :white_check_mark: | |||||
Taskmaster-1 | 13175 | :white_check_mark: | :white_check_mark: | :white_check_mark: | |||
Taskmaster-2 | 17303 | :white_check_mark: | :white_check_mark: | :white_check_mark: | |||
MultiWOZ 2.1 | 10438 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
Schema-Guided | 22825 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | ||
MetaLWOZ | 40203 | :white_check_mark: | |||||
CrossWOZ (zh) | 6012 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
Taskmaster-3 | 23757 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
Unified datasets are available under the data/unified_datasets directory as well as on Hugging Face Hub. We will continue adding more datasets listed in this issue. If you want to add a listed or custom dataset to ConvLab-3, you can create an issue for discussion and then open a pull request. We will list you as a contributor and highly appreciate your contributions!
Models
We list newly integrated models in ConvLab-3 that support the unified data format and obtain strong performance. You can follow the links for more details about these models. Other models can be used in the same way as in ConvLab-2.
Task | Models | Input | Output |
---|---|---|---|
Response Generation | T5 | Context | Response |
Goal-to-Dialogue | T5 | Goal | Dialog |
Natural Language Understanding | T5, BERTNLU, MILU | Context | DA-U |
Dialog State Tracking | T5, SUMBT, SetSUMBT, TripPy | Context | State |
RL Policy | DDPT, PPO, PG | State, DA-U, DB | DA-S |
Natural Language Generation | T5, SC-GPT | DA-S | Response |
End-to-End | SOLOIST | Context, DB | State, Response |
User simulator | TUS, GenTUS | Goal, DA-S | DA-U, (Response) |
Trained models are available on Hugging Face Hub.
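To make the Input and Output columns above concrete, here is a toy pipeline that chains the four modular tasks with hand-written rules in place of trained models. Everything in it is an assumption made for illustration: the function names, the dialog-act tuple layout, and the two-entry database are not ConvLab-3 code or data. For brevity the toy tracker consumes the NLU output, whereas the DST models in the table map the context directly to the state.

```python
from typing import Dict, List, Tuple

# Assumed dialog-act layout for this sketch: (intent, domain, slot, value).
DialogAct = Tuple[str, str, str, str]


def nlu(context: List[str]) -> List[DialogAct]:
    """Context -> DA-U. A keyword rule stands in for a trained NLU model."""
    last_utterance = context[-1].lower()
    if "cheap" in last_utterance:
        return [("inform", "restaurant", "price range", "cheap")]
    return []


def dst(state: Dict[str, str], user_acts: List[DialogAct]) -> Dict[str, str]:
    """DA-U -> State. Copies informed slot values into a new belief state."""
    new_state = dict(state)
    for intent, domain, slot, value in user_acts:
        if intent == "inform":
            new_state[f"{domain}-{slot}"] = value
    return new_state


def policy(state: Dict[str, str], user_acts: List[DialogAct],
           db: List[Dict[str, str]]) -> List[DialogAct]:
    """State, DA-U, DB -> DA-S. Recommends the first entity matching the state."""
    for entity in db:
        if all(entity.get(key.split("-", 1)[1]) == value
               for key, value in state.items()):
            return [("recommend", "restaurant", "name", entity["name"])]
    return [("nooffer", "restaurant", "", "")]


def nlg(system_acts: List[DialogAct]) -> str:
    """DA-S -> Response. A template stands in for a trained NLG model."""
    intent, _, _, value = system_acts[0]
    if intent == "recommend":
        return f"I recommend {value}."
    return "Sorry, I could not find a match."


# Toy database and user turn, invented for this example.
db = [
    {"name": "Restaurant A", "price range": "expensive"},
    {"name": "Restaurant B", "price range": "cheap"},
]
context = ["I am looking for a cheap restaurant."]

user_acts = nlu(context)
state = dst({}, user_acts)
system_acts = policy(state, user_acts, db)
response = nlg(system_acts)
print(response)  # I recommend Restaurant B.
```

Because each stage only depends on the types passed between them, any one function could be swapped for a trained model with the same input and output without touching the others.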
Contributing
We welcome contributions from the community. Please see the issues to find what we need.
- If you want to add a new dataset, model, or other feature, please describe it in an issue before opening a pull request.
- Small changes, such as bug fixes, can be submitted directly as a pull request.
Team
ConvLab-3 is maintained and developed by Tsinghua University Conversational AI group (THU-COAI), the Dialogue Systems and Machine Learning Group at Heinrich Heine University, Düsseldorf, Germany and Microsoft Research (MSR).
We would like to thank all contributors of ConvLab:
Yan Fang, Zhuoer Feng, Jianfeng Gao, Qihan Guo, Kaili Huang, Minlie Huang, Sungjin Lee, Bing Li, Jinchao Li, Xiang Li, Xiujun Li, Jiexi Liu, Lingxiao Luo, Wenchang Ma, Mehrad Moradshahi, Baolin Peng, Runze Liang, Ryuichi Takanobu, Dazhen Wan, Hongru Wang, Jiaxin Wen, Yaoqin Zhang, Zheng Zhang, Qi Zhu, Xiaoyan Zhu, Carel van Niekerk, Christian Geishauser, Hsien-chin Lin, Nurul Lubis, Xiaochen Zhu, Michael Heck, Shutong Feng, Milica Gašić.
License
Apache License 2.0