
DialSim package

Reason this release was yanked:

Obsolete. The logic is sound, but some features have since been changed or removed, so this code no longer works as published. Yanked for legacy reasons.

Project description

DialSim

We introduce DialSim, a real-time dialogue simulator. In this simulator, an agent is assigned the role of a character from a popular TV show and must respond to spontaneous questions using past dialogue information while distinguishing between known and unknown information. Key features of DialSim include evaluating the agent’s ability to respond within a reasonable time limit, handling long-term multi-party dialogues, and applying adversarial settings (e.g., swapping character names) to challenge the agent’s reliance on pre-trained knowledge.

The dataset is released along with our paper; please refer to the paper for further details.

Dataset

You can download the dataset here.

v1.0 (April 2024): This version includes the dataset as described in the paper.

v1.1 (June 2024): To incorporate more diverse and challenging data, this version has been updated to include unanswerable multi-hop questions.

Experimental Setup

After installing the appropriate version of torch, run:

  1. pip install -r requirements.txt

  2. mkdir data

  3. mv dialsim_v1.1.zip ./data/

  4. cd data

  5. unzip dialsim_v1.1.zip
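
For setups where a shell is not convenient, steps 2 to 5 above can be sketched in Python using only the standard library. The helper name prepare_dataset is hypothetical and not part of DialSim; the sketch assumes the archive has already been downloaded next to the script.

```python
import shutil
import zipfile
from pathlib import Path


def prepare_dataset(archive: Path, data_dir: Path) -> list[str]:
    """Hypothetical helper mirroring steps 2-5: mkdir, mv, cd, unzip."""
    data_dir.mkdir(parents=True, exist_ok=True)   # mkdir data
    target = data_dir / archive.name
    if not target.exists():
        shutil.move(str(archive), str(target))    # mv <archive> ./data/
    with zipfile.ZipFile(target) as zf:
        zf.extractall(data_dir)                   # unzip inside data/
        return zf.namelist()                      # names of extracted entries
```

A call such as prepare_dataset(Path("dialsim_v1.1.zip"), Path("data")) would then leave the extracted files under ./data/, as the shell steps do. Running it a second time skips the move and only re-extracts.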

Simulation

Command Example: CUDA_VISIBLE_DEVICES=0 python simulator.py --model_name "GPT-3.5" --quantization "4bit" --script_name "friends" --sleep_time 6 --history_type "session-entire" --ret_method "bm25" --trial_version 0 --sh_number 0 --num_cores 10 --openai_api_key "<<YOUR_OPENAI_API_KEY>>"
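
When running many configurations, it can help to assemble the same command programmatically. The sketch below rebuilds the example command as an argument list; build_command is a hypothetical helper, and the option values are copied from the example above.

```python
import os
import shlex
import sys


def build_command(options: dict, script: str = "simulator.py") -> list[str]:
    """Hypothetical helper: turn {"name": value} pairs into "--name value" arguments."""
    cmd = [sys.executable, script]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


options = {
    "model_name": "GPT-3.5",
    "quantization": "4bit",
    "script_name": "friends",
    "sleep_time": 6,
    "history_type": "session-entire",
    "ret_method": "bm25",
    "trial_version": 0,
    "sh_number": 0,
    "num_cores": 10,
    "openai_api_key": "<<YOUR_OPENAI_API_KEY>>",  # placeholder, as in the example
}

cmd = build_command(options)
# Same GPU selection as the CUDA_VISIBLE_DEVICES=0 prefix in the example.
env = {**os.environ, "CUDA_VISIBLE_DEVICES": "0"}
print(shlex.join(cmd))
```

Passing cmd and env to subprocess.run(cmd, env=env, check=True) would launch the simulator, assuming simulator.py is in the working directory.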

Arguments

  • model_name: Specifies the model to use, default is "GPT-3.5". Options include "llama2-7b-chat", "llama2-70b-chat", "tulu2-7b-dpo", "tulu2-70b-dpo", "gemma-2b-it", "gemma-7b-it", "mistral-7b-it", "mixtral-it", "GPT-3.5", "GPT-4", "claude-3", "claude-2.1", and "gemini".
  • quantization: Model quantization level, default is "no". Options include "no", "16bit", "8bit", and "4bit".
  • script_name: TV show script for the simulation, default is "friends". Options include "friends", "bigbang", and "theoffice".
  • sleep_time: Response time limit, default is 5.
  • history_type: Method for saving history, default is "session-entire". Options include "utts", "session-entire", and "session-summary".
  • num_ret_history: Number of retrieved histories to use. Modify lines 184-242 in simulator.py to change this number.
  • ret_method: Retrieval method, default is "bm25". Options include "openai-emb", "bm25", "no_ret", and "oracle".
  • name_shuffle: Type of adversarial test, default is "original". Options include "original", "shuffle", and "new_name".
  • trial_version: Experiment version number, default is 0.
  • sh_number: Shell script number, default is 0.
  • num_cores: Maximum number of CPU cores to use, default is 10.
  • openai_api_key: Required if using "GPT-3.5", "GPT-4" or ret_method="openai-emb".
  • gemini_api_key: Required if model_name is "gemini".
  • anthropic_api_key: Required if model_name is "claude-3" or "claude-2.1".
  • fast_eval: When set to "yes", the simulator proceeds to the next utterance without waiting for the time interval if the history has already been updated. The default setting is "yes". Options include "yes" and "no".
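
The options and defaults listed above can be summarized as an argparse declaration. This is a minimal sketch reconstructed from the list, not the parser in simulator.py: the argument types, the omission of num_ret_history (which the list says is set by editing the source), and the missing_api_keys helper are all assumptions.

```python
import argparse

MODELS = [
    "llama2-7b-chat", "llama2-70b-chat", "tulu2-7b-dpo", "tulu2-70b-dpo",
    "gemma-2b-it", "gemma-7b-it", "mistral-7b-it", "mixtral-it",
    "GPT-3.5", "GPT-4", "claude-3", "claude-2.1", "gemini",
]


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser with the documented options and defaults."""
    p = argparse.ArgumentParser(description="DialSim arguments (illustrative)")
    p.add_argument("--model_name", default="GPT-3.5", choices=MODELS)
    p.add_argument("--quantization", default="no",
                   choices=["no", "16bit", "8bit", "4bit"])
    p.add_argument("--script_name", default="friends",
                   choices=["friends", "bigbang", "theoffice"])
    p.add_argument("--sleep_time", type=float, default=5)  # type is an assumption
    p.add_argument("--history_type", default="session-entire",
                   choices=["utts", "session-entire", "session-summary"])
    p.add_argument("--ret_method", default="bm25",
                   choices=["openai-emb", "bm25", "no_ret", "oracle"])
    p.add_argument("--name_shuffle", default="original",
                   choices=["original", "shuffle", "new_name"])
    p.add_argument("--trial_version", type=int, default=0)
    p.add_argument("--sh_number", type=int, default=0)
    p.add_argument("--num_cores", type=int, default=10)
    p.add_argument("--openai_api_key", default=None)
    p.add_argument("--gemini_api_key", default=None)
    p.add_argument("--anthropic_api_key", default=None)
    p.add_argument("--fast_eval", default="yes", choices=["yes", "no"])
    return p


def missing_api_keys(args: argparse.Namespace) -> list[str]:
    """Hypothetical check: which required API keys were not supplied."""
    needed = []
    if args.model_name in ("GPT-3.5", "GPT-4") or args.ret_method == "openai-emb":
        needed.append("openai_api_key")
    if args.model_name == "gemini":
        needed.append("gemini_api_key")
    if args.model_name in ("claude-3", "claude-2.1"):
        needed.append("anthropic_api_key")
    return [key for key in needed if not getattr(args, key)]
```

Calling missing_api_keys(build_parser().parse_args()) before starting a run would surface a missing key early rather than partway through a simulation.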

Python Package

We plan to release this simulator as a Python package.
