Embodied agent interface evaluation for VirtualHome
Installation and Usage Guide for virtualhome-eval
Install dependencies
pip install virtualhome_eval
Usage
To run virtualhome_eval, import agent_evaluation and call it with the arguments described under Parameters:
from virtualhome_eval.agent_eval import agent_evaluation
# mode: 'generate_prompts' or 'evaluate_results'
# eval_type: 'goal_interpretation', 'action_sequence', 'subgoal_decomposition', or 'transition_model'
agent_evaluation(mode='evaluate_results', eval_type='goal_interpretation', llm_response_path='YOUR_LLM_OUTPUT_DIR')
Parameters
- mode: Whether to generate prompts or evaluate results. Options are:
  - generate_prompts
  - evaluate_results
- eval_type: The evaluation task type. Options are:
  - goal_interpretation
  - action_sequence
  - subgoal_decomposition
  - transition_model
- llm_response_path: The path of the LLM output directory to evaluate. It is "" by default, which uses the existing outputs in the directory virtualhome_eval/llm_response/. The function evaluates all LLM outputs under the directory.
- dataset: The dataset type. Options are:
  - virtualhome
  - behavior
- output_dir: The directory in which to store the output results. By default, it is output/ under the current path.
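Because the option names are easy to mistype (for example transition_modeling instead of transition_model), it can help to check arguments before calling the library. The following is a minimal sketch, not part of virtualhome_eval: build_eval_kwargs is a hypothetical helper, and it assumes that dataset and output_dir are accepted as keyword arguments under the names listed above. Optional values are left out unless given, so the library's own defaults apply.

```python
# Option lists copied from the Parameters section of this guide.
VALID_MODES = ("generate_prompts", "evaluate_results")
VALID_EVAL_TYPES = (
    "goal_interpretation",
    "action_sequence",
    "subgoal_decomposition",
    "transition_model",
)
VALID_DATASETS = ("virtualhome", "behavior")


def build_eval_kwargs(mode, eval_type, llm_response_path="",
                      dataset=None, output_dir=None):
    """Hypothetical helper: validate arguments and return them as a dict."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if eval_type not in VALID_EVAL_TYPES:
        raise ValueError(
            f"eval_type must be one of {VALID_EVAL_TYPES}, got {eval_type!r}"
        )
    if dataset is not None and dataset not in VALID_DATASETS:
        raise ValueError(
            f"dataset must be one of {VALID_DATASETS}, got {dataset!r}"
        )

    kwargs = {
        "mode": mode,
        "eval_type": eval_type,
        "llm_response_path": llm_response_path,
    }
    # Only pass optional arguments that were set explicitly, so the
    # library's defaults are used otherwise.
    if dataset is not None:
        kwargs["dataset"] = dataset
    if output_dir is not None:
        kwargs["output_dir"] = output_dir
    return kwargs


# Intended use (requires virtualhome_eval to be installed):
#   from virtualhome_eval.agent_eval import agent_evaluation
#   results = agent_evaluation(**build_eval_kwargs("evaluate_results",
#                                                  "action_sequence"))
```

A misspelled option then fails immediately with a ValueError instead of surfacing later inside the evaluation run.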
Example usage
- To generate prompts for goal_interpretation:
agent_evaluation(mode='generate_prompts', eval_type='goal_interpretation')
- To evaluate LLM outputs for goal_interpretation:
results = agent_evaluation(mode='evaluate_results', eval_type='goal_interpretation')
- To generate prompts for action_sequence:
agent_evaluation(mode='generate_prompts', eval_type='action_sequence')
- To evaluate LLM outputs for action_sequence:
results = agent_evaluation(mode='evaluate_results', eval_type='action_sequence')
- To generate VirtualHome prompts for transition_model:
agent_evaluation(mode='generate_prompts', eval_type='transition_model')
- To evaluate LLM outputs on VirtualHome for transition_model:
results = agent_evaluation(mode='evaluate_results', eval_type='transition_model')
- To generate prompts for subgoal_decomposition:
agent_evaluation(mode='generate_prompts', eval_type='subgoal_decomposition')
- To evaluate LLM outputs for subgoal_decomposition:
results = agent_evaluation(mode='evaluate_results', eval_type='subgoal_decomposition')
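The calls above differ only in mode and eval_type, so they can be driven from a loop. The sketch below is one possible way to do that; run_all is a hypothetical helper, not part of the package, and it takes the evaluation function as a parameter so the loop itself does not depend on the library. The structure of the value returned by agent_evaluation is not documented here, so the helper simply stores whatever comes back.

```python
# Task types as listed under Parameters.
EVAL_TYPES = (
    "goal_interpretation",
    "action_sequence",
    "subgoal_decomposition",
    "transition_model",
)


def run_all(evaluate, mode="evaluate_results", eval_types=EVAL_TYPES):
    """Hypothetical helper: call `evaluate` once per task type.

    `evaluate` is expected to accept `mode` and `eval_type` keyword
    arguments, as agent_evaluation does in the examples above.
    Returns a dict mapping each eval_type to the value returned.
    """
    results = {}
    for eval_type in eval_types:
        results[eval_type] = evaluate(mode=mode, eval_type=eval_type)
    return results


# Intended use (requires virtualhome_eval to be installed):
#   from virtualhome_eval.agent_eval import agent_evaluation
#   run_all(agent_evaluation, mode="generate_prompts")
#   all_results = run_all(agent_evaluation, mode="evaluate_results")
```

Passing the function in, rather than importing it inside the helper, also makes it easy to dry-run the loop with a stand-in function before starting a full evaluation.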
Download files
Source Distribution
virtualhome_eval-0.1.0.tar.gz (22.9 MB)
Built Distribution
File details
Details for the file virtualhome_eval-0.1.0.tar.gz.
File metadata
- Download URL: virtualhome_eval-0.1.0.tar.gz
- Upload date:
- Size: 22.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.8.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 028dfd7c187fc4e8a6ffdad0184674d08a26ad1825ac52423e24f61083f7c5c1 |
| MD5 | f174bdd657273ae698a06566baef0ee1 |
| BLAKE2b-256 | ac23e2fc9d98a2884e07c9e3e76d72c41b68bd6bc443318b4fd66945e30f587c |
File details
Details for the file virtualhome_eval-0.1.0-py3-none-any.whl.
File metadata
- Download URL: virtualhome_eval-0.1.0-py3-none-any.whl
- Upload date:
- Size: 27.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.8.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 37683de789b1bbec62d62308c4fb0e21d3d8c7ee3204c7bfc404a1fc8be49131 |
| MD5 | d39991f6d46284f40937fd9e437afa54 |
| BLAKE2b-256 | fd65068f3e23002be0644d0a31b0dba1c1d644f9d17c36272fcfd297aefc03df |
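The SHA256 digests listed in the file details can be used to check a manually downloaded file. The following is a minimal sketch using only the Python standard library; the helper names sha256_of and matches_published are illustrative, and the local path passed in is whatever location the file was saved to.

```python
import hashlib

# SHA256 digests copied from the file details on this page.
PUBLISHED_SHA256 = {
    "virtualhome_eval-0.1.0.tar.gz":
        "028dfd7c187fc4e8a6ffdad0184674d08a26ad1825ac52423e24f61083f7c5c1",
    "virtualhome_eval-0.1.0-py3-none-any.whl":
        "37683de789b1bbec62d62308c4fb0e21d3d8c7ee3204c7bfc404a1fc8be49131",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, filename):
    """Compare a local file against the digest published for `filename`."""
    return sha256_of(path) == PUBLISHED_SHA256[filename]


# Intended use, assuming the sdist was saved to the current directory:
#   ok = matches_published("virtualhome_eval-0.1.0.tar.gz",
#                          "virtualhome_eval-0.1.0.tar.gz")
```

Reading in chunks keeps memory use low, which matters little for files of this size (22.9 MB and 27.2 MB) but costs nothing.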