
Embodied agent interface evaluation for VirtualHome

Project description

Installation and Usage Guide for virtualhome-eval

Install dependencies

pip install virtualhome_eval

Usage

To run virtualhome_eval, call agent_evaluation with the arguments below; each bracketed value is a placeholder for one of the options listed under Parameters:

from virtualhome_eval.agent_eval import agent_evaluation
agent_evaluation(mode=[MODE], eval_type=[EVAL_TYPE], llm_response_path=[YOUR LLM OUTPUT DIR])

Parameters

  • mode: Specifies whether to generate prompts or to evaluate results. Options are:
    • generate_prompts
    • evaluate_results
  • eval_type: Specifies the evaluation task type. Options are:
    • goal_interpretation
    • action_sequence
    • subgoal_decomposition
    • transition_model
  • llm_response_path: Path to the directory of LLM outputs to be evaluated. It is "" by default, which uses the existing outputs under virtualhome_eval/llm_response/. All LLM outputs found under the directory are evaluated.
  • dataset: The dataset type. Options are:
    • virtualhome
    • behavior
  • output_dir: The directory in which to store the output results. By default, it is output/ under the current working directory.
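The option sets above can be mirrored in a small guard when driving agent_evaluation from scripts. The allowed values below are taken from this page; the helper itself (check_args) is hypothetical, not part of the package:

```python
# Hypothetical helper: validate keyword arguments against the option sets
# documented above before passing them on to agent_evaluation.
VALID_OPTIONS = {
    "mode": {"generate_prompts", "evaluate_results"},
    "eval_type": {"goal_interpretation", "action_sequence",
                  "subgoal_decomposition", "transition_model"},
    "dataset": {"virtualhome", "behavior"},
}

def check_args(**kwargs):
    """Raise ValueError for any argument outside its documented option set."""
    for name, value in kwargs.items():
        allowed = VALID_OPTIONS.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(
                f"{name}={value!r}: expected one of {sorted(allowed)}")
    return kwargs
```

With this in place, check_args(mode='generate_prompts', eval_type='action_sequence') passes the arguments through unchanged, while a typo such as eval_type='transition_modeling' raises immediately instead of failing inside the evaluation.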

Example usage

  1. To generate prompts for goal_interpretation:
agent_evaluation(mode='generate_prompts', eval_type='goal_interpretation')
  2. To evaluate LLM outputs for goal_interpretation:
results = agent_evaluation(mode='evaluate_results', eval_type='goal_interpretation')
  3. To generate prompts for action_sequence:
agent_evaluation(mode='generate_prompts', eval_type='action_sequence')
  4. To evaluate LLM outputs for action_sequence:
results = agent_evaluation(mode='evaluate_results', eval_type='action_sequence')
  5. To generate VirtualHome prompts for transition_model:
agent_evaluation(mode='generate_prompts', eval_type='transition_model')
  6. To evaluate LLM outputs on VirtualHome for transition_model:
results = agent_evaluation(mode='evaluate_results', eval_type='transition_model')
  7. To generate prompts for subgoal_decomposition:
agent_evaluation(mode='generate_prompts', eval_type='subgoal_decomposition')
  8. To evaluate LLM outputs for subgoal_decomposition:
results = agent_evaluation(mode='evaluate_results', eval_type='subgoal_decomposition')
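The per-task calls above can be collapsed into one loop. The sketch below uses a stand-in agent_evaluation with the signature documented here so it runs even without the package installed; with virtualhome_eval installed, drop the stub and import the real function instead:

```python
# Stand-in with the documented signature; replace with
#   from virtualhome_eval.agent_eval import agent_evaluation
# once the package is installed. The real function returns the evaluation
# results; this stub merely echoes its arguments so the loop shape is visible.
def agent_evaluation(mode, eval_type, dataset="virtualhome",
                     llm_response_path="", output_dir="output/"):
    return {"mode": mode, "eval_type": eval_type, "dataset": dataset}

EVAL_TYPES = ("goal_interpretation", "action_sequence",
              "subgoal_decomposition", "transition_model")

# One evaluate_results call per task, keyed by task name.
all_results = {task: agent_evaluation(mode="evaluate_results", eval_type=task)
               for task in EVAL_TYPES}
```

The same loop with mode='generate_prompts' covers the prompt-generation half of the examples above.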


Download files

Download the file for your platform.

Source Distribution

virtualhome_eval-0.1.0.tar.gz (22.9 MB)

Uploaded Source

Built Distribution


virtualhome_eval-0.1.0-py3-none-any.whl (27.2 MB)

Uploaded Python 3

File details

Details for the file virtualhome_eval-0.1.0.tar.gz.

File metadata

  • Download URL: virtualhome_eval-0.1.0.tar.gz
  • Upload date:
  • Size: 22.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.8.19

File hashes

Hashes for virtualhome_eval-0.1.0.tar.gz
  • SHA256: 028dfd7c187fc4e8a6ffdad0184674d08a26ad1825ac52423e24f61083f7c5c1
  • MD5: f174bdd657273ae698a06566baef0ee1
  • BLAKE2b-256: ac23e2fc9d98a2884e07c9e3e76d72c41b68bd6bc443318b4fd66945e30f587c


File details

Details for the file virtualhome_eval-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for virtualhome_eval-0.1.0-py3-none-any.whl
  • SHA256: 37683de789b1bbec62d62308c4fb0e21d3d8c7ee3204c7bfc404a1fc8be49131
  • MD5: d39991f6d46284f40937fd9e437afa54
  • BLAKE2b-256: fd65068f3e23002be0644d0a31b0dba1c1d644f9d17c36272fcfd297aefc03df

