
The package provides a desktop environment for setting up and evaluating desktop automation tasks.


DesktopEnv: An Environment towards Human-like Computer Task Mastery

SLOGAN

Website | Paper

Overview

Updates

  • 2024-03-01:

Install

  1. Install VMware and configure the vmrun command: please refer to the guidance.

  2. Install the environment package, then download the examples and the virtual machine image:

# install the environment package
pip install desktop_env
# create a directory for the downloaded files
mkdir -p ~/.desktop_env
# download the examples and the virtual machine image
wget xxxx
wget xxxx
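
As a quick sanity check (assuming the pip install above succeeded), you can confirm in a Python interpreter that the package and the environment class are importable:

import desktop_env
from desktop_env.envs.desktop_env import DesktopEnv
print("desktop_env imported:", desktop_env.__name__)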

Quick Start

Run the following minimal example to interact with the environment:

import json
from desktop_env.envs.desktop_env import DesktopEnv

# load a task configuration (here, a GIMP example)
with open("evaluation_examples/examples/gimp/f723c744-e62c-4ae6-98d1-750d3cd7d79d.json", "r", encoding="utf-8") as f:
    example = json.load(f)

# create the environment, pointing it at your local virtual machine
env = DesktopEnv(
    path_to_vm=r"path_to_vm",
    action_space="computer_13",
    task_config=example
)
# reset the environment and get the initial observation
observation = env.reset()

# execute a single right-click action in the computer_13 action space
observation, reward, done, info = env.step({"action_type": "CLICK", "parameters": {"button": "right", "num_clicks": 1}})
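
After the single step above, a typical session keeps stepping until the episode ends. The sketch below continues from the env created in the Quick Start; it assumes the Gym convention that done marks the end of an episode and that the environment exposes a close() method (treat close() as an assumption if your version differs).

# continue interacting with the env created above
actions = [
    {"action_type": "CLICK", "parameters": {"button": "right", "num_clicks": 1}},
    # ... add further actions in the same computer_13 format
]
for action in actions:
    observation, reward, done, info = env.step(action)
    if done:
        break

env.close()  # assumed Gym-style cleanup of the VM session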

Annotation Tool Usage

We provide an annotation tool to help you annotate the examples.

Agent Usage

We provide a simple agent to interact with the environment. You can use it as a starting point to build your own agent.
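
The agent interface itself is not documented here, so the following is only a sketch of what such a starting point might look like: a hypothetical RandomClickAgent that maps observations to actions in the computer_13 format from the Quick Start. The class name and its predict() method are illustrative, not the package's API.

import random

class RandomClickAgent:
    """Hypothetical baseline agent: ignores the observation and emits a random click
    in the computer_13 action format shown in the Quick Start."""

    def predict(self, observation):
        return {
            "action_type": "CLICK",
            "parameters": {"button": random.choice(["left", "right"]), "num_clicks": 1},
        }

# wiring it into the environment loop from the Quick Start:
# agent = RandomClickAgent()
# observation = env.reset()
# for _ in range(10):
#     action = agent.predict(observation)
#     observation, reward, done, info = env.step(action)
#     if done:
#         break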

Roadmap of infra (proposed)

  • Explore VMware and whether it can be connected to and controlled through a mouse-control package
  • Explore whether Windows and macOS can be installed
    • macOS is closed source and cannot legally be installed
    • Windows is legally available and can be installed
  • Build a Gym-like Python interface for controlling the VM
  • Record human actions (mouse movement, clicks, keyboard) for annotation, with support for replay and compression
  • Build a simple task, e.g. open a browser, open a website, click a button, and close the browser
  • Set up a pipeline and build a zero-shot agent implementation for the task
  • Decide which tasks inside DesktopEnv to focus on, and start wrapping up the environment for public release
  • Start annotating examples for training and testing
  • Error handling during file passing, file opening, etc.
  • Add the accessibility tree from the OS to the observation space
  • Add pre-process and post-process action support for benchmark setup and evaluation
  • Multiprocess support, so that reinforcement learning can run more efficiently (see the sketch after this list)
  • Experiment logging and visualization system
  • Add more tasks, possibly scaling to 300 for v1.0.0, and create a dynamic leaderboard
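
For the multiprocess item above, one way the proposal could look in practice is sketched below: several DesktopEnv instances, each bound to its own VM copy, stepped in separate worker processes. The VM paths, the run_episode worker function, and the episode logic are assumptions for illustration, not part of the current package.

import json
from multiprocessing import Process
from desktop_env.envs.desktop_env import DesktopEnv

def run_episode(vm_path, example_path):
    # each worker owns one VM copy and one environment instance
    with open(example_path, "r", encoding="utf-8") as f:
        example = json.load(f)
    env = DesktopEnv(path_to_vm=vm_path, action_space="computer_13", task_config=example)
    observation = env.reset()
    observation, reward, done, info = env.step(
        {"action_type": "CLICK", "parameters": {"button": "right", "num_clicks": 1}}
    )

if __name__ == "__main__":
    # hypothetical per-worker VM copies; one process per VM
    vm_paths = ["path_to_vm_copy_1", "path_to_vm_copy_2"]
    example_path = "evaluation_examples/examples/gimp/f723c744-e62c-4ae6-98d1-750d3cd7d79d.json"
    workers = [Process(target=run_episode, args=(p, example_path)) for p in vm_paths]
    for w in workers:
        w.start()
    for w in workers:
        w.join()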

Roadmap of benchmark, tools, and resources (proposed)

  • Improve the annotation tool based on DuckTrack, making it more robust and better aligned with the accessibility tree
  • Annotate the steps of doing the task
  • Build a website for the project
  • Crawl all the resources we explored from the internet and make them easy to access
  • Set up ways for the community to contribute new examples

Citation

If you find this environment useful, please consider citing our work:

@article{DesktopEnv,
  title={},
  author={},
  journal={arXiv preprint arXiv:xxxx.xxxx},
  year={2024}
}
