LiteWebAgent

repo owner: Danni (Danqing) Zhang (danqing.zhang.personal@gmail.com)

main contributors: Balaji Rama (balajirw10@gmail.com), Shiying He (sy.he0303@gmail.com) and Jingyi Ni (jingyi.ni.personal@gmail.com)

Please note that the LiteWebAgent repository is in development mode. We have open-sourced the repository to foster collaboration between contributors.

📰 News

  • [2024-10-01] Completed a major refactoring of LiteWebAgent so that it can be imported as a package, making it possible to add web browsing capabilities to any AI agent.
  • [2024-09-20] We reimplemented the paper Tree Search for Language Model Agents in the LiteWebAgent framework. Now, the search agent is capable of exploring different trajectories for accomplishing web browsing tasks and returning the most promising one. This is useful for finding the optimal path to complete complex web browsing tasks in an offline manner.
  • [2024-08-22] The initial version of LiteWebAgent was released, providing a robust framework for using natural language to control a web agent.

1. QuickStart

From PyPI: https://pypi.org/project/litewebagent/

pip install litewebagent 

Then set up Playwright (this step is required) by running:

playwright install chromium

Then please create a .env file, and update your API keys:

cp .env.example .env
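
Before running an agent, it can help to confirm that the keys in .env are actually filled in. The following is a minimal sketch using only the standard library; it is not part of LiteWebAgent, and the key name OPENAI_API_KEY is a hypothetical example (use the names listed in .env.example instead).

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    # OPENAI_API_KEY is an assumed name for illustration only.
    print("Missing keys:", missing_keys(env, ["OPENAI_API_KEY"]))
```

An empty list means every key you asked about has a non-empty value.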

You are ready to go! Try FunctionCallingAgent on google.com

python examples/google_test.py

2. Development mode

(1) Installation

Set up locally

First, set up a virtual environment and install the package in editable mode so that your code can import 'litewebagent':

python3 -m venv venv
. venv/bin/activate
pip install -e .

Then please create a .env file, and update your API keys:

cp .env.example .env

(2) Try different agents

  • Use the prompting-based web agent to complete a task and save the workflow:
python -m prompting_main --agent_type PromptAgent --starting_url https://www.google.com --goal 'search dining table' --plan 'search dining table' --log_folder log
  • We also provide function-calling-based web agents:
python -m function_calling_main --agent_type FunctionCallingAgent --starting_url https://www.google.com --goal 'search dining table' --plan 'search dining table' --log_folder log
python -m function_calling_main --agent_type HighLevelPlanningAgent --starting_url https://www.google.com --goal 'search dining table' --plan 'search dining table' --log_folder log
python -m function_calling_main --agent_type ContextAwarePlanningAgent --starting_url https://www.google.com --goal 'search dining table' --plan 'search dining table' --log_folder log

https://www.loom.com/share/1018bcc4e21c4a7eb517b60c2931ee3c
https://www.loom.com/share/aa48256478714d098faac740239c9013
https://www.loom.com/share/89f5fa69b8cb49c8b6a60368ddcba103

  • Replay the workflow verified by the web agent. If you have not run the web agent on any task yet, first copy our example.json file:
cp log/flow/example.json log/flow/steps.json 

Then replay the session:

python litewebagent/action/replay.py --log_folder log
  • Enable interaction between the user and the agent:
python -m cli_main --agent_type FunctionCallingAgent --log_folder log
python -m cli_main --agent_type HighLevelPlanningAgent --log_folder log
python -m cli_main --agent_type PromptAgent --log_folder log

https://www.loom.com/share/93e3490a6d684cddb0fbefce4813902a
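
The one-shot commands in this subsection (prompting_main and function_calling_main) all take the same flags and differ only in the module and the --agent_type value. The sketch below shows one way to generate them from Python when you want to compare agents on the same task. The helper names build_command and all_commands are hypothetical and not part of the package; the module names, agent types, and flags are copied from the commands above.

```python
import shlex

# Agent types accepted by each entry point, as listed in the commands above.
AGENTS_BY_MODULE = {
    "prompting_main": ["PromptAgent"],
    "function_calling_main": [
        "FunctionCallingAgent",
        "HighLevelPlanningAgent",
        "ContextAwarePlanningAgent",
    ],
}


def build_command(module, agent_type, starting_url, goal, plan, log_folder="log"):
    """Return the argv list for one agent run (hypothetical helper)."""
    return [
        "python", "-m", module,
        "--agent_type", agent_type,
        "--starting_url", starting_url,
        "--goal", goal,
        "--plan", plan,
        "--log_folder", log_folder,
    ]


def all_commands(starting_url, goal, plan):
    """Build one command per (module, agent type) pair."""
    return [
        build_command(module, agent, starting_url, goal, plan)
        for module, agents in AGENTS_BY_MODULE.items()
        for agent in agents
    ]


if __name__ == "__main__":
    for argv in all_commands(
        "https://www.google.com", "search dining table", "search dining table"
    ):
        # shlex.join quotes arguments so the line can be pasted into a shell.
        print(shlex.join(argv))
```

Each argv list can also be passed to subprocess.run from the repository root if you prefer to launch the runs programmatically.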

(3) Test different input features

We use axtree by default. Alternatively, you can pass --features with a comma-separated string listing the desired input feature types, for example interactive_elements or axtree,interactive_elements.

python -m function_calling_main --agent_type FunctionCallingAgent --starting_url https://www.airbnb.com --goal 'set destination as San Francisco, then search the results' --plan '(1) enter the "San Francisco" as destination, (2) and click search' --log_folder log
python -m function_calling_main --agent_type FunctionCallingAgent --starting_url https://www.airbnb.com --goal 'set destination as San Francisco, then search the results' --plan '(1) enter the "San Francisco" as destination, (2) and click search' --features interactive_elements --log_folder log
python -m function_calling_main --agent_type FunctionCallingAgent --starting_url https://www.airbnb.com --goal 'set destination as San Francisco, then search the results' --plan '(1) enter the "San Francisco" as destination, (2) and click search' --features axtree,interactive_elements --log_folder log
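
To illustrate how a comma-separated --features value with an axtree default can be handled, here is a small argparse sketch. It is an assumption about one reasonable way to parse the flag, not the project's actual parser; only the flag name and the two feature names come from the commands above.

```python
import argparse


def parse_features(value: str) -> list[str]:
    """Split a comma-separated string into feature names, ignoring blanks."""
    features = [item.strip() for item in value.split(",") if item.strip()]
    # Fall back to the documented default when nothing usable was given.
    return features or ["axtree"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Feature flag sketch")
    parser.add_argument(
        "--features",
        type=parse_features,
        default=["axtree"],  # matches the default stated in this section
        help="comma-separated input feature types",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--features", "axtree,interactive_elements"])
    print(args.features)
```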

(4) search_agent

python -m search_main --agent_type PromptSearchAgent --starting_url https://www.google.com --goal 'search dining table' --plan 'search dining table' --search_algorithm 'bfs' --log_folder log

https://www.loom.com/share/986f0addf10949d88ae25cd802588a85
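
The idea behind --search_algorithm 'bfs' is to explore several trajectories and return the most promising one. The toy sketch below shows breadth-first exploration over a hand-written state tree and is not the LiteWebAgent implementation. The expand and score callables, the depth limit, and the toy data are all assumptions; in a real agent, expand would propose browser actions and score would come from an evaluator.

```python
from collections import deque


def bfs_best_trajectory(start, expand, score, max_depth):
    """Explore states breadth-first and return (best_path, best_score)."""
    best_path, best_score = [start], score(start)
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_depth:
            continue  # do not expand beyond the depth budget
        for child in expand(path[-1]):
            child_path = path + [child]
            child_score = score(child)
            if child_score > best_score:
                best_path, best_score = child_path, child_score
            queue.append(child_path)
    return best_path, best_score


# Toy page graph and scores, purely illustrative.
TOY_TREE = {
    "home": ["search box", "menu"],
    "search box": ["results"],
    "menu": ["settings"],
    "results": [],
    "settings": [],
}
TOY_SCORES = {
    "home": 0.1, "search box": 0.5, "menu": 0.2, "results": 0.9, "settings": 0.0,
}

if __name__ == "__main__":
    path, value = bfs_best_trajectory(
        "home", TOY_TREE.__getitem__, TOY_SCORES.__getitem__, max_depth=2
    )
    print(path, value)
```

With a depth budget of 2 the toy search reaches "results"; with a budget of 1 it stops at "search box", which shows how the budget bounds exploration.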

3. Paper reimplementation

Paper                                  | Agent
SoM (Set-of-Mark) Agent                | PromptAgent
Mind2Web                               | ContextAwarePlanningAgent
Tree Search for Language Model Agents  | PromptSearchAgent

4. Citing LiteWebAgent

@misc{zhang2024litewebagent,
  title={LiteWebAgent},
  author={Zhang, Danqing and Rama, Balaji and He, Shiying and Ni, Jingyi},
  journal={https://danqingz.github.io/blog/2024/08/22/LiteWebAgent.html},
  year={2024}
}
