webtask

LLM-powered web automation library with autonomous agents and natural language selectors.


What it does

Three ways to use it:

  • High-level - Give it a task, let it figure out the steps
  • Step-by-step - Execute tasks one step at a time for debugging/control
  • Low-level - Tell it exactly what to do with natural language selectors

Uses LLMs to understand pages, plan actions, and select elements. Browser control is built on Playwright.


Quick look

Setup:

from webtask import Webtask
from webtask.integrations.llm import GeminiLLM

# Create Webtask manager (browser launches lazily)
wt = Webtask()

# Choose your LLM (Gemini or OpenAI)
llm = GeminiLLM.create(model="gemini-2.5-flash")

# Create agent
agent = await wt.create_agent(llm=llm)

High-level autonomous:

# Agent figures out the steps
result = await agent.execute("search for cats and click the first result")
print(f"Completed: {result.completed}")
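The snippets here use await at the top level, so in a plain script they have to live inside a coroutine. Below is a minimal sketch of one way to wrap the high-level call. It relies only on agent.execute(...) and result.completed as shown above; run_task and StubAgent are hypothetical names made up for this example, and StubAgent is a stand-in so the sketch runs without a browser or an API key.

```python
import asyncio
from types import SimpleNamespace


async def run_task(agent, task: str) -> bool:
    """Run one high-level task and report whether the agent finished it."""
    result = await agent.execute(task)
    return bool(result.completed)


class StubAgent:
    """Hypothetical stand-in for a real agent; it never opens a browser."""

    async def execute(self, task: str):
        # Pretend every task succeeds so the wrapper can be exercised offline.
        return SimpleNamespace(completed=True, task=task)


if __name__ == "__main__":
    # With the real library, build `agent` as in the Setup snippet instead.
    print(asyncio.run(run_task(StubAgent(), "search for cats")))
```

With a real agent, the Setup code would move into the same coroutine that calls run_task.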

Step-by-step execution:

# Execute task one step at a time
agent.set_task("add 2 items to cart")

for i in range(10):
    step = await agent.run_step()

    print(f"Step {i+1}: {len(step.proposals)} actions")
    print(f"Verification: {step.verification.message}")

    if step.verification.complete:
        break

# Useful for debugging, progress tracking, or custom control flow
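As an example of custom control flow, the loop above could be folded into a helper that enforces a step budget and keeps the verification messages. This is only a sketch: it uses set_task, run_step, and the step.verification fields exactly as they appear in the snippet above, while run_with_budget and ScriptedAgent are hypothetical names, the latter a stub that lets the helper run offline.

```python
import asyncio
from types import SimpleNamespace


async def run_with_budget(agent, task: str, max_steps: int = 10):
    """Drive the agent step by step; return (completed, verification messages)."""
    agent.set_task(task)
    messages = []
    for _ in range(max_steps):
        step = await agent.run_step()
        messages.append(step.verification.message)
        if step.verification.complete:
            return True, messages
    # Budget exhausted without the verifier reporting completion.
    return False, messages


class ScriptedAgent:
    """Hypothetical stub that reports completion on its Nth step."""

    def __init__(self, finish_on: int):
        self.finish_on = finish_on
        self.calls = 0

    def set_task(self, task: str) -> None:
        self.task = task
        self.calls = 0

    async def run_step(self):
        self.calls += 1
        done = self.calls >= self.finish_on
        verification = SimpleNamespace(complete=done, message=f"step {self.calls}")
        return SimpleNamespace(proposals=[], verification=verification)
```

Returning the messages alongside the flag makes it easy to log progress or inspect why a task stalled.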

Low-level imperative:

# You control the steps, agent handles the selectors
await agent.navigate("https://google.com")

search_box = await agent.select("search box")
await search_box.fill("cats")

button = await agent.select("search button")
await button.click()

# Wait for page to stabilize
await agent.wait_for_idle()

# Take screenshot
await agent.screenshot("result.png")

No CSS selectors. No XPath. Just describe what you want.
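The low-level calls compose into reusable helpers. Here is a rough sketch that bundles the search flow above into one coroutine. It calls only the methods shown in the snippet (navigate, select, fill, click, wait_for_idle, screenshot); the search helper and the Recording* classes are hypothetical, and the recording stubs only log calls so the flow can be checked without a browser.

```python
import asyncio
from typing import Optional


async def search(agent, url: str, query: str, shot_path: Optional[str] = None) -> None:
    """Open a page, submit a query, and optionally save a screenshot."""
    await agent.navigate(url)
    box = await agent.select("search box")
    await box.fill(query)
    button = await agent.select("search button")
    await button.click()
    await agent.wait_for_idle()
    if shot_path is not None:
        await agent.screenshot(shot_path)


class RecordingElement:
    """Hypothetical stub element that logs interactions instead of performing them."""

    def __init__(self, log, description):
        self.log = log
        self.description = description

    async def fill(self, text):
        self.log.append(("fill", self.description, text))

    async def click(self):
        self.log.append(("click", self.description))


class RecordingAgent:
    """Hypothetical stub agent that records calls in order."""

    def __init__(self):
        self.log = []

    async def navigate(self, url):
        self.log.append(("navigate", url))

    async def select(self, description):
        return RecordingElement(self.log, description)

    async def wait_for_idle(self):
        self.log.append(("wait_for_idle",))

    async def screenshot(self, path):
        self.log.append(("screenshot", path))
```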


How it works

High-level mode - The agent loop:

  1. Proposer looks at the page and the task, then decides the next action
  2. Executer runs it (click, type, navigate, etc.)
  3. Verifier checks whether the task is complete
  4. Repeat until done
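The loop above can be sketched in a few lines. This is a toy illustration of the propose, execute, verify cycle, not the library's actual implementation: agent_loop, LoopResult, and the three callables are hypothetical, and the real components are LLM-backed and asynchronous.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    completed: bool
    actions: List[str] = field(default_factory=list)


def agent_loop(
    propose: Callable[[str, List[str]], str],
    execute: Callable[[str], None],
    verify: Callable[[str, List[str]], bool],
    task: str,
    max_steps: int = 10,
) -> LoopResult:
    """Repeat propose -> execute -> verify until done or out of steps."""
    history: List[str] = []
    for _ in range(max_steps):
        action = propose(task, history)  # 1. decide the next action
        execute(action)                  # 2. run it
        history.append(action)
        if verify(task, history):        # 3. check whether the task is complete
            return LoopResult(True, history)
    return LoopResult(False, history)    # gave up after max_steps


# Toy components: "click" two items, then report success.
clicked: List[str] = []
result = agent_loop(
    propose=lambda task, history: f"click item-{len(history)}",
    execute=clicked.append,
    verify=lambda task, history: len(history) >= 2,
    task="add 2 items to cart",
)
```

The max_steps cap is an assumption added here so that a verifier that never says "done" cannot loop forever.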

Step-by-step mode - Same as high-level but you control the loop:

  • agent.set_task(description) - Set the task
  • agent.run_step() - Execute one step (propose → execute → verify)
  • agent.clear_history() - Reset for new task

Low-level mode - You call methods directly:

  • agent.navigate(url) - Go to a page
  • agent.select(description) - Find element by natural language
  • element.click(), element.fill(text), element.type(text) - Interact with elements
  • agent.wait(seconds) - Wait for specific duration
  • agent.wait_for_idle() - Wait for network/DOM to stabilize
  • agent.screenshot(path) - Capture page screenshot

All modes share the same core: the LLM sees a cleaned DOM with element IDs like button-0 instead of raw HTML. Clean input, clean output.
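To give a feel for IDs like button-0, here is a small sketch that walks an HTML string and numbers interactive elements per tag. It is an assumption about the general idea, not webtask's actual DOM cleaning: the tag set, ElementIndexer, and index_elements are all hypothetical, and it uses only the standard library's html.parser.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed set of tags worth exposing to the LLM; the real library may differ.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class ElementIndexer(HTMLParser):
    """Assign IDs of the form <tag>-<n> to interactive elements in document order."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.elements = {}  # element ID -> attribute dict

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        element_id = f"{tag}-{self.counts[tag]}"
        self.counts[tag] += 1
        self.elements[element_id] = dict(attrs)


def index_elements(html: str) -> dict:
    """Return a mapping from generated element IDs to their attributes."""
    indexer = ElementIndexer()
    indexer.feed(html)
    indexer.close()
    return indexer.elements


page = '<div><button class="go">Go</button><input name="q"><button>Stop</button></div>'
ids = index_elements(page)
```

Short, stable IDs like these give the model something compact to point at, and the caller can map them back to real elements.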


Status

🚧 Work in progress

Core implementation complete. See TODO for testing plan and future work.


Benchmarks

Evaluate webtask on standard web agent benchmarks:

webtask-benchmarks - Evaluation framework for Mind2Web and other benchmarks


Install

pip install pywebtask
playwright install chromium
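Note that the package installs as pywebtask but is imported as webtask (see the Setup snippet). A quick, hedged way to confirm both Python packages are importable is sketched below; check_install is a hypothetical helper, and it does not verify that the Chromium browser binary itself was downloaded.

```python
from importlib.util import find_spec


def check_install(modules=("webtask", "playwright")) -> dict:
    """Report which top-level modules can be found on the import path."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'ok' if found else 'missing'}")
```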

License

MIT
