Maintainers

stevewang2000

webtask

LLM-powered web automation library with autonomous agents and natural language selectors.

📚 Documentation | 🐍 PyPI | 📊 Benchmarks

What it does

Three ways to use it:

High-level - Give it a task, let it figure out the steps
Step-by-step - Execute tasks one step at a time for debugging/control
Low-level - Tell it exactly what to do with natural language selectors

Uses multimodal LLMs (GPT-4 Vision, Gemini 2.5) to understand pages both visually and through the DOM. By default it sends screenshots annotated with bounding boxes for better accuracy. Browser control is built on Playwright.

Quick look

Setup:

from webtask import Webtask
from webtask.integrations.llm import GeminiLLM

# Create Webtask manager (browser launches lazily)
wt = Webtask()

# Choose your LLM (Gemini or OpenAI)
llm = GeminiLLM.create(model="gemini-2.5-flash")

# Create agent (screenshots with bounding boxes enabled by default)
agent = await wt.create_agent(llm=llm)

# Or disable screenshots for faster/cheaper operation
# agent = await wt.create_agent(llm=llm, use_screenshot=False)
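
The snippets in this README use `await` at the top level, which works in a notebook or an async REPL but not in a plain script. A minimal sketch of one way to wrap them in a coroutine, using only the calls shown in this README (`Webtask()`, `GeminiLLM.create`, `wt.create_agent`, `agent.execute`, `result.completed`); the `main` and `run` names and the lazy imports are illustrative choices, not part of the library:

```python
import asyncio


async def main(task):
    # Imported inside the coroutine so this module can still be loaded
    # where pywebtask is not installed (an illustrative choice).
    from webtask import Webtask
    from webtask.integrations.llm import GeminiLLM

    wt = Webtask()
    llm = GeminiLLM.create(model="gemini-2.5-flash")
    agent = await wt.create_agent(llm=llm)

    result = await agent.execute(task)
    return result.completed


def run(task):
    """Synchronous entry point for scripts: starts an event loop and runs main()."""
    return asyncio.run(main(task))
```

In a script, `run("search for cats and click the first result")` would then drive the whole flow, assuming the API key is set as described under Installation.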

High-level autonomous:

# Agent figures out the steps
result = await agent.execute("search for cats and click the first result")
print(f"Completed: {result.completed}")

Step-by-step execution:

# Execute task one step at a time
agent.set_task("add 2 items to cart")

for i in range(10):
    step = await agent.run_step()

    print(f"Step {i+1}: {len(step.proposal.actions)} actions")
    print(f"Status: {step.proposal.message}")

    if step.proposal.complete:
        break

# Useful for debugging, progress tracking, or custom control flow
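
The loop above can be packaged as a reusable helper with a step budget. This is a sketch under assumptions: it relies only on `agent.set_task`, `agent.run_step`, `step.proposal.message` and `step.proposal.complete` as shown above, while `run_with_budget` and `FakeAgent` are hypothetical names introduced here so the example runs without a browser or an LLM.

```python
import asyncio
from types import SimpleNamespace


async def run_with_budget(agent, task, max_steps=10):
    """Run `task` step by step; stop when complete or when the budget runs out.

    Returns (completed, messages), with one status message per executed step.
    """
    agent.set_task(task)
    messages = []
    for _ in range(max_steps):
        step = await agent.run_step()
        messages.append(step.proposal.message)
        if step.proposal.complete:
            return True, messages
    return False, messages


class FakeAgent:
    """Hypothetical stand-in that reports completion after a fixed number of steps."""

    def __init__(self, steps_needed):
        self.steps_needed = steps_needed
        self.steps_taken = 0

    def set_task(self, description):
        self.steps_taken = 0

    async def run_step(self):
        self.steps_taken += 1
        proposal = SimpleNamespace(
            actions=[],
            message=f"step {self.steps_taken}",
            complete=self.steps_taken >= self.steps_needed,
        )
        return SimpleNamespace(proposal=proposal)


done, log = asyncio.run(run_with_budget(FakeAgent(steps_needed=3), "add 2 items to cart"))
print(done, log)
```

Returning the collected messages alongside the completion flag keeps progress reporting out of the loop itself, so the caller decides whether to print, log, or discard them.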

Low-level imperative:

# You control the steps, agent handles the selectors
await agent.navigate("https://google.com")

search_box = await agent.select("search box")
await search_box.fill("cats")

button = await agent.select("search button")
await button.click()

# Wait for page to stabilize
await agent.wait_for_idle()

# Take screenshot
await agent.screenshot("result.png")

No CSS selectors. No XPath. Just describe what you want.

How it works

High-level mode - The agent loop:

1. Proposer looks at the page (text DOM plus a screenshot with bounding boxes) and the task, proposes the next actions, and checks whether the task is complete
2. Executer runs the actions (navigate, click, fill, type)
3. Repeat until the task is complete

The agent sees both text (DOM tree with element IDs) and visual context (screenshot with labeled bounding boxes) for more accurate understanding.
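
The propose/execute cycle described above can be sketched in plain Python. This is not the library's implementation: `Proposal`, `agent_loop`, and the `observe`/`propose`/`execute` callables are hypothetical names, and a toy counter stands in for the page so the loop can run without a browser or an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    actions: list = field(default_factory=list)
    complete: bool = False
    message: str = ""


def agent_loop(observe, propose, execute, task, max_steps=10):
    """Alternate propose -> execute until the proposer reports completion."""
    history = []
    for _ in range(max_steps):
        page = observe()  # stands in for the DOM text plus screenshot
        proposal = propose(page, task, history)
        for action in proposal.actions:
            execute(action)
        history.append(proposal)
        if proposal.complete:
            break
    return history


# Toy "page": a counter that must reach 2.
state = {"count": 0}


def observe():
    return dict(state)


def propose(page, task, history):
    if page["count"] >= 2:
        return Proposal(complete=True, message="done")
    return Proposal(actions=["click"], message="clicking")


def execute(action):
    if action == "click":
        state["count"] += 1


history = agent_loop(observe, propose, execute, "click twice")
print([p.message for p in history])
```

Note that the proposer both suggests actions and decides completion in the same call, which mirrors the description above; the final step carries no actions and only reports that the task is done.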

Step-by-step mode - Same as high-level but you control the loop:

agent.set_task(description) - Set the task
agent.run_step() - Execute one step (propose → execute)
Setting a new task automatically resets history

Low-level mode - You call methods directly:

agent.navigate(url) - Go to a page
agent.select(description) - Find element by natural language
element.click(), element.fill(text), element.type(text) - Interact with elements
agent.wait(seconds) - Wait for specific duration
agent.wait_for_idle() - Wait for network/DOM to stabilize
agent.screenshot(path) - Capture page screenshot

All modes use the same core: LLM sees cleaned DOM representation plus screenshots with bounding boxes for accurate understanding. No CSS selectors, no XPath - just natural language.
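
To make the idea of "element IDs plus labeled bounding boxes" concrete, here is a small illustration of how interactive elements could be given IDs that appear both in a text listing and as box labels. The `Element` type, the `tag-index` ID scheme, and the line format are all assumptions made for this sketch; the library's actual representation may differ.

```python
from dataclasses import dataclass


@dataclass
class Element:
    tag: str
    text: str
    box: tuple  # (x, y, width, height) in pixels


def label_elements(elements):
    """Assign one ID per element; return (text_lines, boxes_by_id)."""
    counters = {}
    lines, boxes = [], {}
    for el in elements:
        index = counters.get(el.tag, 0)
        counters[el.tag] = index + 1
        element_id = f"{el.tag}-{index}"
        lines.append(f"[{element_id}] {el.text}")  # text side, sent as the DOM listing
        boxes[element_id] = el.box                 # visual side, drawn on the screenshot
    return lines, boxes


lines, boxes = label_elements([
    Element("input", "Search", (100, 40, 300, 32)),
    Element("button", "Google Search", (120, 90, 120, 36)),
    Element("button", "I'm Feeling Lucky", (260, 90, 140, 36)),
])
print("\n".join(lines))
```

Because the same ID shows up in both views, a model's answer such as `button-0` can be mapped back to a single element without any CSS selector or XPath.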

Installation

pip install pywebtask
playwright install chromium

Set up your API key:

export GEMINI_API_KEY="your-api-key"  # or OPENAI_API_KEY
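
Before creating an LLM it can help to check which key is actually present. A minimal sketch, assuming only the two environment variable names above; `detect_provider` is a hypothetical helper, and preferring Gemini when both keys are set is an arbitrary choice made for the example:

```python
import os


def detect_provider(env=None):
    """Return "gemini" or "openai" depending on which API key is set, else None."""
    env = os.environ if env is None else env
    if env.get("GEMINI_API_KEY"):
        return "gemini"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None


provider = detect_provider()
if provider is None:
    print("Set GEMINI_API_KEY or OPENAI_API_KEY before creating an agent")
else:
    print(f"Using {provider}")
```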

Documentation

📚 Full Documentation

Getting Started - Installation and first steps
Examples - Complete code examples
API Reference - Detailed API documentation
Architecture - How it works internally

Benchmarks

Evaluate webtask on standard web agent benchmarks:

webtask-benchmarks - Evaluation framework for Mind2Web and other benchmarks

Contributing

See TODO.md for planned features and improvements.

Contributions welcome! Open an issue or submit a PR.

License

MIT

Release history

Version	Release date
0.27.0	Dec 13, 2025
0.26.0	Dec 6, 2025
0.25.1	Dec 6, 2025
0.24.0	Dec 6, 2025
0.23.2	Dec 5, 2025
0.23.1	Dec 3, 2025
0.23.0	Dec 3, 2025
0.22.1	Dec 3, 2025
0.22.0	Dec 3, 2025
0.21.4	Dec 3, 2025
0.21.3	Dec 3, 2025
0.21.2	Dec 3, 2025
0.21.0	Dec 3, 2025
0.20.0	Nov 23, 2025
0.19.2	Nov 23, 2025
0.19.1	Nov 23, 2025
0.19.0	Nov 21, 2025
0.18.0	Nov 21, 2025
0.17.7	Nov 20, 2025
0.17.6	Nov 20, 2025
0.17.5	Nov 20, 2025
0.17.4	Nov 20, 2025
0.17.3	Nov 20, 2025
0.17.2	Nov 20, 2025
0.17.1	Nov 20, 2025
0.16.0	Nov 19, 2025
0.15.4	Nov 17, 2025
0.15.3	Nov 16, 2025
0.15.2	Nov 16, 2025
0.15.1	Nov 16, 2025
0.15.0	Nov 16, 2025
0.14.0	Nov 15, 2025
0.13.0	Nov 14, 2025
0.12.3	Nov 10, 2025
0.12.2	Nov 10, 2025
0.12.1	Nov 10, 2025
0.12.0	Nov 10, 2025 (this version)
0.11.0	Nov 9, 2025
0.10.0	Nov 9, 2025
0.9.6	Nov 5, 2025
0.9.5	Nov 5, 2025
0.9.4	Nov 5, 2025
0.9.3	Nov 5, 2025
0.9.2	Nov 4, 2025
0.9.0	Nov 4, 2025
0.8.11	Nov 4, 2025
0.8.10	Nov 4, 2025
0.8.9	Nov 4, 2025
0.8.8	Nov 4, 2025
0.8.7	Nov 3, 2025
0.8.5	Nov 3, 2025
0.8.4	Nov 3, 2025
0.8.3	Nov 3, 2025
0.8.1	Nov 3, 2025
0.8.0	Nov 2, 2025
0.7.0	Nov 1, 2025
0.6.1	Nov 1, 2025
0.6.0	Nov 1, 2025
0.5.2	Oct 31, 2025
0.5.1	Oct 31, 2025
0.5.0	Oct 31, 2025
0.4.1	Oct 29, 2025
0.3.0	Oct 29, 2025
0.2.1	Oct 24, 2025
0.2.0	Oct 23, 2025
0.1.1	Oct 23, 2025
0.1.0	Oct 23, 2025

Download files

Source Distribution

pywebtask-0.12.0.tar.gz (57.2 kB)

Uploaded Nov 10, 2025 Source

Built Distribution

pywebtask-0.12.0-py3-none-any.whl (91.8 kB)

Uploaded Nov 10, 2025 Python 3

File details

Details for the file pywebtask-0.12.0.tar.gz.

File metadata

Download URL: pywebtask-0.12.0.tar.gz
Upload date: Nov 10, 2025
Size: 57.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pywebtask-0.12.0.tar.gz
Algorithm	Hash digest
SHA256	`ac50276650844c0996f17b8c05cd46f1ca564eedeabc1cce0c5cc0d255157faa`
MD5	`596dcb5343b13e6ac1ef1f61d033f700`
BLAKE2b-256	`0656c6a78c773084bb3917252bfdd0d6aa01e93eb2907aa9a2b6ea7d21fedbb2`
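
A downloaded file can be checked against the SHA256 digest listed above using Python's standard `hashlib`. This is a minimal sketch; `sha256_of` and `matches` are helper names introduced here, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 equals the expected hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Digest published above for the source distribution.
EXPECTED_SDIST_SHA256 = "ac50276650844c0996f17b8c05cd46f1ca564eedeabc1cce0c5cc0d255157faa"

# Usage, assuming the archive sits in the current directory:
# matches("pywebtask-0.12.0.tar.gz", EXPECTED_SDIST_SHA256)
```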

Provenance

The following attestation bundles were made for pywebtask-0.12.0.tar.gz:

Publisher: publish.yml on steve-z-wang/webtask

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pywebtask-0.12.0.tar.gz
- Subject digest: ac50276650844c0996f17b8c05cd46f1ca564eedeabc1cce0c5cc0d255157faa
- Sigstore transparency entry: 685929200
- Sigstore integration time: Nov 10, 2025
Source repository:
- Permalink: steve-z-wang/webtask@4c5a3776aacd5750677de7a1bc70e4d72d79beb3
- Branch / Tag: refs/heads/main
- Owner: https://github.com/steve-z-wang
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4c5a3776aacd5750677de7a1bc70e4d72d79beb3
- Trigger Event: push

File details

Details for the file pywebtask-0.12.0-py3-none-any.whl.

File metadata

Download URL: pywebtask-0.12.0-py3-none-any.whl
Upload date: Nov 10, 2025
Size: 91.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pywebtask-0.12.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5c5b242267770091e2cd1a0fd0eb76ccdd980f017c7d055b87deb9d41f5f68d7`
MD5	`a2e9be907e29116cf8046617b655eb63`
BLAKE2b-256	`065e6475bb663241ad7de8e317102449daeb9db883c68424cd4ee36f902643f6`

Provenance

The following attestation bundles were made for pywebtask-0.12.0-py3-none-any.whl:

Publisher: publish.yml on steve-z-wang/webtask

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pywebtask-0.12.0-py3-none-any.whl
- Subject digest: 5c5b242267770091e2cd1a0fd0eb76ccdd980f017c7d055b87deb9d41f5f68d7
- Sigstore transparency entry: 685929201
- Sigstore integration time: Nov 10, 2025
Source repository:
- Permalink: steve-z-wang/webtask@4c5a3776aacd5750677de7a1bc70e4d72d79beb3
- Branch / Tag: refs/heads/main
- Owner: https://github.com/steve-z-wang
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4c5a3776aacd5750677de7a1bc70e4d72d79beb3
- Trigger Event: push
