Vision-first browser agents based on Websight-7B, a custom model
Websight
Minimal Python package for calling the Websight VLM and a thin browser Agent.
Layout
src/websight/
  __init__.py
  agent/
    agent.py       # Agent(run/execute_action)
    browser.py     # Playwright wrapper
  model/
    websight.py    # websight_call(prompt, history, image)
    actions.py     # Action + parse_action
    prompts.py     # system prompt for websight_call
  llm.py           # simple OpenRouter LLM helpers
scripts/
  manual_image_demo.py
eval/
  showdown/...
tests/
Install (editable) and run tests
uv run --frozen -- python -V # ensure Python is available
PYTHONPATH=src uv run --group test pytest -q tests
Quickstart
Programmatic use:
from websight import websight_call
# image_base64 may include the 'data:image/png;base64,' prefix or raw base64
action = websight_call(
    prompt="Click the Login button",
    history=[],  # list of (reasoning, action_str) pairs from prior steps
    image_base64="data:image/png;base64,<...>",
)
print(action.action, action.args)
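The image_base64 argument above accepts either a data URL or raw base64. The following is a minimal sketch of producing both forms from image bytes using only the standard library. The helper names to_data_url, strip_data_url_prefix, and file_to_data_url are illustrative assumptions and are not part of the websight package.

```python
import base64
from pathlib import Path


def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a 'data:<mime>;base64,...' string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def strip_data_url_prefix(value: str) -> str:
    """Return only the base64 payload, with or without a data-URL prefix."""
    if value.startswith("data:") and "," in value:
        return value.split(",", 1)[1]
    return value


def file_to_data_url(path: str) -> str:
    """Read an image file (assumed to be a PNG) and encode it as a data URL."""
    return to_data_url(Path(path).read_bytes())
```

Either the output of to_data_url or the bare payload from strip_data_url_prefix could then be passed as image_base64.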
Agent (with a real browser via Playwright):
PYTHONPATH=src uv run python websight.py --task "Go to https://example.com and click More" --show-browser
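Conceptually, an agent of this kind repeats a screenshot, model call, and action step, feeding earlier (reasoning, action_str) pairs back in as history. The sketch below illustrates that loop with injected callables so it runs without a browser or model. It is not the package's Agent implementation: StepAction, run_loop, the reasoning field, the action-string format, and the "done" stop action are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class StepAction:
    """Stand-in for the object returned by websight_call (hypothetical shape)."""
    action: str
    args: dict = field(default_factory=dict)
    reasoning: str = ""


def run_loop(
    task: str,
    take_screenshot: Callable[[], str],
    call_model: Callable[[str, List[Tuple[str, str]], str], StepAction],
    execute: Callable[[StepAction], None],
    max_steps: int = 10,
    stop_action: str = "done",  # assumed name for the terminating action
) -> List[Tuple[str, str]]:
    """Screenshot -> model -> execute, recording (reasoning, action_str) pairs."""
    history: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        image_base64 = take_screenshot()
        step = call_model(task, history, image_base64)
        # Record the step before acting so the next model call can see it.
        history.append((step.reasoning, f"{step.action}({step.args})"))
        if step.action == stop_action:
            break
        execute(step)
    return history
```

In a real run, take_screenshot and execute would be backed by the Playwright wrapper and call_model by websight_call; in a test they can be simple lambdas that return scripted actions.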
Manual image demo (no browser):
PYTHONPATH=src uv run python scripts/manual_image_demo.py \
  --image data/showdown_clicks/images/0b1c958b929acdbf.png \
  --max-new-tokens 512
Environment
Web requests to LLMs use OpenRouter. Set:
export OPENROUTER_API_KEY=...
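A missing key tends to surface only as an opaque HTTP error at request time, so it can help to check for it up front. This is a small sketch of such a check; require_env is a hypothetical helper, not something the package is known to provide.

```python
import os


def require_env(name: str, environ=None) -> str:
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before calling the LLM helpers"
        )
    return value
```

Calling require_env("OPENROUTER_API_KEY") at startup fails fast with a readable message instead of partway through an agent run.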
The Websight model (tanvirb/websight-7B) is loaded from Hugging Face via the transformers pipeline.
Packaging (src layout)
This repository is configured for a src layout with setuptools.
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"
[tool.setuptools]
packages = {find = {where = ["src"]}}
Build locally (artifacts in dist/):
uv build
Do not publish yet. When ready, you can publish with uv publish after setting the appropriate token.