Deterministic semantic runtime over live Chromium for LLM planners
Semantic Browser
Version 1.3.1 (Beta) · PyPI · Changelog · License: MIT
Semantic Browser turns live Chromium pages into compact semantic "rooms" for LLM planners. The planner sees a text-adventure description of the page, picks one action ID, and the runtime executes it deterministically.
@ BBC News (bbc.co.uk)
> Home page. Main content: "Top stories". Navigation: News, Sport, Weather.
! Cookie consent banner detected -> dismiss [act-a1b2c3d4-0]
1 open "News" [act-8f2a2d1c-0]
2 open "Sport" [act-c3e119fa-0]
3 fill Search BBC [act-0b9411de-0] *value
+ 28 more [more]
Less confusion, less hallucination, dramatically less cost.
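The numbered lines in the sample room above can also be read by ordinary code, which is handy for logging or testing a planner. The sketch below is inferred only from that sample; the line format is an assumption rather than a documented grammar, and `RoomAction` and `parse_room_actions` are hypothetical names that are not part of the package.

```python
import re
from dataclasses import dataclass
from typing import List

# Pattern inferred from the sample room (an assumption, not a spec), e.g.
#   1 open "News" [act-8f2a2d1c-0]
#   3 fill Search BBC [act-0b9411de-0] *value
ACTION_LINE = re.compile(
    r"^(?P<index>\d+)\s+(?P<op>\w+)\s+(?P<label>.+?)\s+"
    r"\[(?P<action_id>[^\]]+)\](?P<value>\s+\*value)?\s*$"
)


@dataclass(frozen=True)
class RoomAction:
    """One numbered action line from a room description (hypothetical type)."""
    index: int
    op: str
    label: str
    action_id: str
    needs_value: bool


def parse_room_actions(room_text: str) -> List[RoomAction]:
    """Return the numbered actions found in room_text, skipping all other lines."""
    actions = []
    for line in room_text.splitlines():
        match = ACTION_LINE.match(line.strip())
        if match is None:
            continue  # header, summary, blocker, or "+ N more" line
        actions.append(
            RoomAction(
                index=int(match["index"]),
                op=match["op"],
                label=match["label"].strip('"'),
                action_id=match["action_id"],
                needs_value=match["value"] is not None,
            )
        )
    return actions
```

In a real agent the structured `obs.available_actions` list shown in the Quickstart is likely the better source; parsing the text is mainly useful when only the room string has been stored.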
Why Semantic Browser
- Plain-text room descriptions — prose, not JSON soup.
- Curated action surface — top 25 actions, with `more` for progressive disclosure.
- Deterministic execution — observe → act → observe delta, every time.
- Built-in blockers — cookie banners, modals, and anti-bot gates are detected and signaled.
- Token-efficient — median planner input of ~540 tokens vs ~10,000 for standard tooling.
- Three interfaces — Python API, CLI, and HTTP service.
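One way a caller might use the blocker signal is to clear the blocker before spending an LLM call. This sketch assumes blockers appear in the room text as a line starting with `!` and ending in a bracketed action ID, as in the sample room near the top; that format is inferred from the sample only, and `find_blocker` is a hypothetical helper. The Planner Contract document is the authoritative description of blocker handling.

```python
import re
from typing import Optional, Tuple

# Inferred from the sample line (an assumption, not a documented format):
#   ! Cookie consent banner detected -> dismiss [act-a1b2c3d4-0]
BLOCKER_LINE = re.compile(
    r"^!\s*(?P<reason>.+?)\s*->\s*\w+\s+\[(?P<action_id>[^\]]+)\]\s*$"
)


def find_blocker(room_text: str) -> Optional[Tuple[str, str]]:
    """Return (reason, action_id) for the first blocker line, or None if there is none."""
    for line in room_text.splitlines():
        match = BLOCKER_LINE.match(line.strip())
        if match:
            return match["reason"], match["action_id"]
    return None
```

A loop could call `find_blocker(obs.planner.room_text)` first and, when it returns an ID, execute that action directly instead of asking the planner.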
Install
pip install "semantic-browser[managed]"
semantic-browser install-browser
For service mode: pip install "semantic-browser[server]"
Quickstart
Interactive portal
semantic-browser portal --url https://example.com --headless
Python
import asyncio

from semantic_browser import ManagedSession
from semantic_browser.models import ActionRequest


async def main() -> None:
    session = await ManagedSession.launch(headful=False)
    runtime = session.runtime
    await runtime.navigate("https://example.com")
    obs = await runtime.observe(mode="summary")
    print(obs.planner.room_text)
    first_link = next((a for a in obs.available_actions if a.op == "open"), None)
    if first_link:
        result = await runtime.act(ActionRequest(action_id=first_link.id))
        print(result.status, result.observation.page.url)
    await session.close()


asyncio.run(main())
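The quickstart above reaches `session.close()` only if nothing raises first. A small wrapper can guarantee cleanup. The sketch below is not part of the package: `open_session` is a hypothetical helper, and it assumes only what the quickstart shows, namely that the launcher is awaitable and the returned session has an awaitable `close()`.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def open_session(launch, **launch_kwargs):
    """Launch a session and always close it, even if the body raises.

    `launch` is any awaitable factory, for example ManagedSession.launch.
    """
    session = await launch(**launch_kwargs)
    try:
        yield session
    finally:
        await session.close()


# Usage sketch (requires semantic-browser to be installed):
#
#   async with open_session(ManagedSession.launch, headful=False) as session:
#       await session.runtime.navigate("https://example.com")
```

Passing the launcher in as an argument keeps the helper independent of the library, so it can be exercised with a fake session in tests.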
LLM Agent Loop (Minimal)
async def agent_loop(url: str, task: str) -> None:
    session = await ManagedSession.launch(headful=False)
    runtime = session.runtime
    await runtime.navigate(url)
    obs = await runtime.observe(mode="summary")
    for step in range(25):
        action_id = call_your_llm(obs.planner.room_text, task)  # returns one action ID
        if action_id == "done":
            break
        result = await runtime.act(ActionRequest(action_id=action_id))
        obs = result.observation
    await session.close()
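The loop above trusts `call_your_llm` to return exactly one action ID. In practice a model may wrap the ID in prose or invent one, so it can help to validate the reply before calling `runtime.act`. The following is a minimal sketch, not the package's own validation: `extract_action_id` is a hypothetical helper, and the set of allowed IDs is assumed to come from something like `{a.id for a in obs.available_actions}`.

```python
import re
from typing import Optional, Set

TOKEN = re.compile(r"[A-Za-z0-9_-]+")


def extract_action_id(reply: str, allowed_ids: Set[str]) -> Optional[str]:
    """Return the single allowed action ID named in reply, "done", or None.

    None means the reply was ambiguous (several IDs) or named no known ID,
    so the caller should re-prompt rather than act.
    """
    text = reply.strip()
    if text.lower() == "done":
        return "done"
    # Compare whole tokens so that a known ID never matches inside a longer string.
    hits = {token for token in TOKEN.findall(text) if token in allowed_ids}
    if len(hits) == 1:
        return hits.pop()
    return None
```

Returning `None` instead of guessing keeps execution deterministic: the runtime only ever receives an ID that was actually offered in the current room.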
Full worked examples for OpenAI, Anthropic, and more: Integration Examples
Documentation
| Document | What it covers |
|---|---|
| Getting Started | Install, first run, interactive portal, Python/CLI/service quickstarts |
| Planner Contract | The exact interface between Semantic Browser and an LLM planner — what the planner receives, what it should reply, how to handle blockers, failures, and stopping |
| Integration Examples | End-to-end examples: OpenAI chat, OpenAI function-calling, Anthropic tool use, HTTP service, CDP attach, error handling patterns |
| API Reference | Every public class, method, model, and field — ManagedSession, SemanticBrowserRuntime, Observation, StepResult, ActionDescriptor, configuration, errors |
| Runtime Modes | Decision table for ephemeral/persistent/clone/attach/service modes, headful vs headless, ownership semantics |
| Real Profiles | Using real Chromium profiles for login persistence, SSO, clone mode, safety guarantees, common pitfalls |
| Benchmark Protocol | How benchmark numbers are produced and validated |
| Versioning | Version numbering scheme |
| Publishing | PyPI publish checklist |
| Changelog | Full release history |
How It Works
Live page → extract semantic tree → group into regions → curate actions → render room text
↓
LLM planner picks action ID
↓
runtime resolves & executes
↓
observe delta → repeat
- Observe — the runtime extracts the page's semantic structure, groups it into regions, curates the top actions, and renders a text-adventure "room".
- Plan — the LLM planner reads the room text and replies with one action ID.
- Act — the runtime resolves the action to a DOM element, executes it, waits for the page to settle, and produces a delta observation.
- Repeat — the planner sees the delta and picks the next action.
Benchmarks
Cross-method comparison on a shared 25-task pack:
| Method | Success | Median planner input (tokens) | Median planner output (tokens) | Indicative cost/request (USD) |
|---|---|---|---|---|
| Standard browser tooling | 24% (6/25) | 10,118 | 74 | $0.041 |
| OpenClaw browser tooling | 72% (18/25) | 6,833 | 66 | $0.022 |
| Semantic Browser | 100% (25/25) | 540 | 14 | $0.004 |
At 5 tasks/day over a year: ~$75/year standard vs ~$7/year Semantic Browser.
These are reference harness results, not universal guarantees. Protocol: docs/benchmark_protocol.md. Manifest: benchmarks/manifest.json.
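The yearly figures quoted above can be reproduced from the table's per-request costs. The sketch assumes one planner request per task and 365 days per year; the source does not state either, but those assumptions match the quoted totals. `annual_cost` is a hypothetical helper name.

```python
def annual_cost(cost_per_request_usd: float, tasks_per_day: int = 5, days: int = 365) -> float:
    """Yearly spend, assuming one planner request per task (an assumption)."""
    return cost_per_request_usd * tasks_per_day * days


# Per-request costs taken from the benchmark table above.
standard = annual_cost(0.041)   # about 74.8 USD
semantic = annual_cost(0.004)   # about 7.3 USD
print(f"standard: ~${standard:.0f}/year, Semantic Browser: ~${semantic:.0f}/year")
```

A task that needs several planner requests would scale both figures by the same factor only if both methods used the same number of steps, which the table does not report.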
CLI Reference
semantic-browser version # Show version
semantic-browser doctor # Verify installation
semantic-browser install-browser # Download Chromium
semantic-browser launch --headless # Start a session
semantic-browser attach --cdp <ws-url> # Attach to running Chrome
semantic-browser portal --url <url> # Interactive exploration REPL
semantic-browser observe --session <id> --mode summary
semantic-browser act --session <id> --action <action_id>
semantic-browser inspect --session <id> --target <target_id>
semantic-browser navigate --session <id> --url <url>
semantic-browser back --session <id>
semantic-browser forward --session <id>
semantic-browser reload --session <id>
semantic-browser diagnostics --session <id>
semantic-browser export-trace --session <id> --out trace.json
semantic-browser serve --host 127.0.0.1 --port 8765 --api-token <token>
What's New in v1.3.0
- Framework-agnostic element discovery — AngularJS, Vue, Alpine.js, and custom elements discovered automatically.
- Fuzzy structural settle — live-updating pages (odds, tickers, chat) no longer cause timeouts.
- Stable fingerprints — action IDs use DOM id + CSS selector, not pixel position.
- Smarter locator resolution — volatile framework classes stripped from selectors.
- Robust modal detection — three-tier detection with visibility and size checks.
- Increased budgets — 25 curated actions, 2K room budget, 4K max elements.
- SPA navigation awareness — URL changes during settle are handled correctly.
Full details: CHANGELOG.md
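To illustrate why fingerprints built from a DOM id and a cleaned CSS selector stay stable across re-renders, here is a toy sketch. It is not the library's actual algorithm: the list of volatile class patterns, the SHA-256 hashing, and the `act-<8 hex>-<n>` layout (copied from the look of the sample IDs) are all assumptions, and `stabilize_selector` and `fingerprint` are hypothetical names.

```python
import hashlib
import re

# Hypothetical examples of framework-generated classes that change at runtime
# (AngularJS state classes, styled-components hashes). Not the library's list.
VOLATILE_CLASS = re.compile(r"\.(?:ng-[\w-]+|sc-[\w-]+)")


def stabilize_selector(selector: str) -> str:
    """Drop volatile class fragments so the selector survives state changes."""
    return VOLATILE_CLASS.sub("", selector)


def fingerprint(dom_id: str, selector: str, ordinal: int = 0) -> str:
    """Derive an ID from DOM id + stabilized selector, never from pixel position."""
    stable = stabilize_selector(selector)
    digest = hashlib.sha256(f"{dom_id}|{stable}".encode("utf-8")).hexdigest()[:8]
    return f"act-{digest}-{ordinal}"
```

With this scheme, `input.ng-pristine.search-box` and `input.ng-dirty.search-box` on the same element produce the same ID, whereas a position-based ID would change whenever the layout shifts.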
Contributing
See CONTRIBUTING.md for development setup and PR expectations.
License
MIT — see LICENSE.
Download files
File details
Details for the file semantic_browser-1.3.1.tar.gz.
File metadata
- Download URL: semantic_browser-1.3.1.tar.gz
- Upload date:
- Size: 102.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 76b33e7e3fffdaf24a55ce2d23f3dc01bc4c406d3be2ef0c0502e487763c8682 |
| MD5 | 902066a0adf85f67974599c5a4360054 |
| BLAKE2b-256 | 75743ca158bbfe3e539938da3df9506e4132eb2b44948fe7c18be3588dc3fca2 |
File details
Details for the file semantic_browser-1.3.1-py3-none-any.whl.
File metadata
- Download URL: semantic_browser-1.3.1-py3-none-any.whl
- Upload date:
- Size: 56.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f34ab204dbd52606ffdf463808029f515f298a77a34e5aafc1a5a664ae01861f |
| MD5 | 74d39a5eb5d18ce52b38e9b909e16fd7 |
| BLAKE2b-256 | 17fb4911a3a35ac26e1bc05d4370dc2e587e03c660709cce27244ab7410f60b3 |
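A downloaded file can be checked against the SHA256 digests listed above with the standard library alone. This is a generic sketch; `sha256_matches` is a hypothetical helper name, and the file path in the usage comment is wherever the download was saved.

```python
import hashlib
from pathlib import Path


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals expected_hex."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


# Usage sketch, with the wheel digest from the table above:
#
#   sha256_matches(
#       "semantic_browser-1.3.1-py3-none-any.whl",
#       "f34ab204dbd52606ffdf463808029f515f298a77a34e5aafc1a5a664ae01861f",
#   )
```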