Deterministic semantic runtime over live Chromium for LLM planners

Project description

Semantic Browser

Version 1.3.1 (Beta) · PyPI · Changelog · License: MIT

Semantic Browser turns live Chromium pages into compact semantic "rooms" for LLM planners. The planner sees a text-adventure description of the page, picks one action ID, and the runtime executes it deterministically.

@ BBC News (bbc.co.uk)
> Home page. Main content: "Top stories". Navigation: News, Sport, Weather.
! Cookie consent banner detected -> dismiss [act-a1b2c3d4-0]
1 open "News" [act-8f2a2d1c-0]
2 open "Sport" [act-c3e119fa-0]
3 fill Search BBC [act-0b9411de-0] *value
+ 28 more [more]
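
A harness that only has the room text (for example, when logging or replaying planner turns) can recover the action IDs with a couple of regular expressions. The sketch below assumes every room follows the line shapes in the sample above; `RoomAction`, `parse_room`, and both patterns are hypothetical helpers for illustration, not part of the library.

```python
import re
from dataclasses import dataclass

# Sample room text copied from the example above.
ROOM_TEXT = """\
@ BBC News (bbc.co.uk)
> Home page. Main content: "Top stories". Navigation: News, Sport, Weather.
! Cookie consent banner detected -> dismiss [act-a1b2c3d4-0]
1 open "News" [act-8f2a2d1c-0]
2 open "Sport" [act-c3e119fa-0]
3 fill Search BBC [act-0b9411de-0] *value
+ 28 more [more]
"""

# Assumed shape: "<index> <op> <label> [<action-id>]" plus an optional "*value" suffix.
ACTION_LINE = re.compile(r"^(\d+)\s+(\w+)\s+(.*?)\s+\[([^\]]+)\](\s+\*value)?\s*$")
# Assumed shape: "! <message> -> <op> [<action-id>]".
BLOCKER_LINE = re.compile(r"^!\s+(.*?)\s+->\s+(\w+)\s+\[([^\]]+)\]\s*$")


@dataclass
class RoomAction:
    index: int
    op: str
    label: str
    action_id: str
    needs_value: bool  # True when the line ends with "*value"


def parse_room(text: str) -> tuple[list[RoomAction], list[str]]:
    """Return (numbered actions, blocker action IDs) found in the room text."""
    actions: list[RoomAction] = []
    blockers: list[str] = []
    for line in text.splitlines():
        if m := ACTION_LINE.match(line):
            actions.append(
                RoomAction(int(m[1]), m[2], m[3].strip('"'), m[4], bool(m[5]))
            )
        elif m := BLOCKER_LINE.match(line):
            blockers.append(m[3])
    return actions, blockers


actions, blockers = parse_room(ROOM_TEXT)
for action in actions:
    print(action.index, action.op, action.label, action.action_id)
print("blockers:", blockers)
```

Handling blocker IDs separately lets a planner wrapper dismiss the cookie banner before it considers the numbered actions.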

Less confusion, less hallucination, dramatically less cost.

Why Semantic Browser

Plain-text room descriptions — prose, not JSON soup.
Curated action surface — the top 25 actions, plus a [more] action for progressive disclosure.
Deterministic execution — observe → act → observe delta, every time.
Built-in blockers — cookie banners, modals, and anti-bot gates are detected and signaled.
Token-efficient — median planner input of ~540 tokens vs ~10,000 for standard tooling.
Three interfaces — Python API, CLI, and HTTP service.

Install

pip install "semantic-browser[managed]"
semantic-browser install-browser

For service mode: pip install "semantic-browser[server]"

Quickstart

Interactive portal

semantic-browser portal --url https://example.com --headless

Python

import asyncio
from semantic_browser import ManagedSession
from semantic_browser.models import ActionRequest

async def main() -> None:
    session = await ManagedSession.launch(headful=False)
    try:
        runtime = session.runtime

        # Load the page and print its room text.
        await runtime.navigate("https://example.com")
        obs = await runtime.observe(mode="summary")
        print(obs.planner.room_text)

        # Follow the first "open" action, if the page offers one.
        first_link = next((a for a in obs.available_actions if a.op == "open"), None)
        if first_link:
            result = await runtime.act(ActionRequest(action_id=first_link.id))
            print(result.status, result.observation.page.url)
    finally:
        # Close the session even if a step above raises.
        await session.close()

asyncio.run(main())

LLM Agent Loop (Minimal)

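from semantic_browser import ManagedSession
from semantic_browser.models import ActionRequest

def call_your_llm(room_text: str, task: str) -> str:
    """Placeholder: send the room text and task to your LLM; return one action ID or "done"."""
    raise NotImplementedError

async def agent_loop(url: str, task: str) -> None:
    session = await ManagedSession.launch(headful=False)
    try:
        runtime = session.runtime
        await runtime.navigate(url)
        obs = await runtime.observe(mode="summary")

        for _ in range(25):  # hard cap on planner turns
            action_id = call_your_llm(obs.planner.room_text, task)
            if action_id == "done":
                break
            result = await runtime.act(ActionRequest(action_id=action_id))
            obs = result.observation  # the next turn plans from the new observation
    finally:
        # Close the session even if the planner or an action raises.
        await session.close()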

Full worked examples for OpenAI, Anthropic, and more: Integration Examples
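
Because the loop passes the planner's reply straight to `runtime.act`, it can help to check the reply against the IDs in the room first. The following is a minimal sketch, assuming action IDs always appear in square brackets as in the sample room; `valid_action_ids` and `validate_reply` are hypothetical helpers, not library functions.

```python
import re

# Assumption: every action ID in the room text is wrapped in square brackets,
# e.g. "[act-8f2a2d1c-0]" or "[more]".
BRACKETED_ID = re.compile(r"\[([^\]\s]+)\]")


def valid_action_ids(room_text: str) -> set[str]:
    """Collect every bracketed ID that the room offers."""
    return set(BRACKETED_ID.findall(room_text))


def validate_reply(reply: str, room_text: str) -> str:
    """Normalize a planner reply and reject IDs that are not in the room."""
    # Strip whitespace plus brackets, backticks, or quotes the model may have added.
    candidate = reply.strip().strip("[]`\"' ")
    if candidate == "done":
        return candidate
    if candidate not in valid_action_ids(room_text):
        raise ValueError(f"planner returned unknown action ID: {candidate!r}")
    return candidate


room = '1 open "News" [act-8f2a2d1c-0]\n+ 28 more [more]\n'
print(validate_reply(" [act-8f2a2d1c-0]\n", room))
```

In the loop above, the call would become `action_id = validate_reply(call_your_llm(obs.planner.room_text, task), obs.planner.room_text)`, with a retry or an abort on `ValueError`.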

Documentation

Document	What it covers
Getting Started	Install, first run, interactive portal, Python/CLI/service quickstarts
Planner Contract	The exact interface between Semantic Browser and an LLM planner — what the planner receives, what it should reply, how to handle blockers, failures, and stopping
Integration Examples	End-to-end examples: OpenAI chat, OpenAI function-calling, Anthropic tool use, HTTP service, CDP attach, error handling patterns
API Reference	Every public class, method, model, and field — `ManagedSession`, `SemanticBrowserRuntime`, `Observation`, `StepResult`, `ActionDescriptor`, configuration, errors
Runtime Modes	Decision table for ephemeral/persistent/clone/attach/service modes, headful vs headless, ownership semantics
Real Profiles	Using real Chromium profiles for login persistence, SSO, clone mode, safety guarantees, common pitfalls
Benchmark Protocol	How benchmark numbers are produced and validated
Versioning	Version numbering scheme
Publishing	PyPI publish checklist
Changelog	Full release history

How It Works

Live page → extract semantic tree → group into regions → curate actions → render room text
                                                                              ↓
                                                              LLM planner picks action ID
                                                                              ↓
                                                              runtime resolves & executes
                                                                              ↓
                                                              observe delta → repeat

Observe — the runtime extracts the page's semantic structure, groups it into regions, curates the top actions, and renders a text-adventure "room".
Plan — the LLM planner reads the room text and replies with one action ID.
Act — the runtime resolves the action to a DOM element, executes it, waits for the page to settle, and produces a delta observation.
Repeat — the planner sees the delta and picks the next action.
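
The four steps can be exercised without a browser. The sketch below swaps the real runtime for a hypothetical in-memory `FakeRuntime` (a dictionary of pages and links), so it illustrates only the control flow; unlike the real runtime, it returns the full room after each action rather than a delta.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRuntime:
    """Hypothetical in-memory stand-in for a browser runtime (not the real API)."""

    pages: dict[str, dict[str, str]]  # page name -> {action ID: next page name}
    current: str
    history: list[str] = field(default_factory=list)

    def observe(self) -> str:
        # Observe: render the current page as a header plus numbered action lines.
        lines = [f"@ {self.current}"]
        for i, action_id in enumerate(self.pages[self.current], start=1):
            lines.append(f"{i} open [{action_id}]")
        return "\n".join(lines)

    def act(self, action_id: str) -> str:
        # Act: resolve the ID on the current page, "navigate", then re-observe.
        self.current = self.pages[self.current][action_id]
        self.history.append(action_id)
        return self.observe()


def run(runtime: FakeRuntime, planner, max_steps: int = 25) -> list[str]:
    room = runtime.observe()
    for _ in range(max_steps):
        action_id = planner(room)      # Plan: exactly one action ID per turn
        if action_id == "done":
            break
        room = runtime.act(action_id)  # Act, then repeat with the new room
    return runtime.history


def first_action_planner(room: str) -> str:
    """Toy planner: take the first listed action, stop when none remain."""
    lines = room.splitlines()
    if len(lines) == 1:
        return "done"
    return lines[1].split("[")[1].rstrip("]")


site = {"home": {"act-news": "news"}, "news": {"act-story": "story"}, "story": {}}
runtime = FakeRuntime(pages=site, current="home")
history = run(runtime, first_action_planner)
print(history, runtime.current)
```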

Benchmarks

Cross-method comparison on a shared 25-task pack:

Method	Success	Median planner input (tokens)	Median planner output (tokens)	Indicative cost/request (USD)
Standard browser tooling	24% (6/25)	10,118	74	$0.041
OpenClaw browser tooling	72% (18/25)	6,833	66	$0.022
Semantic Browser	100% (25/25)	540	14	$0.004

At 5 tasks/day for a year (1,825 tasks, at the indicative per-request cost above): ~$75/year for standard tooling vs ~$7/year for Semantic Browser.

These are reference harness results, not universal guarantees. Protocol: docs/benchmark_protocol.md. Manifest: benchmarks/manifest.json.
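
The yearly figures can be reproduced from the table with a few lines of arithmetic. This sketch assumes one planner request per task, which is the reading under which the table's per-request costs yield roughly $75 and $7; `annual_cost` and its `requests_per_task` parameter are illustrative names.

```python
TASKS_PER_DAY = 5
DAYS_PER_YEAR = 365

# Indicative cost per request (USD), copied from the benchmark table.
COST_PER_REQUEST = {
    "Standard browser tooling": 0.041,
    "OpenClaw browser tooling": 0.022,
    "Semantic Browser": 0.004,
}


def annual_cost(cost_per_request: float, requests_per_task: int = 1) -> float:
    """Yearly spend; requests_per_task=1 is an assumption, not a benchmark result."""
    return TASKS_PER_DAY * DAYS_PER_YEAR * requests_per_task * cost_per_request


for method, cost in COST_PER_REQUEST.items():
    print(f"{method}: ~${annual_cost(cost):.0f}/year")
```

Raising `requests_per_task` scales every row by the same factor, so the ratio between methods stays the same.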

CLI Reference

semantic-browser version                # Show version
semantic-browser doctor                 # Verify installation
semantic-browser install-browser        # Download Chromium
semantic-browser launch --headless      # Start a session
semantic-browser attach --cdp <ws-url>  # Attach to running Chrome
semantic-browser portal --url <url>     # Interactive exploration REPL
semantic-browser observe --session <id> --mode summary
semantic-browser act --session <id> --action <action_id>
semantic-browser inspect --session <id> --target <target_id>
semantic-browser navigate --session <id> --url <url>
semantic-browser back --session <id>
semantic-browser forward --session <id>
semantic-browser reload --session <id>
semantic-browser diagnostics --session <id>
semantic-browser export-trace --session <id> --out trace.json
semantic-browser serve --host 127.0.0.1 --port 8765 --api-token <token>

What's New in v1.3.0

Framework-agnostic element discovery — AngularJS, Vue, Alpine.js, and custom elements discovered automatically.
Fuzzy structural settle — live-updating pages (odds, tickers, chat) no longer cause timeouts.
Stable fingerprints — action IDs use DOM id + CSS selector, not pixel position.
Smarter locator resolution — volatile framework classes stripped from selectors.
Robust modal detection — three-tier detection with visibility and size checks.
Increased budgets — 25 curated actions, 2K room budget, 4K max elements.
SPA navigation awareness — URL changes during settle are handled correctly.

Full details: CHANGELOG.md

Contributing

See CONTRIBUTING.md for development setup and PR expectations.

License

MIT — see LICENSE.

Project details

Release history

1.3.2 (Apr 19, 2026)
1.3.1 (Apr 16, 2026), this version
1.3.0 (Apr 6, 2026)
1.2.0 (Mar 27, 2026)
1.1.0 (Mar 12, 2026)
1.0.0 (Mar 12, 2026)

Download files

Source Distribution

semantic_browser-1.3.1.tar.gz (102.8 kB)

Uploaded Apr 16, 2026 Source

Built Distribution

semantic_browser-1.3.1-py3-none-any.whl (56.7 kB)

Uploaded Apr 16, 2026 Python 3

File details

Details for the file semantic_browser-1.3.1.tar.gz.

File metadata

Download URL: semantic_browser-1.3.1.tar.gz
Upload date: Apr 16, 2026
Size: 102.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for semantic_browser-1.3.1.tar.gz
Algorithm	Hash digest
SHA256	`76b33e7e3fffdaf24a55ce2d23f3dc01bc4c406d3be2ef0c0502e487763c8682`
MD5	`902066a0adf85f67974599c5a4360054`
BLAKE2b-256	`75743ca158bbfe3e539938da3df9506e4132eb2b44948fe7c18be3588dc3fca2`

File details

Details for the file semantic_browser-1.3.1-py3-none-any.whl.

File metadata

Download URL: semantic_browser-1.3.1-py3-none-any.whl
Upload date: Apr 16, 2026
Size: 56.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for semantic_browser-1.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f34ab204dbd52606ffdf463808029f515f298a77a34e5aafc1a5a664ae01861f`
MD5	`74d39a5eb5d18ce52b38e9b909e16fd7`
BLAKE2b-256	`17fb4911a3a35ac26e1bc05d4370dc2e587e03c660709cce27244ab7410f60b3`
