Deterministic semantic runtime over live Chromium for LLM planners

Semantic Browser

Version 1.3.2 (Beta) · PyPI · Changelog · License: MIT

Semantic Browser turns live Chromium pages into compact semantic "rooms" for LLM planners. The planner sees a text-adventure description of the page, picks one action ID, and the runtime executes it deterministically.

@ BBC News (bbc.co.uk)
> Home page. Main content: "Top stories". Navigation: News, Sport, Weather.
! Cookie consent banner detected -> dismiss [act-a1b2c3d4-0]
1 open "News" [act-8f2a2d1c-0]
2 open "Sport" [act-c3e119fa-0]
3 fill Search BBC [act-0b9411de-0] *value
+ 28 more [more]

Less confusion, less hallucination, dramatically less cost.
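Because a room is plain text with a regular line shape, a planner harness can parse it with a small regular expression, for example to check that the ID an LLM returns is actually on offer. The sketch below is not part of the library: `RoomAction`, `parse_actions`, and the pattern are hypothetical, and the line grammar is inferred only from the sample room above.

```python
import re
from dataclasses import dataclass

# Line shape inferred from the sample room only (an assumption, not a spec):
#   <index> <op> <label> [<action-id>] optionally followed by "*value"
ACTION_LINE = re.compile(
    r"^(?P<index>\d+)\s+(?P<op>\w+)\s+(?P<label>.+?)\s+"
    r"\[(?P<action_id>act-[0-9a-f]+-\d+)\](?P<needs_value>\s+\*value)?\s*$"
)


@dataclass(frozen=True)
class RoomAction:
    index: int
    op: str
    label: str
    action_id: str
    needs_value: bool


def parse_actions(room_text: str) -> list[RoomAction]:
    """Collect the numbered action lines; header, blocker and 'more' lines are skipped."""
    actions = []
    for line in room_text.splitlines():
        match = ACTION_LINE.match(line.strip())
        if match is None:
            continue
        actions.append(RoomAction(
            index=int(match["index"]),
            op=match["op"],
            label=match["label"].strip('"'),
            action_id=match["action_id"],
            needs_value=match["needs_value"] is not None,
        ))
    return actions


SAMPLE_ROOM = """\
@ BBC News (bbc.co.uk)
> Home page. Main content: "Top stories". Navigation: News, Sport, Weather.
! Cookie consent banner detected -> dismiss [act-a1b2c3d4-0]
1 open "News" [act-8f2a2d1c-0]
2 open "Sport" [act-c3e119fa-0]
3 fill Search BBC [act-0b9411de-0] *value
+ 28 more [more]
"""

for action in parse_actions(SAMPLE_ROOM):
    print(action.index, action.op, action.label, action.action_id, action.needs_value)
```

A harness could reject any planner reply whose ID is not in `{a.action_id for a in parse_actions(room_text)}` before calling the runtime.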

Why Semantic Browser

  • Plain-text room descriptions — prose, not JSON soup.
  • Curated action surface — top 25 actions, with more for progressive disclosure.
  • Deterministic execution: observe → act → observe delta, every time.
  • Built-in blockers — cookie banners, modals, and anti-bot gates are detected and signaled.
  • Token-efficient — median planner input of ~540 tokens vs ~10,000 for standard tooling.
  • Three interfaces — Python API, CLI, and HTTP service.
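The sample room earlier marks a blocker with a leading `!` and offers the dismissing action on the same line. A harness could clear such blockers before spending a planner call. A minimal sketch, assuming the `!` prefix and bracketed-ID convention seen in that one example hold in general; `first_blocker_action` is a hypothetical helper, not a library function.

```python
import re
from typing import Optional

# Assumed convention, taken from the sample room: blocker lines start with "!"
# and carry the dismissing action's ID in square brackets.
BLOCKER_ID = re.compile(r"\[(act-[0-9a-f]+-\d+)\]")


def first_blocker_action(room_text: str) -> Optional[str]:
    """Return the action ID on the first blocker line, or None if there is none."""
    for line in room_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("!"):
            continue
        match = BLOCKER_ID.search(stripped)
        if match:
            return match.group(1)
    return None
```

If the function returns an ID, the harness can pass it straight to the runtime and only consult the planner once no blocker line remains.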

Install

pip install "semantic-browser[managed]"
semantic-browser install-browser

For service mode: pip install "semantic-browser[server]"

Quickstart

Interactive portal

semantic-browser portal --url https://example.com --headless

Python

import asyncio
from semantic_browser import ManagedSession
from semantic_browser.models import ActionRequest

async def main() -> None:
    # Launch a managed, headless Chromium session.
    session = await ManagedSession.launch(headful=False)
    try:
        runtime = session.runtime

        await runtime.navigate("https://example.com")
        obs = await runtime.observe(mode="summary")
        print(obs.planner.room_text)  # the text "room" shown to the planner

        # Pick the first "open" action, if the page offers one.
        first_link = next((a for a in obs.available_actions if a.op == "open"), None)
        if first_link:
            result = await runtime.act(ActionRequest(action_id=first_link.id))
            print(result.status, result.observation.page.url)
    finally:
        # Close the browser even if a step above raises.
        await session.close()

asyncio.run(main())

LLM Agent Loop (Minimal)

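from semantic_browser import ManagedSession
from semantic_browser.models import ActionRequest

async def agent_loop(url: str, task: str) -> None:
    session = await ManagedSession.launch(headful=False)
    try:
        runtime = session.runtime
        await runtime.navigate(url)
        obs = await runtime.observe(mode="summary")

        for _ in range(25):  # hard cap on steps so the loop always terminates
            # call_your_llm is a placeholder for your own planner call;
            # it returns one action ID, or "done" to stop.
            action_id = call_your_llm(obs.planner.room_text, task)
            if action_id == "done":
                break
            result = await runtime.act(ActionRequest(action_id=action_id))
            obs = result.observation
    finally:
        # Close the browser even if a step above raises.
        await session.close()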

Full worked examples for OpenAI, Anthropic, and more: Integration Examples
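For dry runs without an API key, `call_your_llm` in the loop above can be swapped for a rule-based stand-in. The version below is purely illustrative: its matching rule (first line carrying an action ID that shares a word with the task) is an assumption chosen for simplicity, and the ID pattern is inferred from the sample room, not from a spec.

```python
import re

# ID shape inferred from the sample room (an assumption).
ACTION_ID = re.compile(r"\[(act-[0-9a-f]+-\d+)\]")


def call_your_llm(room_text: str, task: str) -> str:
    """Stand-in planner: return one action ID, or "done" if nothing matches."""
    task_words = {w.lower() for w in re.findall(r"\w+", task) if len(w) > 2}
    for line in room_text.splitlines():
        match = ACTION_ID.search(line)
        if match is None:
            continue
        # Compare the words on the line (with the ID itself removed) to the task words.
        line_words = {w.lower() for w in re.findall(r"\w+", ACTION_ID.sub("", line))}
        if task_words & line_words:
            return match.group(1)
    return "done"
```

A stub like this is only useful for exercising the loop's plumbing; it has no notion of progress and will stop as soon as no line shares a word with the task.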

Documentation

| Document | What it covers |
| --- | --- |
| Getting Started | Install, first run, interactive portal, Python/CLI/service quickstarts |
| Planner Contract | The exact interface between Semantic Browser and an LLM planner: what the planner receives, what it should reply, and how to handle blockers, failures, and stopping |
| Integration Examples | End-to-end examples: OpenAI chat, OpenAI function-calling, Anthropic tool use, HTTP service, CDP attach, error handling patterns |
| API Reference | Every public class, method, model, and field: ManagedSession, SemanticBrowserRuntime, Observation, StepResult, ActionDescriptor, configuration, errors |
| Runtime Modes | Decision table for ephemeral/persistent/clone/attach/service modes, headful vs headless, ownership semantics |
| Real Profiles | Using real Chromium profiles for login persistence, SSO, clone mode, safety guarantees, common pitfalls |
| Benchmark Protocol | How benchmark numbers are produced and validated |
| Versioning | Version numbering scheme |
| Publishing | PyPI publish checklist |
| Changelog | Full release history |

How It Works

Live page → extract semantic tree → group into regions → curate actions → render room text
                                                                              ↓
                                                              LLM planner picks action ID
                                                                              ↓
                                                              runtime resolves & executes
                                                                              ↓
                                                              observe delta → repeat
  1. Observe — the runtime extracts the page's semantic structure, groups it into regions, curates the top actions, and renders a text-adventure "room".
  2. Plan — the LLM planner reads the room text and replies with one action ID.
  3. Act — the runtime resolves the action to a DOM element, executes it, waits for the page to settle, and produces a delta observation.
  4. Repeat — the planner sees the delta and picks the next action.
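As a rough mental model only (the real delta format is not described in this overview), a delta can be pictured as the difference between the action IDs offered before and after a step. `ActionDelta` and `diff_actions` below are hypothetical names, and the runtime's actual delta may carry quite different information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionDelta:
    added: tuple[str, ...]    # IDs offered now but not before
    removed: tuple[str, ...]  # IDs offered before but gone now


def diff_actions(before: list[str], after: list[str]) -> ActionDelta:
    """Order-preserving set difference between two lists of action IDs."""
    before_set, after_set = set(before), set(after)
    return ActionDelta(
        added=tuple(a for a in after if a not in before_set),
        removed=tuple(b for b in before if b not in after_set),
    )
```

For instance, dismissing a cookie banner would plausibly show up as the dismiss action in `removed`, with the rest of the room unchanged.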

Benchmarks

Cross-method comparison on a shared 25-task pack:

| Method | Success | Median planner input (tokens) | Median planner output (tokens) | Indicative cost/request (USD) |
| --- | --- | --- | --- | --- |
| Standard browser tooling | 24% (6/25) | 10,118 | 74 | $0.041 |
| OpenClaw browser tooling | 72% (18/25) | 6,833 | 66 | $0.022 |
| Semantic Browser | 100% (25/25) | 540 | 14 | $0.004 |

At 5 tasks/day over a year: ~$75/year standard vs ~$7/year Semantic Browser.
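The yearly figures are consistent with the per-request costs in the table if one assumes a single planner request per task and a 365-day year. Both are assumptions made here to reproduce the arithmetic, and `annual_cost` is an illustrative helper:

```python
def annual_cost(cost_per_request_usd: float, tasks_per_day: int = 5, days: int = 365) -> float:
    """Yearly spend in USD, assuming one planner request per task."""
    return cost_per_request_usd * tasks_per_day * days


standard = annual_cost(0.041)  # standard browser tooling
semantic = annual_cost(0.004)  # Semantic Browser
print(f"standard: ~${round(standard)}/year, semantic: ~${round(semantic)}/year")
```

Tasks that need several planner requests would scale both figures by the same factor, so the ratio between the methods stays roughly the same.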

These are reference harness results, not universal guarantees. Protocol: docs/benchmark_protocol.md. Manifest: benchmarks/manifest.json.

CLI Reference

semantic-browser version                # Show version
semantic-browser doctor                 # Verify installation
semantic-browser install-browser        # Download Chromium
semantic-browser launch --headless      # Start a session
semantic-browser attach --cdp <ws-url>  # Attach to running Chrome
semantic-browser portal --url <url>     # Interactive exploration REPL
semantic-browser observe --session <id> --mode summary
semantic-browser act --session <id> --action <action_id>
semantic-browser inspect --session <id> --target <target_id>
semantic-browser navigate --session <id> --url <url>
semantic-browser back --session <id>
semantic-browser forward --session <id>
semantic-browser reload --session <id>
semantic-browser diagnostics --session <id>
semantic-browser export-trace --session <id> --out trace.json
semantic-browser serve --host 127.0.0.1 --port 8765 --api-token <token>

What's New in v1.3.2

  • Added GET /health endpoint — unauthenticated liveness probe for orchestrators and watchdogs.
  • Health payload includes release + runtime signal — returns {status, version, active_sessions}.
  • Service internals hardened — endpoint now uses public registry API instead of private field access.
  • Service docs corrected — diagnostics endpoint method fixed to GET in HTTP reference.

Full details: CHANGELOG.md
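A watchdog could poll the new endpoint using only the standard library. In the sketch below, the host and port come from the `serve` example in the CLI reference and the three payload keys from the release note above; the function names are illustrative, and no particular value of `status` is assumed.

```python
import json
import urllib.request

# Keys listed in the v1.3.2 release note.
EXPECTED_KEYS = {"status", "version", "active_sessions"}


def parse_health(body: bytes) -> dict:
    """Decode a /health response body and check it has the documented keys."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("health payload is not a JSON object")
    missing = EXPECTED_KEYS - payload.keys()
    if missing:
        raise ValueError(f"health payload missing keys: {sorted(missing)}")
    return payload


def fetch_health(base_url: str = "http://127.0.0.1:8765", timeout: float = 2.0) -> dict:
    """GET /health (no API token needed, per the release note) and parse the body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
        return parse_health(response.read())
```

With a service running locally, `fetch_health()["active_sessions"]` gives an orchestrator a cheap signal for deciding whether an instance is idle.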

Contributing

See CONTRIBUTING.md for development setup and PR expectations.

License

MIT — see LICENSE.
