OmniScout CLI (harness): local-first browser automation, semantic search, and research for AI agents

OmniScout CLI

Local-first browser automation, semantic search, and research for AI agents. No cloud APIs, no hosted browser sessions, no MCP yet, no SDK.

The CLI is the interface.

Install

Requires Python 3.11+ and Google Chrome (already installed on most macOS machines at /Applications/Google Chrome.app).

Recommended: install as a global tool

pip install omniscout
scout install                    # verifies Chrome + prefetches embedding model

After this, scout works from any directory. If you installed from a source checkout in editable mode instead, edits to source files are picked up live.

omniscout remains available as a compatibility alias.

If you don't have Chrome installed, add --bundled to scout install to also download Playwright's bundled Chromium (~190MB).

scout install also prefetches the local sentence-transformers model into OmniScout's app data directory so later commands do not need to fetch it again. Use --no-model to skip model prefetch.

Quickstart

# Search the web (DuckDuckGo HTML + local embedding rerank)
scout search "local-first browser agents"
# same command via alias:
omniscout search "local-first browser agents"

# Extract a URL to clean Markdown
scout extract https://example.com

# Capture a screenshot of a real page using your installed Chrome
scout browser screenshot https://example.com --out page.png

# Run a multi-step research pipeline (search -> crawl -> extract -> rerank -> summarize)
scout research "state of local AI agents in 2026"

# Manage persistent browser profiles (cookies, logins persist across runs)
scout profile create work
scout browser open https://news.ycombinator.com --profile work --headful

# Long-lived browser sessions (other tools can attach via CDP)
scout session start --headful
scout session list
scout session kill --all

JSON output (for agents)

Every command emits structured JSON when invoked with --json (or with OMNISCOUT_JSON=1 in the environment). Logs always go to stderr; stdout is reserved for the structured result.

OMNISCOUT_JSON=1 scout search "robotics simulators" --limit 5
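
Because stdout carries only the structured result, an agent can call the CLI as a subprocess and parse the output directly. The following is a minimal Python sketch, assuming the scout executable is on PATH. The helper name run_scout_json and its injectable runner parameter are illustrative and not part of OmniScout, and since the shape of the JSON is not specified here, the sketch simply returns whatever was parsed.

```python
import json
import os
import subprocess
from typing import Any, Callable, Sequence


def run_scout_json(
    args: Sequence[str],
    runner: Callable[..., Any] = subprocess.run,
) -> Any:
    """Run `scout <args>` with JSON output enabled and return the parsed result.

    Hypothetical helper: `runner` is injectable so the function can be
    exercised without the CLI installed.
    """
    # OMNISCOUT_JSON=1 is the documented environment equivalent of --json.
    env = dict(os.environ, OMNISCOUT_JSON="1")
    completed = runner(
        ["scout", *args],
        capture_output=True,  # keep stdout (result) and stderr (logs) separate
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
        env=env,
    )
    return json.loads(completed.stdout)
```

For example, run_scout_json(["search", "robotics simulators", "--limit", "5"]) would run the same search as the shell line above.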

Architecture

omniscout/
  app.py              # Typer root
  commands/           # CLI sub-commands (thin)
  engines/
    browser.py        # Playwright + system Chrome
    extractor.py      # trafilatura + markdownify
    crawler.py        # async httpx + Chrome fallback
    search/
      ddg.py          # DuckDuckGo HTML
      embed.py        # sentence-transformers (all-MiniLM-L6-v2)
      index.py        # embedded Qdrant on-disk
      rerank.py       # cosine rerank
      pipeline.py     # ddg | index | hybrid
    research.py       # full pipeline (search -> crawl -> extract -> rerank -> summarize)
  store/
    cache.py          # SQLite + content-hashed HTML cache
    sessions.py       # SQLite registry of browser sessions
  models.py           # pydantic result types (the JSON contract)
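
To illustrate the cosine rerank step named in the tree, here is a small dependency-free Python sketch. It is not OmniScout's rerank.py: the function names and the toy vectors are assumptions, and in practice the vectors would come from the embedding model rather than being written by hand.

```python
import math
from typing import Hashable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(
    query_vec: Sequence[float],
    docs: Sequence[tuple[Hashable, Sequence[float]]],
) -> list[tuple[Hashable, float]]:
    """Return (doc_id, score) pairs sorted by descending similarity to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
    query = [1.0, 0.0]
    candidates = [("a", [0.0, 1.0]), ("b", [1.0, 0.0]), ("c", [1.0, 1.0])]
    for doc_id, score in rerank(query, candidates):
        print(doc_id, round(score, 3))
```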

On-disk state lives under ~/Library/Application Support/omniscout/ (macOS) / $XDG_DATA_HOME/omniscout/ (Linux):

Path              Purpose
profiles/         Persistent Chrome user-data-dirs
qdrant/           Embedded vector index
sessions.sqlite   Registry of long-lived browser sessions
cache/pages/      Content-hashed HTML cache used by extract+crawler

Override via OMNISCOUT_DATA_DIR, OMNISCOUT_CONFIG_DIR, OMNISCOUT_CACHE_DIR, or settings in ~/Library/Application Support/omniscout/config.toml.
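
The lookup order for the data directory could be sketched in Python as below. This is an assumption-laden illustration rather than OmniScout's actual code: the function name resolve_data_dir is hypothetical, the environment variable is assumed to take precedence over the platform default, and the ~/.local/share fallback for an unset XDG_DATA_HOME follows the XDG Base Directory convention, which OmniScout may or may not apply.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(
    env: Optional[Mapping[str, str]] = None,
    platform: str = sys.platform,
    home: Optional[Path] = None,
) -> Path:
    """Pick the data directory: explicit override first, then platform default."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    override = env.get("OMNISCOUT_DATA_DIR")
    if override:
        return Path(override)

    if platform == "darwin":  # macOS
        return home / "Library" / "Application Support" / "omniscout"

    # Linux and others: honor XDG_DATA_HOME, else the XDG default location.
    xdg = env.get("XDG_DATA_HOME")
    base = Path(xdg) if xdg else home / ".local" / "share"
    return base / "omniscout"
```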

Configuration

config.toml example:

default_source = "ddg"           # search source default
search_limit = 10
research_results = 8
request_throttle_seconds = 1.0   # per-host throttle in the crawler
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
embedding_local_only = true      # default; never fetch model files at query time
browser_channel = "chrome"       # uses installed Google Chrome
# browser_executable = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
summary_sentences = 6

Set OMNISCOUT_EMBED_LOCAL_ONLY=0 to allow runtime Hugging Face fetches.

Why local Chrome?

Using your system Chrome (browser_channel = "chrome") gives you:

  • Real cookies, login state, extensions, and font rendering
  • No extra ~190MB Chromium download
  • The same user-agent fingerprint as your daily browsing
  • Cleaner integration with omniscout session start for long-lived sessions that other tools can attach to over CDP

If Chrome isn't available, the engine transparently falls back to Playwright's bundled Chromium.
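
The fallback behavior can be pictured with the short Python sketch below. It is an illustration under stated assumptions, not OmniScout's engine code: launch_with_fallback is a hypothetical helper that takes any launch callable, so it runs without Playwright installed. The broad except is a simplification; Playwright's own error type is playwright.sync_api.Error.

```python
from typing import Any, Callable


def launch_with_fallback(launch: Callable[..., Any], **options: Any) -> Any:
    """Try the system Chrome channel first, then the bundled Chromium."""
    try:
        return launch(channel="chrome", **options)
    except Exception:
        # Simplification: any failure triggers the fallback. Omitting
        # `channel` lets the launcher use its bundled browser instead.
        return launch(**options)
```

With Playwright's sync API this could be called as launch_with_fallback(p.chromium.launch, headless=True) inside a `with sync_playwright() as p:` block.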

Download files

Source Distribution

omniscout-0.2.0.tar.gz (89.3 kB)

Built Distribution

omniscout-0.2.0-py3-none-any.whl (117.9 kB)

Provenance

Publisher: pypi-publish.yml on sriramramnath/omniscout

