OmniScout CLI (harness): local-first browser automation, semantic search, and research for AI agents

OmniScout CLI

Local-first browser automation, semantic search, and research for AI agents. No cloud APIs, no hosted browser sessions, no MCP yet, no SDK.

The CLI is the interface.

Install

Requires Python 3.11+ and Google Chrome (already installed on most macOS machines at /Applications/Google Chrome.app).

Recommended: install as a global tool

pip install omniscout
scout install                    # verifies Chrome + prefetches embedding model

After this, scout works from any directory. If you installed from a source checkout in editable mode, edits to source files are picked up live.

The omniscout command remains available as a compatibility alias.

If you don't have Chrome installed, add --bundled to scout install to also download Playwright's bundled Chromium (~190MB).

scout install also prefetches the local sentence-transformers model into OmniScout's app data directory so later commands do not need to fetch it again. Use --no-model to skip model prefetch.

Quickstart

# Search the web (DuckDuckGo HTML + local embedding rerank)
scout search "local-first browser agents"
# same command via the compatibility alias:
omniscout search "local-first browser agents"

# Extract a URL to clean Markdown
scout extract https://example.com

# Capture a screenshot of a real page using your installed Chrome
scout browser screenshot https://example.com --out page.png

# Run a multi-step research pipeline (search -> crawl -> extract -> rerank -> summarize)
scout research "state of local AI agents in 2026"

# Manage persistent browser profiles (cookies, logins persist across runs)
scout profile create work
scout browser open https://news.ycombinator.com --profile work --headful

# Long-lived browser sessions (other tools can attach via CDP)
scout session start --headful
scout session list
scout session kill --all

JSON output (for agents)

Every command emits structured JSON when invoked with --json (or with OMNISCOUT_JSON=1 in the environment). Logs always go to stderr; stdout is reserved for the structured result.

OMNISCOUT_JSON=1 scout search "robotics simulators" --limit 5
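
Because stdout carries only the structured result, an agent can call the CLI as a subprocess and parse stdout directly. The following is a minimal sketch, not part of OmniScout: the helper name run_scout_json is hypothetical, the shape of the returned JSON is not assumed, and treating a non-zero exit code as failure is an assumption.

```python
import json
import os
import subprocess


def run_scout_json(args, executable="scout"):
    """Run a CLI command with OMNISCOUT_JSON=1 and parse its stdout as JSON.

    Hypothetical helper: assumes a non-zero exit code signals failure.
    """
    env = dict(os.environ, OMNISCOUT_JSON="1")
    proc = subprocess.run(
        [executable, *args],
        env=env,
        capture_output=True,  # keep stdout (result) and stderr (logs) separate
        text=True,
        check=False,
    )
    if proc.returncode != 0:
        raise RuntimeError(
            f"command failed ({proc.returncode}): {proc.stderr.strip()}"
        )
    return json.loads(proc.stdout)


# Example call (requires scout on PATH):
# result = run_scout_json(["search", "robotics simulators", "--limit", "5"])
```

Keeping stderr out of the parse step means log lines never corrupt the JSON document.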

Architecture

omniscout/
  app.py              # Typer root
  commands/           # CLI sub-commands (thin)
  engines/
    browser.py        # Playwright + system Chrome
    extractor.py      # trafilatura + markdownify
    crawler.py        # async httpx + Chrome fallback
    search/
      ddg.py          # DuckDuckGo HTML
      embed.py        # sentence-transformers (all-MiniLM-L6-v2)
      index.py        # embedded Qdrant on-disk
      rerank.py       # cosine rerank
      pipeline.py     # ddg | index | hybrid
    research.py       # full pipeline (search -> crawl -> extract -> rerank -> summarize)
  store/
    cache.py          # SQLite + content-hashed HTML cache
    sessions.py       # SQLite registry of browser sessions
  models.py           # pydantic result types (the JSON contract)
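
The rerank step in the tree above is described as a cosine rerank. As an illustration of that idea only (this is not the code in rerank.py, and the function names are hypothetical), candidates can be scored by cosine similarity against the query embedding and sorted in descending order:

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query_vec, candidates):
    """Sort (item, vector) pairs by similarity to query_vec, best first.

    Returns a list of (item, score) pairs.
    """
    scored = [(item, cosine(query_vec, vec)) for item, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice the vectors would come from the embedding model; the sketch uses plain lists so it runs without extra dependencies.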

On-disk state lives under ~/Library/Application Support/omniscout/ (macOS) / $XDG_DATA_HOME/omniscout/ (Linux):

Path              Purpose
profiles/         Persistent Chrome user-data-dirs
qdrant/           Embedded vector index
sessions.sqlite   Registry of long-lived browser sessions
cache/pages/      Content-hashed HTML cache used by extract + crawler
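
To illustrate what a content-hashed HTML cache can look like, here is a minimal sketch. It is an assumption-laden illustration, not the code in store/cache.py: the class name PageCache, the cache.sqlite file name, the table layout, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import sqlite3
from pathlib import Path


class PageCache:
    """Store each HTML body once, keyed by the SHA-256 of its content."""

    def __init__(self, root):
        self.pages = Path(root) / "pages"
        self.pages.mkdir(parents=True, exist_ok=True)
        # SQLite maps URLs to content hashes (hypothetical schema).
        self.db = sqlite3.connect(str(Path(root) / "cache.sqlite"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pages "
            "(url TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
        )

    def put(self, url, html):
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        path = self.pages / f"{digest}.html"
        if not path.exists():  # identical content is written only once
            path.write_text(html, encoding="utf-8")
        with self.db:  # commits on success
            self.db.execute(
                "INSERT OR REPLACE INTO pages (url, sha256) VALUES (?, ?)",
                (url, digest),
            )
        return digest

    def get(self, url):
        row = self.db.execute(
            "SELECT sha256 FROM pages WHERE url = ?", (url,)
        ).fetchone()
        if row is None:
            return None
        path = self.pages / f"{row[0]}.html"
        return path.read_text(encoding="utf-8") if path.exists() else None
```

Hashing the content rather than the URL means two URLs that serve the same page share one file on disk.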

Override via OMNISCOUT_DATA_DIR, OMNISCOUT_CONFIG_DIR, OMNISCOUT_CACHE_DIR, or settings in ~/Library/Application Support/omniscout/config.toml.
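
The lookup order described above can be sketched as follows. This is an illustration under assumptions rather than OmniScout's actual resolution code: the function name resolve_data_dir is hypothetical, the environment variable is assumed to take precedence over the platform default, and the ~/.local/share fallback for an unset XDG_DATA_HOME follows the XDG convention rather than anything stated here.

```python
import os
import sys
from pathlib import Path


def resolve_data_dir(env=None, platform=None, home=None):
    """Return the data directory: env override first, then platform default."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)

    override = env.get("OMNISCOUT_DATA_DIR")
    if override:
        return Path(override)
    if platform == "darwin":
        return home / "Library" / "Application Support" / "omniscout"
    # Linux and others: XDG_DATA_HOME, else the conventional XDG default.
    xdg = env.get("XDG_DATA_HOME")
    base = Path(xdg) if xdg else home / ".local" / "share"
    return base / "omniscout"
```

Passing env, platform, and home explicitly keeps the function easy to test without touching the real environment.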

Configuration

config.toml example:

default_source = "ddg"           # search source default
search_limit = 10
research_results = 8
request_throttle_seconds = 1.0   # per-host throttle in the crawler
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
embedding_local_only = true      # default; never fetch model files at query time
browser_channel = "chrome"       # uses installed Google Chrome
# browser_executable = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
summary_sentences = 6

Set OMNISCOUT_EMBED_LOCAL_ONLY=0 to allow runtime Hugging Face fetches.

Why local Chrome?

Using your system Chrome (browser_channel = "chrome") gives you:

  • Real cookies, login state, extensions, and font rendering
  • No extra ~190MB Chromium download
  • The same user-agent fingerprint as your daily browsing
  • Cleaner integration with omniscout session start for long-lived sessions that other tools can attach to over CDP

If Chrome isn't available, the engine transparently falls back to Playwright's bundled Chromium.
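
The fallback can be pictured with Playwright's Python API, where launch(channel="chrome") selects the installed Google Chrome and a plain launch() uses the bundled Chromium. The sketch below is illustrative and not the code in engines/browser.py: the helper names are hypothetical, and catching a broad Exception is a simplification (Playwright is assumed to raise its own Error type when the channel is missing).

```python
def launch_with_fallback(launch, **options):
    """Try system Chrome first; fall back to the bundled browser on failure.

    `launch` is any callable shaped like Playwright's BrowserType.launch.
    """
    try:
        return launch(channel="chrome", **options)
    except Exception:
        # Simplification: any launch failure triggers the fallback.
        return launch(**options)


def open_browser(headless=True):
    """Start Playwright and return (playwright, browser); caller closes both."""
    from playwright.sync_api import sync_playwright  # imported lazily

    pw = sync_playwright().start()
    browser = launch_with_fallback(pw.chromium.launch, headless=headless)
    return pw, browser


# Example (requires playwright to be installed):
# pw, browser = open_browser()
# browser.close()
# pw.stop()
```

Passing the launcher in as a callable keeps the fallback logic testable without a real browser.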

Download files

Source Distribution

omniscout-0.2.1.tar.gz (104.4 kB)

Uploaded Source

Built Distribution

omniscout-0.2.1-py3-none-any.whl (134.1 kB)

Uploaded Python 3

File details

Details for the file omniscout-0.2.1.tar.gz.

File metadata

  • Download URL: omniscout-0.2.1.tar.gz
  • Upload date:
  • Size: 104.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omniscout-0.2.1.tar.gz
Algorithm     Hash digest
SHA256        0f96a43e46c193325d54eba012d16bff97aa8784dbe6d4f12b59e9c82772d5e5
MD5           0c3094b9477afec3eff64582b5e3da13
BLAKE2b-256   1f050f827b2d5cf095c4a79d9176162760c2af01cd4d0055edd1a8ec79e07ece

Provenance

The following attestation bundles were made for omniscout-0.2.1.tar.gz:

Publisher: pypi-publish.yml on sriramramnath/omniscout

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file omniscout-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: omniscout-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 134.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omniscout-0.2.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        b759a7ccd05535cfa9cd371bb5acef8dbe0329e516e42ac9c86402114a2bbf4b
MD5           51144715249a1e2a5e868fddac04cd15
BLAKE2b-256   7690d7aaa51a7140c0e4d27e77cf5473b1294039cc5c75b0a503b08118f50ec2

Provenance

The following attestation bundles were made for omniscout-0.2.1-py3-none-any.whl:

Publisher: pypi-publish.yml on sriramramnath/omniscout

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
