
HYW — Heuristic Yield Websearch

An LLM-powered terminal assistant that searches, cross-validates, then answers.



Why

LLMs have a knowledge cutoff. Ask "what happened today" and they can only guess.

HYW lets the model decide what to search, how many rounds, and how to cross-validate — then gives you the answer. Not a simple "search and paste" — it's a multi-round heuristic search loop.

Features

  • Multi-round autonomous search — The model breaks down questions, crafts search queries, and validates results across up to 6 iterations
  • XML tag tool calling — No function calling dependency; works with any LLM provider
  • Streaming output — Thinking and answers stream in real time; search progress is visible as it happens
  • Pluggable tool backends — Search / page extract / render are selected by capability, not hard-coded per module
  • Built-in websearch service — Ships with ddgs, Jina AI search/page extraction, and non-browser Markdown render
  • Rich terminal UI — Gradient titles, Markdown rendering, live spinners
  • Multi-turn conversation — Context auto-carried; toggle mode with arrow keys
  • Any model via LiteLLM — OpenAI / Anthropic / Google / OpenRouter / local models
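The "XML tag tool calling" feature means the controller model emits tool calls as literal XML tags in its text output, which the app parses itself instead of relying on a provider's function-calling API. A minimal sketch of that idea, assuming a `<sub_agent name="...">payload</sub_agent>` shape (the tag name appears in the diagram below; the exact attributes are an assumption, not HYW's real format):

```python
import re

# Hypothetical tag format: <sub_agent name="websearch">query</sub_agent>.
# The attribute name and payload shape are assumptions for illustration.
TAG_RE = re.compile(r'<sub_agent\s+name="(\w+)">(.*?)</sub_agent>', re.DOTALL)

def extract_tool_calls(llm_output: str) -> list[tuple[str, str]]:
    """Return (agent_name, payload) pairs found in the model's reply."""
    return TAG_RE.findall(llm_output)

reply = 'Let me search.\n<sub_agent name="websearch">latest tech news</sub_agent>'
print(extract_tool_calls(reply))  # [('websearch', 'latest tech news')]
```

Because the tags are plain text, this works with any provider that can return text, which is why no function-calling support is required.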

Quickstart

# Default install: CLI + ddgs + Jina AI + md2png-lite render
pip install hyw

# Add entari plugin support
pip install "hyw[entari]"

# Add Entari + Noto font sync support
pip install "hyw[entari,notosans]"

# Interactive mode
hyw

# Single question
hyw -q "What's the latest in tech news?"

The hyw command is available in the default install.

Configuration

Config file: ~/.hyw/config.yml. Use /config in interactive mode to edit it directly; an example based on the multi-model layout lives at config.example.yml. In interactive mode, ← / → switches models and ↑ / ↓ toggles between multi-turn and new-session mode. Legacy single-model fields (model / api_key / api_base) still work, and you can define named transport presets via model_provider / model_providers for OpenAI-compatible relays.

# Shared provider defaults.
# `models[*]` and `sub_agent.*` inherit these unless they override them.
api_key: sk-or-xxx
api_base: https://openrouter.ai/api/v1

# Optional LiteLLM transport preset.
# `requires_openai_auth: true` means "use OPENAI_API_KEY if api_key is omitted".
# model_provider: mirror
# model_providers:
#   mirror:
#     base_url: https://chat.soruxgpt.com/codex
#     wire_api: responses
#     requires_openai_auth: true
#     custom_llm_provider: openai

# Main controller model used at startup / single-shot mode.
# You can set this to either a profile `name` or a raw model id.
active_model: gemini-lite

models:
  - name: gemini-lite
    model: openrouter/google/gemini-3.1-flash-lite-preview
  - name: kimi-k2.5
    model: openrouter/moonshotai/kimi-k2.5
  - name: cerebras-gpt-oss
    model: cerebras/gpt-oss-120b
    api_key: csk-xxx
    api_base: https://api.cerebras.ai/v1

# Runtime options actually used by the app
language: zh-CN
# Set `false` if the upstream provider has streaming / tool-call compatibility issues.
stream: true
headless: true
# Maximum main-loop rounds. Default is 8.
max_rounds: 8
# Custom system prompt appended to the main controller prompt.
system_prompt: ""

# Child model overrides. Leave empty to inherit the active main model config.
sub_agent:
  websearch:
    model: ""
  page:
    model: ""
  vision:
    model: ""

# Tool capability registry + default provider selection
tools:
  index:
    ddgs:
      search: core.search_ddgs:ddgs_search
    jina_ai:
      search: core.search_jina_ai:jina_ai_search
      page_extract: core.search_jina_ai:jina_ai_page_extract
    md2png_lite:
      render: md2png_lite.provider:render_md2png_lite_result
  config:
    jina_ai:
      page_extract:
        prefer_free: true
  use:
    search: ddgs
    page_extract: jina_ai
    render: md2png_lite

# Legacy stage-specific model slots kept only for compatibility.
# The current main loop does not read them.
stages:
  search:
    model: ""
  fetch:
    model: ""
  summary:
    model: ""

What each block does now:

  • api_key / api_base: shared defaults inherited by models[*] and sub_agent.*.
  • model_provider / model_providers: named transport presets that expand into LiteLLM fields such as api_base, custom_llm_provider, and api_key_env.
  • active_model: the main controller model currently selected; can match either a profile name or a raw model id.
  • models: switchable main-model profiles for CLI left/right model selection.
  • language / stream / headless / system_prompt: active runtime options used by the current flow.
  • max_rounds: maximum main-loop rounds; default is 8.
  • sub_agent.websearch: child model that generates 2-6 search terms and runs internal search.
  • sub_agent.page: child model that compresses pages and extracts evidence.
  • sub_agent.vision: image understanding helper kept for independent vision flows; not part of the main search loop.
  • tools.index: capability-to-provider registry.
  • tools.config: per-provider extra options such as headers or free-route preferences.
  • tools.use: which provider is selected by default for each capability.
  • stages.*: legacy stage-specific model slots kept only for old configs; the current main loop does not use them.
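The tools.index entries use the common `module:attr` object-reference idiom (e.g. core.search_ddgs:ddgs_search). A minimal sketch of how such a registry can be resolved to a callable; HYW's actual loader may differ, and the dict shape below mirrors the YAML above:

```python
import importlib

def load_provider(spec: str):
    """Resolve a 'module:attr' string like 'core.search_ddgs:ddgs_search'
    to the object it names."""
    module_path, attr = spec.split(":", 1)
    return getattr(importlib.import_module(module_path), attr)

def resolve(tools: dict, capability: str):
    """Pick the default provider for a capability via tools.use,
    then look up its entry point in tools.index."""
    provider = tools["use"][capability]          # e.g. "ddgs"
    spec = tools["index"][provider][capability]  # e.g. "core.search_ddgs:ddgs_search"
    return load_provider(spec)
```

This is what makes the backends pluggable: swapping tools.use.search from ddgs to jina_ai changes which entry point is imported, with no code changes per module.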

How It Works

User Question
  │
  ▼
┌─────────────────────────────────────┐
│  Main model plans the next step     │
│  Outputs <sub_agent ...> XML tags   │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  websearch sub-agent builds 2-6     │
│  queries and runs internal search   │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Main model chooses concrete pages  │
│  page sub-agent compresses them     │
│  then main model returns the answer │
└─────────────────────────────────────┘
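The loop above can be sketched as follows. This is an illustrative reconstruction, not HYW's real API: `model` and `run_sub_agent` stand in for the main LLM call and the websearch/page sub-agents, and the tag format is assumed:

```python
import re

TAG_RE = re.compile(r'<sub_agent name="(\w+)">(.*?)</sub_agent>', re.DOTALL)

def run_loop(question, model, run_sub_agent, max_rounds=8):
    """Minimal controller-loop sketch. `model(history)` returns the next
    LLM reply; a reply with no <sub_agent> tag is the final answer."""
    history = [("user", question)]
    for _ in range(max_rounds):
        reply = model(history)
        calls = TAG_RE.findall(reply)
        if not calls:
            return reply  # no tool call -> final answer
        for agent, payload in calls:
            history.append(("tool", run_sub_agent(agent, payload)))
    return reply  # round limit reached; return the last reply

# Toy demo: the fake model searches once, then answers.
def fake_model(history):
    if any(role == "tool" for role, _ in history):
        return "Answer based on search results."
    return '<sub_agent name="websearch">query</sub_agent>'

print(run_loop("q", fake_model, lambda a, p: f"{a} results for {p}"))
# Answer based on search results.
```

The max_rounds config key bounds exactly this outer loop, which is why raising it lets the model run more search/validate rounds before it must answer.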

Commands

Command   Description
/config   Open config file in editor
/stats    Show session statistics
/exit     Exit
← / →     Switch active model
↑ / ↓     Toggle Multi Turn / New Session mode

Project Structure

core/
├── config.py               # Model config + tool capability registry
├── main.py                 # Conversation loop, tool calls, LLM interaction
├── cli.py                  # Rich terminal UI, streaming output
├── __main__.py             # python -m core entry point
├── search_ddgs.py          # DDGS search provider
├── search_jina_ai.py       # Jina AI search + page extract provider
├── web_search.py           # WebToolSuite + service runtime
└── render.py               # md2png-lite render dispatch

Requirements

  • Python ≥ 3.12
  • Default deps: litellm · pyyaml · loguru · rich · prompt-toolkit · ddgs · httpx · md2png-lite · Pillow
  • entari: arclet-alconna · arclet-entari · md2png-lite
  • notosans: md2png-lite[notosans]

Roadmap

  • ...
  • ...
  • ...

Contributing

Issues and PRs welcome.

License

MIT


Built with curiosity and caffeine.
