Agent Search
Give any AI agent the ability to search, crawl, and extract the web.
Agent Search is a CLI and Python library that gives AI agents reliable web access. One command to search, crawl websites, extract structured data, and monitor pages for changes — all routed through a 4-layer proxy chain that automatically handles IP rotation, CAPTCHA detection, and rate limiting.
```shell
pip install agentsearchcli
search "latest NVIDIA earnings" --format json
```
Why Agent Search?
Most AI agents can't reliably access the web. Search APIs are expensive, direct requests get blocked, and scraping requires infrastructure. Agent Search solves this:
- Multi-engine search — Aggregates results from Google, DuckDuckGo, Bing, and Wikipedia. Deduplicates and ranks by relevance.
- 4-layer proxy chain — Automatic failover: MacBook relay -> NordVPN SOCKS5 -> AWS API Gateway IP rotation -> direct. A block at one layer falls through to the next.
- Headless browsing — Playwright with stealth mode for JavaScript-rendered pages.
- Structured extraction — Pull data from any page using CSS selectors, XPath, or LLM-powered extraction.
- Change monitoring — Watch any URL for content changes with configurable intervals.
- Community proxy pool — Earn credits by sharing bandwidth. Spend credits to use the network.
Quick Start
```shell
# Install
pip install agentsearchcli

# First run — creates account and gets API key
search

# Search the web
search "Python asyncio documentation"

# Output as JSON (for agents)
search query "React hooks tutorial" --format json

# Use headless browser for JS-heavy sites
search query "site:twitter.com AI news" --browser

# Crawl a docs site
search crawl https://docs.python.org --depth 3 --max-pages 100

# Extract structured data
search extract https://shop.com/products --schema schema.json --format json

# Monitor a page for changes (check every 30 min)
search monitor https://example.com/pricing --interval 1800
```
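For agents that consume the CLI programmatically, the `--format json` output can be captured from a subprocess and parsed. The sketch below is illustrative only: it assumes the JSON output is an array of result objects, and the field names in the test are hypothetical, not the CLI's documented schema.

```python
import json
import subprocess


def run_search(query: str) -> list:
    """Invoke the CLI and parse its JSON output.

    Assumes `search query ... --format json` prints a JSON array
    to stdout (an assumption, not documented behavior).
    """
    proc = subprocess.run(
        ["search", "query", query, "--format", "json"],
        capture_output=True, text=True, check=True,
    )
    return parse_results(proc.stdout)


def parse_results(stdout: str) -> list:
    """Parse captured stdout into a list of result dicts."""
    results = json.loads(stdout)
    if not isinstance(results, list):
        raise ValueError("expected a JSON array of results")
    return results
```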
Installation
```shell
# Core (requests-based, no browser)
pip install agentsearchcli

# With headless browser support (quoted so zsh doesn't expand the brackets)
pip install "agentsearchcli[browser]"

# From source
git clone https://github.com/r0botsorg/agent-search-cli.git
cd agent-search-cli
pip install -e ".[dev]"
```
Requirements: Python 3.9+ and an internet connection. Everything else is optional.
Architecture
```
┌─────────────────────────────────────────────────────┐
│                    CLI / Library                    │
│       search query | crawl | extract | monitor      │
└──────────────────────────┬──────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────┐
│                 Multi-Engine Search                 │
│        Google + DuckDuckGo + Bing + Wikipedia       │
└──────────────────────────┬──────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────┐
│                 4-Layer Proxy Chain                 │
│                                                     │
│   1. MacBook Relay     (residential IP)             │
│   2. NordVPN SOCKS5    (residential IP)             │
│   3. AWS API Gateway   (rotating datacenter IPs)    │
│   4. Direct            (fallback)                   │
│                                                     │
│  Auto-failover · CAPTCHA detection · Rate limiting  │
└──────────────────────────┬──────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────┐
│                 Content Processing                  │
│                                                     │
│   HTML → Markdown   ·  CSS/XPath extraction         │
│   LLM extraction    ·  Change detection             │
│   Playwright stealth · Session management           │
└─────────────────────────────────────────────────────┘
```
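The failover behavior in the proxy layer can be pictured as an ordered walk over transports: try each layer in turn, and drop to the next on any error. The sketch below is a conceptual illustration of that pattern, not the actual `ProxyChain` implementation; the layer names and `fetch` callables are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def fetch_with_failover(
    url: str,
    layers: Sequence[Tuple[str, Callable[[str], str]]],
) -> str:
    """Try each proxy layer in order; fall through to the next on error.

    `layers` is an ordered list of (name, fetch_fn) pairs, e.g.
    [("macbook-relay", ...), ("nordvpn-socks5", ...),
     ("aws-gateway", ...), ("direct", ...)] — names are illustrative.
    """
    errors = []
    for name, fetch in layers:
        try:
            return fetch(url)
        except Exception as exc:  # a real chain would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all proxy layers failed: " + "; ".join(errors))
```

A real chain would also track per-layer health so that a consistently failing layer is skipped rather than retried on every request.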
Modes
| Mode | Cost | Proxies | Best For |
|---|---|---|---|
| Lite | Free | Self-managed (your proxies) | Developers with existing infrastructure |
| Pro | Paid | Fully managed | Teams who want zero setup |
| Pool | Free | Community-powered | Everyone — share bandwidth, earn credits |
CLI Reference
Global Options
| Option | Description |
|---|---|
| `--version` | Show version and exit |
| `--verbose` / `-v` | Enable debug logging |
| `--config PATH` | Path to custom config file |
| `--skip-onboarding` | Skip the first-run setup wizard |
Search
```shell
search "your query"                              # quick search
search query "your query" --format json          # JSON output
search query "your query" --browser              # JS rendering
search query "your query" --extract "h1, .price" # CSS extraction
search query "your query" --pro                  # hosted mode
search query "your query" -o results.json        # save to file
```
Crawl
```shell
search crawl https://docs.example.com --depth 3 --max-pages 100
```
Extract
```shell
search extract https://shop.com/product --schema schema.json --format json
```
Monitor
```shell
search monitor https://example.com/pricing --interval 1800
```
Proxy Pool
```shell
search pool join     # contribute bandwidth, earn credits
search pool leave    # stop participating
search pool status   # your node status
search pool stats    # global network stats
search pool credits  # your balance
```
Auth
```shell
search auth login    # authenticate for Pro mode
search auth logout   # remove stored credentials
search auth status   # check auth state
```
Command Tree
```
search [QUERY]
├── query QUERY [--pro] [-f markdown|html|json] [-o PATH] [--extract CSS] [--browser]
├── crawl URL [--pro] [--depth N] [--max-pages N]
├── extract URL [--pro] [--schema PATH] [-f markdown|json]
├── monitor URL [--pro] [--interval N]
├── onboard
├── auth
│   ├── login
│   ├── logout
│   └── status
└── pool
    ├── join
    ├── leave
    ├── status
    ├── stats
    └── credits
```
13 commands total.
Python Library
Use Agent Search as a library in your own code:
```python
from agent_search.core.proxy_chain import ProxyChain
from agent_search.core.multi_search import MultiEngineSearch
from agent_search.core.html_to_markdown import HTMLToMarkdown
from agent_search.core.data_extraction import DataExtractor
from agent_search.core.change_detector import ChangeDetector

# Proxy-aware HTTP requests with automatic failover
proxy = ProxyChain()
response = proxy.get("https://example.com")
data = await proxy.async_get("https://api.example.com/data")  # inside an async function
proxies = proxy.get_best_proxies_dict()  # for use with requests

# Multi-engine search with dedup + ranking
engine = MultiEngineSearch()
results = engine.search("latest AI research", max_results=10)

# HTML to clean Markdown
converter = HTMLToMarkdown()
markdown = converter.convert(html, base_url="https://example.com")

# Structured data extraction
extractor = DataExtractor()
data = extractor.extract(url, selectors=["h1", ".price", ".description"])

# Change monitoring
detector = ChangeDetector()
changed = detector.check(url)  # returns True if content changed
```
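The "dedup + ranking" step advertised above can be sketched as merging per-engine result lists: key each result on a normalized URL, then rank by how many engines returned it and by its best position. This is an illustration of the idea, not `MultiEngineSearch`'s actual algorithm; the `merge_results` and `normalize` helpers are hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Normalize a URL for dedup: lowercase host, strip trailing slash."""
    parts = urlsplit(url)
    return f"{parts.netloc.lower()}{parts.path.rstrip('/')}"


def merge_results(per_engine):
    """Merge engine result lists: dedupe by normalized URL, then rank by
    (number of engines that returned the result, best list position)."""
    merged = {}
    for engine, results in per_engine.items():
        for pos, r in enumerate(results):
            key = normalize(r["url"])
            entry = merged.setdefault(key, {**r, "engines": set(), "best_pos": pos})
            entry["engines"].add(engine)
            entry["best_pos"] = min(entry["best_pos"], pos)
    ranked = sorted(
        merged.values(),
        key=lambda e: (-len(e["engines"]), e["best_pos"]),
    )
    return [{"url": e["url"], "title": e["title"]} for e in ranked]
```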
Core Modules
| Module | Description |
|---|---|
| `proxy_chain` | 4-layer proxy with automatic failover |
| `multi_search` | Multi-engine search aggregation with dedup + ranking |
| `html_to_markdown` | Clean HTML-to-Markdown conversion |
| `data_extraction` | CSS, XPath, and LLM-powered structured extraction |
| `playwright_browser` | Headless Chrome with stealth mode |
| `batch_processor` | Async batch URL processing with concurrency control |
| `change_detector` | Content change monitoring via SHA-256 snapshots |
| `captcha_detector` | CAPTCHA and anti-bot block detection |
| `rate_limiter` | Thread-safe rate limiting with adaptive backoff |
| `retry_handler` | Exponential backoff with circuit breaker pattern |
| `sitemap_crawler` | URL discovery via sitemap.xml and robots.txt |
| `aws_ip_rotator` | AWS API Gateway IP rotation (new IP per request) |
| `nordvpn_proxy` | NordVPN SOCKS5 residential proxy support |
| `session_manager` | Persistent session and cookie storage |
| `user_agents` | 27 real browser User-Agent strings with rotation |
| `llm_extractor` | LLM-powered intelligent data extraction |
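The `change_detector` row describes snapshot-based change detection: hash the fetched content, compare against the stored hash, and update the snapshot. A minimal sketch of that pattern follows; the class is illustrative and not the module's actual API (the real module also persists snapshots between runs).

```python
import hashlib


class SnapshotDetector:
    """Detect content changes by comparing SHA-256 snapshots.

    Illustrative sketch only. A URL seen for the first time counts
    as changed, since there is no prior snapshot to compare against.
    """

    def __init__(self):
        self._snapshots = {}  # url -> last seen SHA-256 hex digest

    def check(self, url: str, content: bytes) -> bool:
        """Return True if `content` differs from the last snapshot."""
        digest = hashlib.sha256(content).hexdigest()
        changed = self._snapshots.get(url) != digest
        self._snapshots[url] = digest
        return changed
```

Hashing full pages makes the check cheap to store, at the cost of flagging any byte-level difference (timestamps, ad markup) as a change; a production detector would typically hash a cleaned version of the content.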
Configuration
Config is stored at ~/.config/agent-search/config.json (created on first run via onboarding wizard).
Environment Variables
| Variable | Description |
|---|---|
| `AGENT_SEARCH_ENDPOINT` | Search endpoint URL (default: http://localhost:15000) |
| `AGENT_SEARCH_API_KEY` | Pro mode API key |
| `NORDVPN_SERVICE_USER` | NordVPN SOCKS5 username |
| `NORDVPN_SERVICE_PASS` | NordVPN SOCKS5 password |
| `AWS_API_GATEWAY_ID` | AWS API Gateway ID for IP rotation |
| `AWS_REGION` | AWS region (default: us-east-1) |
| `MACBOOK_PROXY_URL` | MacBook relay proxy URL |
| `MACBOOK_API_KEY` | MacBook relay auth key |
| `OPENAI_API_KEY` | For LLM-powered extraction |
| `BING_SEARCH_API_KEY` | Bing Search API key (optional engine) |
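A common way to combine the config file with these variables is env-over-file precedence: an environment variable wins over the stored config, which wins over a built-in default. The helper below sketches that lookup order; it is an illustrative pattern, not part of the library's API.

```python
import os
from typing import Optional


def setting(name: str, config: dict, default: Optional[str] = None) -> Optional[str]:
    """Resolve a setting: environment variable first, then the
    config-file value, then the default.

    e.g. setting("AGENT_SEARCH_ENDPOINT", cfg, "http://localhost:15000")
    """
    env = os.environ.get(name)
    if env is not None:
        return env
    return config.get(name, default)
```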
Project Structure
```
agent-search-cli/
├── pyproject.toml                 # Package config + entry points
├── src/agent_search/
│   ├── cli/                       # CLI layer (Click)
│   │   ├── main.py                # Command routing
│   │   ├── onboarding.py          # First-run setup wizard
│   │   └── commands/
│   │       ├── query.py           # Web search
│   │       ├── crawl.py           # Website crawling
│   │       ├── extract.py         # Data extraction
│   │       ├── monitor.py         # Change monitoring
│   │       ├── auth.py            # Authentication
│   │       └── pool.py            # Proxy pool management
│   ├── core/                      # Core library (usable independently)
│   │   ├── proxy_chain.py         # 4-layer proxy failover
│   │   ├── multi_search.py        # Multi-engine search
│   │   ├── html_to_markdown.py    # HTML → Markdown
│   │   ├── data_extraction.py     # Structured extraction
│   │   ├── playwright_browser.py  # Headless browser
│   │   ├── batch_processor.py     # Async batch processing
│   │   ├── change_detector.py     # Change monitoring
│   │   ├── captcha_detector.py    # Anti-bot detection
│   │   ├── rate_limiter.py        # Rate limiting
│   │   ├── retry_handler.py       # Retry + circuit breaker
│   │   ├── sitemap_crawler.py     # Sitemap discovery
│   │   ├── aws_ip_rotator.py      # AWS IP rotation
│   │   ├── nordvpn_proxy.py       # NordVPN SOCKS5
│   │   ├── session_manager.py     # Session persistence
│   │   ├── llm_extractor.py       # LLM extraction
│   │   └── user_agents.py         # UA rotation
│   ├── pool/                      # Proxy pool network
│   └── utils/
│       ├── logger.py
│       └── version.py
└── tests/
    ├── test_*.py
    └── unit/
```
Development
```shell
git clone https://github.com/r0botsorg/agent-search-cli.git
cd agent-search-cli
pip install -e ".[dev]"
python -m pytest tests/ -v
```
About Qwerty
Agent Search is built by Qwerty (qwert.ai) — an AI-powered search platform designed specifically for agents and autonomous systems.
Traditional search wasn't built for the agent era. It was built for humans typing queries into search boxes. Qwerty is different: an agent-first search infrastructure built from the ground up for the software that's replacing manual workflows.
The Platform
Agent Search CLI is the open-source core of the Qwerty platform. The full stack includes:
| Component | Description |
|---|---|
| Agent Search CLI | Open-source CLI and Python library (this repo) |
| Qwerty API | Hosted search API at api.qwert.ai — managed proxy infrastructure, no setup required |
| Proxy Pool | Community-powered proxy network — share bandwidth, earn credits |
Pricing
| Plan | Price | Requests | What You Get |
|---|---|---|---|
| Lite | Free | 1,000/mo | Basic search, API access, community support |
| Pro | $49/mo | 50,000/mo | Managed proxies, semantic search, priority support, analytics |
| Enterprise | $999/mo | Unlimited | Dedicated infrastructure, SLA, SSO, custom integrations |
Start free at qwert.ai or self-host the entire stack with the open-source repos.
Contact
- Email: hello@qwert.ai
- Website: qwert.ai
- Docs: qwert.ai/docs
License
MIT License. See LICENSE for details.
Built by Qwerty