BitSearch Intelligence Engine (BIE): real-time, citation-backed web search & extraction for AI apps. Built on BitS.

Maintainers (verified by PyPI): Sudharsansm

Project links (not verified by PyPI): Bitscrape (core crawler)
Project description

BIE — BitSearch Intelligence Engine

A real-time web search engine for AI applications — no API keys, no subscriptions, no third-party search services.

BIE discovers relevant pages on the live internet using free, public search endpoints, crawls them (powered by BitS, our high-performance async crawler), builds a hybrid BM25 + semantic vector index in memory, and returns ranked, source-attributed results — all from a single Python call, REST endpoint, CLI command, or MCP tool.

import bie

# Search the live internet — no URLs, no API key, no subscription
results = bie.websearch("latest semiconductor export rules 2026")

for r in results:
    print(r.title, "—", r.url, f"(score={r.score:.3f})")
    print(r.snippet)

Why BIE?

🌐 Free, real-time web search — no API keys, no subscriptions, no third-party search providers. Discovery uses public, no-key search endpoints with automatic fallback.
🚀 Zero infra — no Elasticsearch, no Milvus, no Kafka. Pure Python, in-memory hybrid index. Scale up later if you need to.
🧠 Hybrid retrieval out of the box — BM25 lexical search fused with sentence-transformer embeddings via Reciprocal Rank Fusion.
🤖 MCP-ready — drop-in tool for Claude Desktop, Claude Code, and any MCP-compatible AI app.
⚡ Powered by Bitscrape — async, polite (robots.txt-aware), and fast crawling/extraction under the hood.
🔌 Use anywhere — Python library, REST API, CLI, or MCP server.
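
To make the "Reciprocal Rank Fusion" bullet concrete, here is a minimal, standalone sketch of the textbook RRF formula. It is not BIE's actual code: the function name, the constant `k=60`, and the sample document IDs are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant (an assumption
    here, not a value taken from BIE).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_ranking = ["doc_a", "doc_b", "doc_c"]    # hypothetical lexical ranking
vector_ranking = ["doc_b", "doc_d", "doc_a"]  # hypothetical semantic ranking

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that rank well in both lists (`doc_b`, `doc_a`) rise to the top, which is the property that makes rank fusion useful when lexical and semantic scores are not directly comparable.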

Install

pip install bits-bie

Note: the PyPI distribution is named bits-bie (since bie was too similar to an existing PyPI project), but you still import bie and run the bie CLI command — same API as shown below.

Optional extras:

pip install "bits-bie[embeddings]"  # semantic/vector search (sentence-transformers)
pip install "bits-bie[server]"      # FastAPI + Uvicorn REST server
pip install "bits-bie[mcp]"         # Model Context Protocol server
pip install "bits-bie[all]"         # everything

BIE depends on bitscrape, our proprietary async crawling & extraction framework, which is installed automatically.

Usage

0. Search the live internet — no URLs, no API key, no subscription

import bie

results = bie.websearch("who won the latest F1 race")
for r in results:
    print(r.title, "—", r.url)
    print(r.snippet)

This is BIE's primary, "type a question, get a real-time answer from the internet" mode. It:

Discovers candidate URLs for your query via free, public, no-key search endpoints (DuckDuckGo, with an automatic Bing fallback for reliability)
Crawls them with Bitscrape
Extracts and chunks the page text, then ranks it against your query with BIE's hybrid BM25 + vector index

No accounts, no API keys, no paid tiers: crawling, extraction, indexing, and ranking all run locally, and only URL discovery calls out to the public search endpoints.
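
The "extracts and chunks" step can be pictured with a small character-based chunker. This is a sketch under assumptions, not BIE's chunker: the 800-character size mirrors the `chunk_size` default listed under Configuration, while the function name and the `overlap` parameter are hypothetical.

```python
def chunk_text(text, chunk_size=800, overlap=100):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a boundary still appears whole in one of them. The overlap is an
    illustrative assumption, not a documented BIE setting.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


page_text = "a" * 2000  # stand-in for extracted page text
chunks = chunk_text(page_text)
print([len(chunk) for chunk in chunks])
```

Each chunk is then indexed and ranked on its own, so a long page can surface through its single most relevant passage.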

1. One-shot search of specific sites (Python)

import bie

results = bie.search("AI regulation news", urls=["https://example.com/news"], top_k=5)
for r in results:
    print(r)

2. Build a reusable index

from bie import BIE

engine = BIE()
engine.crawl(["https://example.com/blog", "https://another-site.com"])

print(engine.search("quarterly earnings"))
print(engine.search("product launch"))  # reuses the same index
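
For readers curious what an in-memory lexical index does when it is reused across queries, the following is a minimal BM25 sketch in pure Python. It is not BIE's `BM25Index`: the class name `TinyBM25`, the whitespace tokenizer, and the parameters `k1=1.5` and `b=0.75` are assumptions for illustration.

```python
import math
from collections import Counter


class TinyBM25:
    """A toy in-memory BM25 index (hypothetical, for illustration only)."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.docs = []       # (doc_id, term frequencies, token count)
        self.df = Counter()  # number of documents containing each term

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def add(self, doc_id, text):
        tokens = self.tokenize(text)
        tf = Counter(tokens)
        self.docs.append((doc_id, tf, len(tokens)))
        self.df.update(tf.keys())

    def search(self, query, top_k=5):
        n = len(self.docs)
        if n == 0:
            return []
        avgdl = sum(length for _, _, length in self.docs) / n
        results = []
        for doc_id, tf, length in self.docs:
            score = 0.0
            for term in self.tokenize(query):
                if term not in tf:
                    continue
                idf = math.log(1 + (n - self.df[term] + 0.5) / (self.df[term] + 0.5))
                norm = self.k1 * (1 - self.b + self.b * length / avgdl)
                score += idf * tf[term] * (self.k1 + 1) / (tf[term] + norm)
            if score > 0:
                results.append((doc_id, score))
        return sorted(results, key=lambda item: -item[1])[:top_k]


index = TinyBM25()
index.add("a", "quarterly earnings rose sharply")
index.add("b", "new product launch announced")
index.add("c", "earnings call and product roadmap")

earnings_hits = index.search("quarterly earnings")
launch_hits = index.search("product launch")  # same index, no re-indexing
print(earnings_hits[0][0], launch_hits[0][0])
```

The point of the reusable engine is the same as here: indexing happens once, and every later query only pays for scoring.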

3. Index your own text (no crawling)

from bie import BIE

engine = BIE()  # or reuse the engine built in the previous example
engine.add_text(
    url="internal://doc-1",
    title="Q2 Strategy Memo",
    text="...",
    trust_score=1.0,
)

4. CLI

# Search the whole internet — no URLs needed
bie search-live "who won the latest F1 race"

# Crawl + search specific sites in one command
bie search "global markets today" --url https://www.bbc.com/news --top-k 5

# Just crawl & dump extracted pages
bie crawl https://example.com --max-pages 20 --out docs.jsonl

# Run the REST API
bie serve --port 8000

# Run as an MCP server (stdio)
bie mcp
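
The `--out docs.jsonl` dump can be post-processed with a few lines of Python. The sketch below assumes only that the file holds one JSON object per line; the record fields used in the demo (`url`, `title`) are hypothetical, since the output schema is not documented here.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demo with a stand-in file; these field names are illustrative only,
# not BIE's documented output schema.
sample = [
    {"url": "https://example.com", "title": "Example"},
    {"url": "https://example.com/about", "title": "About"},
]
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    for record in sample:
        tmp.write(json.dumps(record) + "\n")

records = list(read_jsonl(tmp.name))
os.remove(tmp.name)
print(len(records), sorted(records[0]))
```

With a real dump, replace the temporary file with `read_jsonl("docs.jsonl")` and inspect the keys of the first record to see which fields are available.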

5. REST API

bie serve --port 8000

curl -X POST http://localhost:8000/crawl/url \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://example.com/news"]}'

curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{"query": "latest news", "top_k": 5}'

curl -X POST http://localhost:8000/search/live \
  -H "Content-Type: application/json" \
  -d '{"query": "who won the latest F1 race", "top_k": 5}'

See the full endpoint contract in docs/API.md.
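
The same `/search` call can be made from Python with only the standard library. This sketch builds the request without sending it, so it runs without a server; the helper name `build_search_request` is hypothetical, and the `Authorization: Bearer` header follows the `api_key` setting described under Configuration. The response shape is defined in docs/API.md and is not assumed here.

```python
import json
import urllib.request


def build_search_request(query, top_k=5, base_url="http://localhost:8000", api_key=None):
    """Build (but do not send) a POST request for the /search endpoint."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    request = urllib.request.Request(f"{base_url}/search", data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    if api_key:
        # Only needed when the server was started with BIE_API_KEY set.
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


request = build_search_request("latest news", top_k=5)
print(request.get_method(), request.full_url)

# To send it against a running `bie serve --port 8000` instance:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```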

6. MCP (Model Context Protocol)

Add BIE as a tool in your MCP client (e.g. claude_desktop_config.json):

{
  "mcpServers": {
    "bie": {
      "command": "bie",
      "args": ["mcp"]
    }
  }
}
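
If the client config already lists other MCP servers, the `bie` entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the helper name `add_bie_server` and the `other-tool` entry are hypothetical, and reading or writing the actual config file is left out because its location depends on the client.

```python
import copy
import json


def add_bie_server(config):
    """Return a copy of an MCP client config with the bie server added."""
    updated = copy.deepcopy(config)
    servers = updated.setdefault("mcpServers", {})
    servers["bie"] = {"command": "bie", "args": ["mcp"]}
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-tool": {"command": "other", "args": []}}}
merged = add_bie_server(existing)
print(json.dumps(merged, indent=2))
```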

This exposes four tools to your AI assistant:

bie_web_search(query, top_k, deep) — search the entire web, no URLs needed (DuckDuckGo discovery + Bitscrape crawl, no API key)
bie_search(query, urls, top_k, max_pages) — crawl + search specific URLs in one call
bie_crawl(urls, max_pages) — crawl & index into a session-persistent store
bie_index_search(query, top_k) — search the session index

Configuration

All settings can be set via environment variables prefixed with BIE_, or passed directly:

from bie import BIE, BIESettings

engine = BIE(BIESettings(
    max_pages=20,
    max_depth=1,
    use_embeddings=True,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    bm25_weight=0.6,
    vector_weight=0.4,
))

| Setting | Env var | Default | Description |
|---|---|---|---|
| `max_pages` | `BIE_MAX_PAGES` | `40` | Max pages crawled per seed URL |
| `max_depth` | `BIE_MAX_DEPTH` | `2` | Max link-follow depth |
| `concurrent_requests` | `BIE_CONCURRENT_REQUESTS` | `16` | Crawl concurrency |
| `robotstxt_obey` | `BIE_ROBOTSTXT_OBEY` | `true` | Respect robots.txt |
| `use_embeddings` | `BIE_USE_EMBEDDINGS` | `true` | Enable semantic search |
| `chunk_size` | `BIE_CHUNK_SIZE` | `800` | Chars per chunk |
| `bm25_weight` | `BIE_BM25_WEIGHT` | `0.5` | Fusion weight for the BM25 index |
| `vector_weight` | `BIE_VECTOR_WEIGHT` | `0.5` | Fusion weight for the vector index |
| `api_key` | `BIE_API_KEY` | `None` | If set, requires `Authorization: Bearer <key>` |
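
To illustrate how `BIE_`-prefixed environment variables could map onto typed settings, here is a standalone sketch. It is not `BIESettings` itself: the class `SketchSettings`, its `from_env` method, and the accepted boolean spellings are assumptions, with defaults copied from the table above (`api_key` is left out for brevity).

```python
import os
from dataclasses import dataclass, fields


def _parse_bool(raw):
    # Accepted truthy spellings are an assumption for this sketch.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass
class SketchSettings:
    max_pages: int = 40
    max_depth: int = 2
    concurrent_requests: int = 16
    robotstxt_obey: bool = True
    use_embeddings: bool = True
    chunk_size: int = 800
    bm25_weight: float = 0.5
    vector_weight: float = 0.5

    @classmethod
    def from_env(cls, environ=None):
        """Build settings from BIE_* variables, falling back to defaults."""
        environ = os.environ if environ is None else environ
        overrides = {}
        for field in fields(cls):
            raw = environ.get(f"BIE_{field.name.upper()}")
            if raw is None:
                continue
            # Check bool before int/float, since bool is a subclass of int.
            if isinstance(field.default, bool):
                overrides[field.name] = _parse_bool(raw)
            else:
                overrides[field.name] = type(field.default)(raw)
        return cls(**overrides)


settings = SketchSettings.from_env(
    {"BIE_MAX_PAGES": "20", "BIE_USE_EMBEDDINGS": "false", "BIE_BM25_WEIGHT": "0.6"}
)
print(settings)
```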

Architecture

              ┌───────────────────────────────────────────┐
              │                    bie                    │
              │                                           │
   urls ──▶   │  Crawler (Bitscrape)                      │
              │     │                                     │
              │     ▼                                     │
              │  Document → Chunker → HybridIndex         │
              │                         │   │             │
              │                  BM25Index  VectorIndex   │
              │                         │   │             │
              │                       Fusion (RRF)        │
              │                         │                 │
   query ──▶  │                         ▼                 │
              │                  Ranked SearchResults     │
              └───────────────────────────────────────────┘
                     │            │            │
                  Python API   REST API    MCP Server

This OSS edition implements the core of the BIE PRD's Module 1 (Crawler), Module 2 (Indexes), Module 3 (Hybrid Retriever), and Module 11 (Agent API) as a single lightweight package — no external services required. Larger deployments can swap BM25Index/VectorIndex for Elasticsearch/Milvus-backed implementations behind the same HybridIndex interface.
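
The idea of swapping backends behind one interface can be sketched with a small protocol. Everything below is hypothetical (`Index`, `KeywordIndex`, `FixedIndex`, `HybridSketch`, and the weighted rank-fusion formula); it shows the shape of the design, not the real `HybridIndex` API.

```python
from __future__ import annotations

from typing import Protocol


class Index(Protocol):
    """Anything that can return a ranked list of document IDs."""

    def search(self, query: str, top_k: int) -> list[str]: ...


class KeywordIndex:
    """Toy in-memory backend: ranks docs by how many query words they contain."""

    def __init__(self, docs: dict[str, str]):
        self.docs = docs

    def search(self, query: str, top_k: int) -> list[str]:
        words = set(query.lower().split())
        scored = [
            (len(words & set(text.lower().split())), doc_id)
            for doc_id, text in self.docs.items()
        ]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        return [doc_id for _, doc_id in ranked[:top_k]]


class FixedIndex:
    """Stand-in for an external backend: returns a canned ranking."""

    def __init__(self, ranking: list[str]):
        self.ranking = ranking

    def search(self, query: str, top_k: int) -> list[str]:
        return self.ranking[:top_k]


class HybridSketch:
    """Fuses two Index backends with weighted reciprocal ranks."""

    def __init__(self, lexical: Index, semantic: Index,
                 lexical_weight: float = 0.5, semantic_weight: float = 0.5, k: int = 60):
        self.backends = [(lexical, lexical_weight), (semantic, semantic_weight)]
        self.k = k

    def search(self, query: str, top_k: int = 5) -> list[str]:
        scores: dict[str, float] = {}
        for index, weight in self.backends:
            for rank, doc_id in enumerate(index.search(query, top_k), start=1):
                scores[doc_id] = scores.get(doc_id, 0.0) + weight / (self.k + rank)
        return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


docs = {"a": "chip export rules", "b": "export controls update", "c": "weather report"}
lexical = KeywordIndex(docs)
semantic = FixedIndex(["b", "a"])  # pretend a vector backend preferred "b"

print(HybridSketch(lexical, semantic, 0.6, 0.4).search("export rules"))
print(HybridSketch(lexical, semantic, 0.4, 0.6).search("export rules"))
```

Because `HybridSketch` only depends on the `search` method, either backend could be replaced by a client for an external service without touching the fusion code.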

Built on BitS

BIE's crawling and extraction layer is powered by BitS (pip install bitscrape), our async, robots.txt-aware web scraping framework — giving BIE high-performance, polite, production-grade crawling out of the box.
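
As a rough picture of what "robots.txt-aware" means, the standard library's `urllib.robotparser` can answer whether a URL may be fetched under a given robots.txt. This is a generic sketch, not Bitscrape's implementation; the robots.txt content, the user agent `example-bot`, and the helper `is_allowed` are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt, parsed from a string so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "example-bot", "https://example.com/news"))
print(is_allowed(ROBOTS_TXT, "example-bot", "https://example.com/private/report"))
```

A polite crawler runs a check like this before each request and skips URLs for which it returns `False`.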

License

MIT — see LICENSE.

Release history

1.2.1: Jun 13, 2026
1.2.0: Jun 13, 2026
1.1.0: Jun 12, 2026 (this version)
0.3.0: Jun 12, 2026
0.2.0: Jun 11, 2026

Download files

Source Distribution

bits_bie-1.1.0.tar.gz (25.1 kB)

Uploaded Jun 12, 2026 Source

Built Distribution

bits_bie-1.1.0-py3-none-any.whl (28.1 kB)

Uploaded Jun 12, 2026 Python 3

File details

Details for the file bits_bie-1.1.0.tar.gz.

File metadata

Download URL: bits_bie-1.1.0.tar.gz
Upload date: Jun 12, 2026
Size: 25.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for bits_bie-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`901abf0304fcffe255e602b2436176633775c6f20dccdd22557c13917363af48`
MD5	`d736fdfa9fdf777d15ee0c431b995094`
BLAKE2b-256	`67ba5e92db2ada5f7c799938f3a4130387f42c50dcd8568c8096799bdbccb67c`
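
A downloaded file can be checked against the published SHA256 digest with the standard library. In this sketch the helper names are hypothetical; the digest constant is the SHA256 value listed above for bits_bie-1.1.0.tar.gz.

```python
import hashlib
import hmac


def sha256_of_file(path, block_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# SHA256 digest published above for bits_bie-1.1.0.tar.gz.
PUBLISHED_SDIST_SHA256 = "901abf0304fcffe255e602b2436176633775c6f20dccdd22557c13917363af48"

# After downloading the sdist into the current directory:
# print(matches_published_hash("bits_bie-1.1.0.tar.gz", PUBLISHED_SDIST_SHA256))
```

pip can enforce the same check at install time through hash-pinned requirements files (`--require-hashes`).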

Provenance

The following attestation bundles were made for bits_bie-1.1.0.tar.gz:

Publisher: publish.yml on Sudharsansm/BIE

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bits_bie-1.1.0.tar.gz
- Subject digest: 901abf0304fcffe255e602b2436176633775c6f20dccdd22557c13917363af48
- Sigstore transparency entry: 1804820882
- Sigstore integration time: Jun 12, 2026
Source repository:
- Permalink: Sudharsansm/BIE@2c25e5dee48fe07fe7105f48cccb8458206e8565
- Branch / Tag: refs/tags/v1.1.0
- Owner: https://github.com/Sudharsansm
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2c25e5dee48fe07fe7105f48cccb8458206e8565
- Trigger Event: workflow_dispatch

File details

Details for the file bits_bie-1.1.0-py3-none-any.whl.

File metadata

Download URL: bits_bie-1.1.0-py3-none-any.whl
Upload date: Jun 12, 2026
Size: 28.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for bits_bie-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ba35f0a9ca0160e82e16924c3b56a01ce89306e10803e1401b1a30e65fdf1089`
MD5	`be3ff215cef3f9c108ecf6bc2ca1717e`
BLAKE2b-256	`23976c578b1f2032441c0b7715abc10530bcf03c4bc170fc1b1bd0a80d6a16cd`

Provenance

The following attestation bundles were made for bits_bie-1.1.0-py3-none-any.whl:

Publisher: publish.yml on Sudharsansm/BIE

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bits_bie-1.1.0-py3-none-any.whl
- Subject digest: ba35f0a9ca0160e82e16924c3b56a01ce89306e10803e1401b1a30e65fdf1089
- Sigstore transparency entry: 1804821094
- Sigstore integration time: Jun 12, 2026
Source repository:
- Permalink: Sudharsansm/BIE@2c25e5dee48fe07fe7105f48cccb8458206e8565
- Branch / Tag: refs/tags/v1.1.0
- Owner: https://github.com/Sudharsansm
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2c25e5dee48fe07fe7105f48cccb8458206e8565
- Trigger Event: workflow_dispatch
