Index and search Bikeshed (.bs) source documents from URLs with efficient change detection
search-bs
Index and search Bikeshed (.bs) source documents from URLs with efficient change detection.
Features
- Efficient change detection: Only re-indexes when content actually changes
  - Uses HTTP conditional requests (ETag, Last-Modified)
  - Falls back to SHA256 content hashing
- Full-text search: Powered by SQLite FTS5 with BM25 ranking
- Context-aware search: Show N lines around each match with the --around flag
- Exact line retrieval: Pull specific line ranges for agent consumption
- Batch indexing: Index multiple documents from a config file with --all
- Markdown context: Tracks current heading for each search result
- GitHub integration: Automatically converts GitHub blob URLs to raw URLs
- JSON output: Machine-readable output for all commands
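The GitHub integration feature can be pictured with a small sketch. The helper below is hypothetical (the name `github_blob_to_raw` and its exact rules are assumptions, not the tool's actual code); it shows one plausible way to rewrite a blob URL into its raw form and leave every other URL untouched.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url: str) -> str:
    """Rewrite a GitHub blob URL to its raw.githubusercontent.com form.

    Hypothetical helper for illustration; URLs that do not look like
    blob URLs are returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<ref>/<path...>
    if parts.netloc == "github.com" and len(segments) >= 5 and segments[2] == "blob":
        owner, repo, _blob, *rest = segments
        return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])
    return url


print(github_blob_to_raw("https://github.com/webmachinelearning/webnn/blob/main/index.bs"))
```

Raw URLs pass through as-is, so the same function can be applied to every source before fetching.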
Installation
pip install search-bikeshed
Or install from source:
git clone https://github.com/tarekziade/search-bikeshed
cd search-bikeshed
pip install -e .
Usage
Index a document
# Index from a GitHub blob URL (automatically converted to a raw URL)
search-bs index https://github.com/webmachinelearning/webnn/blob/main/index.bs --name webnn
# Index directly from a raw URL
search-bs index https://raw.githubusercontent.com/w3c/webrtc-pc/main/webrtc.bs --name webrtc
# Index all documents from config file
search-bs index --all
Batch indexing with config file
Create a config file at ~/.config/search-bs/sources.json:
[
{
"name": "webnn",
"url": "https://github.com/webmachinelearning/webnn/blob/main/index.bs"
},
{
"name": "webrtc",
"url": "https://raw.githubusercontent.com/w3c/webrtc-pc/main/webrtc.bs"
}
]
Then index all at once:
search-bs index --all
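If you generate the config file from a script, it can help to validate it before running the batch index. The following is a minimal sketch that only assumes the format shown above (a JSON array of objects with "name" and "url" keys); `parse_sources` is a hypothetical helper, not part of search-bs.

```python
import json


def parse_sources(text: str):
    """Parse sources.json content into a list of (name, url) pairs.

    Hypothetical validator: it checks only the shape documented above.
    """
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("config must be a JSON array of objects")
    sources = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict) or not {"name", "url"} <= entry.keys():
            raise ValueError(f"entry {i} needs both 'name' and 'url'")
        sources.append((entry["name"], entry["url"]))
    return sources


example = """
[
  {"name": "webnn", "url": "https://github.com/webmachinelearning/webnn/blob/main/index.bs"},
  {"name": "webrtc", "url": "https://raw.githubusercontent.com/w3c/webrtc-pc/main/webrtc.bs"}
]
"""
for name, url in parse_sources(example):
    print(name, url)
```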
Search indexed documents
# Basic search
search-bs search --name webnn "MLTensor"
# Phrase search
search-bs search --name webnn "graph builder"
# Search with context lines (show 3 lines around each match)
search-bs search --name webnn "MLContext" --around 3
# With JSON output
search-bs search --name webnn "MLContext" --json
# Show URLs in results
search-bs search --name webnn "operator" --show-url --max-results 10
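To give a feel for what a BM25-ranked search with context lines involves, here is a small self-contained sketch using Python's sqlite3 module with FTS5. The table name `lines`, the one-row-per-line layout, and the `search` function are assumptions made for illustration; the tool's real schema may differ.

```python
import sqlite3

# In-memory index with one FTS5 row per source line (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE lines USING fts5(text)")
doc = [
    "Programming Model",
    "An MLContext represents a global state.",
    "The MLGraphBuilder builds a graph.",
    "Tensors are created with MLContext.createTensor().",
    "Security Considerations",
]
# rowid doubles as the 1-based line number.
conn.executemany("INSERT INTO lines(rowid, text) VALUES (?, ?)", enumerate(doc, start=1))


def search(conn, query, around=0, max_results=10):
    """Return [(line_no, [(n, text), ...])], best BM25 match first."""
    hits = conn.execute(
        # bm25() returns lower values for better matches, so ascending order ranks best first.
        "SELECT rowid FROM lines WHERE lines MATCH ? ORDER BY bm25(lines) LIMIT ?",
        (query, max_results),
    ).fetchall()
    results = []
    for (line_no,) in hits:
        context = conn.execute(
            "SELECT rowid, text FROM lines WHERE rowid BETWEEN ? AND ? ORDER BY rowid",
            (line_no - around, line_no + around),
        ).fetchall()
        results.append((line_no, context))
    return results


for line_no, context in search(conn, "MLContext", around=1):
    print(f"match at line {line_no}")
    for n, text in context:
        print(f"  {n}: {text}")
```

Storing one row per line keeps the line number available as the rowid, which makes fetching the surrounding lines a simple range query.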
Get exact line ranges
Retrieve specific line ranges from indexed documents (useful for agents):
# Get 40 lines starting from line 1234
search-bs get --name webnn --line 1234 --count 40
# JSON output
search-bs get --name webnn --line 1234 --count 40 --json
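The semantics of --line and --count can be sketched as a slice over 1-based line numbers. The function below is a hypothetical illustration that assumes a range running past the end of the document is simply truncated; the tool's actual behavior in that case is not documented here.

```python
def get_lines(lines, line, count):
    """Return (line_number, text) pairs for `count` lines starting at 1-based `line`."""
    if line < 1 or count < 0:
        raise ValueError("line must be >= 1 and count must be >= 0")
    start = line - 1  # convert to a 0-based index
    return [(start + i + 1, text) for i, text in enumerate(lines[start:start + count])]


document = ["alpha", "beta", "gamma", "delta"]
print(get_lines(document, 2, 2))
```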
List indexed documents
# Human-readable list
search-bs docs
# JSON output
search-bs docs --json
How it works
- Indexing: Fetches .bs files from URLs and indexes them line-by-line with heading context
- Change detection: Uses HTTP conditional requests and content hashing to skip unchanged documents
- Search: Uses SQLite FTS5 full-text search with BM25 ranking for relevance
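The change detection step can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the tool's implementation: the function names, return shapes, and timeout are all hypothetical. It sends the stored validators as `If-None-Match` and `If-Modified-Since`, treats a 304 response as "unchanged", and otherwise compares a SHA256 digest of the body against the previous one.

```python
import hashlib
import urllib.error
import urllib.request


def build_conditional_headers(etag=None, last_modified=None):
    """Build request headers from validators saved on the previous fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def content_changed(body: bytes, previous_sha256=None):
    """Fallback check: return (changed, digest) based on a SHA256 comparison."""
    digest = hashlib.sha256(body).hexdigest()
    return digest != previous_sha256, digest


def fetch_if_changed(url, etag=None, last_modified=None, previous_sha256=None):
    """Return (changed, body_or_None, new_state). Hypothetical helper."""
    request = urllib.request.Request(url, headers=build_conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = response.read()
            new_etag = response.headers.get("ETag")
            new_last_modified = response.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # server confirms the stored copy is still current
            state = {"etag": etag, "last_modified": last_modified, "sha256": previous_sha256}
            return False, None, state
        raise
    changed, digest = content_changed(body, previous_sha256)
    state = {"etag": new_etag, "last_modified": new_last_modified, "sha256": digest}
    return changed, (body if changed else None), state
```

The hash comparison matters for servers that ignore conditional headers and always answer 200: the download still happens, but re-indexing can be skipped when the digest is unchanged.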
Data storage
By default, the index is stored in ~/.cache/search-bs/search-bs.sqlite3.
You can override this location with the BIKESEARCH_HOME environment variable:
export BIKESEARCH_HOME=/custom/path
search-bs index ...
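Scripts that need to locate the index file could resolve it along these lines. This sketch assumes that BIKESEARCH_HOME names a directory and that the database keeps the file name search-bs.sqlite3 inside it; both points are assumptions, and `index_path` is a hypothetical helper.

```python
import os
from pathlib import Path


def index_path(environ=os.environ):
    """Resolve the index location, honoring BIKESEARCH_HOME when it is set.

    Assumption: the override is a directory containing search-bs.sqlite3.
    """
    home = environ.get("BIKESEARCH_HOME")
    base = Path(home) if home else Path.home() / ".cache" / "search-bs"
    return base / "search-bs.sqlite3"


print(index_path())
```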
Requirements
- Python 3.8 or later
- SQLite 3 with FTS5 support (included in most Python 3.6+ builds; availability depends on how the bundled SQLite was compiled)
- No external dependencies (stdlib only)
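If you are unsure whether your Python's SQLite build includes FTS5, a quick probe like the one below can tell you. It is a generic check, not something shipped with search-bs: it tries to create an FTS5 virtual table in an in-memory database and reports whether that succeeded.

```python
import sqlite3


def has_fts5() -> bool:
    """Return True if the linked SQLite library can create FTS5 tables."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(content)")
        return True
    except sqlite3.OperationalError:
        # Raised with "no such module: fts5" when the extension is missing.
        return False
    finally:
        conn.close()


print("FTS5 available:", has_fts5())
print("SQLite version:", sqlite3.sqlite_version)
```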
License
MIT
Source Distribution
File details
Details for the file search_bikeshed-0.1.0.tar.gz.
File metadata
- Download URL: search_bikeshed-0.1.0.tar.gz
- Upload date:
- Size: 9.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4346ff1c844746eef2919d429d8a6711994bce7fa69632089f070538f6f825d6 |
| MD5 | ae8ee95a8c6ecfd8e6a867b6a3a151c2 |
| BLAKE2b-256 | 00be9c3e750755b6771db50244d27d7164d50e8180ff17a957bb3bdd028e557d |
Provenance
The following attestation bundles were made for search_bikeshed-0.1.0.tar.gz:
Publisher: publish-pypi.yml on tarekziade/search-bikeshed
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: search_bikeshed-0.1.0.tar.gz
- Subject digest: 4346ff1c844746eef2919d429d8a6711994bce7fa69632089f070538f6f825d6
- Sigstore transparency entry: 774499207
- Sigstore integration time:
- Permalink: tarekziade/search-bikeshed@fce67b0b182fc1102a4218214d339037c969a984
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tarekziade
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@fce67b0b182fc1102a4218214d339037c969a984
- Trigger Event: release