Query WHATWG/W3C web specifications for AI agents and developers

webspec-index

Query WHATWG/W3C web specifications from the command line, Python code, or AI agents (MCP).

Features

  • Full-text search across HTML, DOM, URL, and other specifications
  • Cross-reference tracking (incoming/outgoing references between specs)
  • Fast SQLite-based indexing with FTS5 for instant queries
  • Three interfaces: CLI, Python library, and MCP server for AI agents
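Cross-reference tracking can be pictured as a directed graph with one edge per reference. The sketch below is a toy model, not the package's implementation: the `RefGraph` class, its methods, and the `"both"` direction are hypothetical, and the edges in the usage note are made up for illustration.

```python
from collections import defaultdict


class RefGraph:
    """Toy cross-reference index keyed by "SPEC#anchor" strings (hypothetical)."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # source -> targets it references
        self._incoming = defaultdict(set)  # target -> sources that reference it

    def add(self, source, target):
        """Record that section `source` references section `target`."""
        self._outgoing[source].add(target)
        self._incoming[target].add(source)

    def refs(self, ref, direction="both"):
        """Return sorted references for `ref` in the requested direction."""
        if direction not in ("incoming", "outgoing", "both"):
            raise ValueError(f"unknown direction: {direction!r}")
        result = {}
        if direction in ("incoming", "both"):
            result["incoming"] = sorted(self._incoming.get(ref, ()))
        if direction in ("outgoing", "both"):
            result["outgoing"] = sorted(self._outgoing.get(ref, ()))
        return result
```

For example, after `g = RefGraph()` and `g.add("HTML#navigate", "URL#concept-url")`, calling `g.refs("URL#concept-url", "incoming")` lists `HTML#navigate` as an incoming reference. Storing both directions up front makes either lookup a single dictionary access.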

Installation

pip install webspec-index

Or run directly with uvx (no installation needed):

uvx webspec-index query HTML#navigate

After a pip install, the webspec-index command is on your PATH for that environment. With uvx, run each command as uvx webspec-index <subcommand> instead.

The examples below assume pip install.

Quick Start

Command Line

# Query a specific section
webspec-index query HTML#navigate

# Search across all specs
webspec-index search "tree order" --spec DOM

# Check if a section exists (exit code 0 = found, 1 = not found)
webspec-index exists HTML#navigate

# Find anchors by pattern
webspec-index anchors "*-tree" --spec DOM

# List all headings
webspec-index list HTML

# Get cross-references
webspec-index refs HTML#navigate --direction incoming

# Update to latest spec versions
webspec-index update --spec HTML

# Clear local database
webspec-index clear-db

Most commands support --format json (default) or --format markdown.
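For scripting, the exit code of `exists` and the JSON output format can be consumed from Python with the standard library. The sketch below is only an illustration: the helper names (`build_command`, `section_exists`, `query_section`) are hypothetical, and it assumes the CLI prints a single JSON document to stdout when `--format json` is passed.

```python
import json
import subprocess


def build_command(subcommand, *args, fmt=None):
    """Build an argv list for the webspec-index CLI (hypothetical helper)."""
    cmd = ["webspec-index", subcommand, *args]
    if fmt is not None:
        cmd += ["--format", fmt]
    return cmd


def section_exists(ref, run=subprocess.run):
    """Return True when `webspec-index exists REF` exits with code 0."""
    completed = run(build_command("exists", ref), capture_output=True, text=True)
    return completed.returncode == 0


def query_section(ref, run=subprocess.run):
    """Run `query` with --format json and decode stdout.

    Assumes stdout holds exactly one JSON document; raises
    subprocess.CalledProcessError on a non-zero exit code.
    """
    completed = run(
        build_command("query", ref, fmt="json"),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)
```

The `run` parameter exists so that a fake runner can be injected in tests without the CLI being installed.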

Python Library

import webspec_index

# Query a section
result = webspec_index.query("HTML#navigate")
print(result["title"])  # "navigate"
print(result["section_type"])  # "Algorithm"

# Search
results = webspec_index.search("tree order", spec="DOM", limit=5)
for r in results["results"]:
    print(f"{r['spec']}#{r['anchor']}: {r['snippet']}")

# Check existence
if webspec_index.exists("HTML#navigate"):
    print("Section found!")
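Downstream code often needs to split a `SPEC#anchor` reference or turn search hits into display lines. The helpers below are a minimal sketch with hypothetical names (`split_ref`, `format_hits`); they work on plain dicts and assume only the keys shown in the example above (`results`, `spec`, `anchor`, `snippet`), so they run without the package installed.

```python
def split_ref(ref):
    """Split "SPEC#anchor" into (spec, anchor); raise ValueError if malformed."""
    spec, sep, anchor = ref.partition("#")
    if not sep or not spec or not anchor:
        raise ValueError(f"expected SPEC#anchor, got {ref!r}")
    return spec, anchor


def format_hits(search_result, max_snippet=60):
    """Render each hit as "SPEC#anchor: snippet", truncating long snippets."""
    lines = []
    for hit in search_result.get("results", []):
        # Collapse runs of whitespace so each hit fits on one line
        snippet = " ".join(hit["snippet"].split())
        if len(snippet) > max_snippet:
            snippet = snippet[: max_snippet - 3] + "..."
        lines.append(f"{hit['spec']}#{hit['anchor']}: {snippet}")
    return lines
```

Passing the return value of `webspec_index.search(...)` to `format_hits` should work as long as the result keeps the shape shown above.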

MCP Server (AI Agents)

Register the MCP server with Claude Code (other MCP-capable agents can launch it with uvx webspec-index mcp):

claude mcp add webspec-index -- uvx webspec-index mcp

Available Specifications

Currently indexed:

  • HTML - WHATWG HTML Living Standard
  • DOM - WHATWG DOM Living Standard
  • URL - WHATWG URL Living Standard
  • INFRA - WHATWG Infra Living Standard

More specs (Fetch, Encoding, Streams, etc.) are planned.

How It Works

  1. Fetches spec HTML from WHATWG/W3C GitHub repositories
  2. Parses sections, algorithms, IDL definitions, and cross-references
  3. Indexes in SQLite with FTS5 for fast full-text search
  4. Tracks versions using git commit SHAs for reproducibility
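The indexing step can be illustrated with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not the package's actual schema: the table name `sections`, its columns, and the helper names are hypothetical, the sample rows are made-up text rather than quotes from the specs, and the local SQLite build must include the FTS5 extension.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 index from (spec, anchor, body) tuples."""
    conn = sqlite3.connect(":memory:")
    # spec and anchor are stored but not tokenized; only body is searchable
    conn.execute(
        "CREATE VIRTUAL TABLE sections "
        "USING fts5(spec UNINDEXED, anchor UNINDEXED, body)"
    )
    conn.executemany(
        "INSERT INTO sections (spec, anchor, body) VALUES (?, ?, ?)", rows
    )
    return conn


def search(conn, phrase, spec=None, limit=5):
    """Phrase search over body, optionally restricted to one spec."""
    # Double quotes make FTS5 treat the input as an exact phrase
    match = '"' + phrase.replace('"', '""') + '"'
    sql = (
        "SELECT spec, anchor, snippet(sections, 2, '[', ']', '...', 8) "
        "FROM sections WHERE sections MATCH ?"
    )
    params = [match]
    if spec is not None:
        sql += " AND spec = ?"
        params.append(spec)
    sql += " ORDER BY rank LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


# Made-up sample rows for illustration only
conn = build_index([
    ("DOM", "concept-tree-order", "Nodes are visited in tree order, parents first."),
    ("HTML", "navigate", "To navigate to a URL, run these steps."),
])
hits = search(conn, "tree order", spec="DOM")
```

Each row in `hits` is a `(spec, anchor, snippet)` tuple, with matched terms wrapped in square brackets by the `snippet()` function.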

Development

Built with:

  • Rust for fast parsing and indexing (scraper, rusqlite, reqwest)
  • PyO3 for Python bindings to the Rust core
  • Maturin for packaging
  • Click for CLI
  • MCP (Model Context Protocol) for AI agent integration

License

MIT

Download files

Source Distribution

webspec_index-0.2.0.tar.gz (64.1 kB, source)

Built Distributions

webspec_index-0.2.0-cp312-cp312-win_amd64.whl (4.2 MB): CPython 3.12, Windows x86-64

webspec_index-0.2.0-cp312-cp312-manylinux_2_38_x86_64.whl (5.2 MB): CPython 3.12, manylinux (glibc 2.38+), x86-64

webspec_index-0.2.0-cp312-cp312-macosx_11_0_arm64.whl (4.7 MB): CPython 3.12, macOS 11.0+, ARM64

File details

Details for the file webspec_index-0.2.0.tar.gz.

File metadata

  • Download URL: webspec_index-0.2.0.tar.gz
  • Size: 64.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.11.5

File hashes

Hashes for webspec_index-0.2.0.tar.gz
Algorithm Hash digest
SHA256 9c2c2edf149d2d5354feab0069a87d0e2f0d93779374eb3a7cd37e01e7e5820a
MD5 3562f78c51f16ea1c542934a6a2c5d67
BLAKE2b-256 873ae0780fe94d454d1c37a82f985bdf228951fa4bb3e542c76ba827901df7d4

File details

Details for the file webspec_index-0.2.0-cp312-cp312-win_amd64.whl.

File hashes

Hashes for webspec_index-0.2.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 c8009efcb5be8fc94dad9ac37e909420a4e4a5e9550e377a7d7a07b2819fd8fc
MD5 d49dd195c28d1ade7e86224be66e80e6
BLAKE2b-256 d6fd0969d4c7d4fe89ef3ce6f32cc86a3ccb6a0a84bdf384c9ad66e843702112

File details

Details for the file webspec_index-0.2.0-cp312-cp312-manylinux_2_38_x86_64.whl.

File hashes

Hashes for webspec_index-0.2.0-cp312-cp312-manylinux_2_38_x86_64.whl
Algorithm Hash digest
SHA256 585d3654afc7b1f32d85b4af18f6c464b8d498a92b61434aac5c2f72cb74a6d7
MD5 5a2863895d38e921134b8b2546ab748f
BLAKE2b-256 e3677b80f759b0809c80ab77af47a9acc7eedcac8378395e9ea5581faa98caba

File details

Details for the file webspec_index-0.2.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for webspec_index-0.2.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 895354402560b94bdf7f039df7ffd27ede0adeb61fd6e5b9c4e07aba5aa8eec9
MD5 e300b53b41ff286bac992992221c124d
BLAKE2b-256 9eae67a247b90275ac8754d5d0c394cc2a88a9ad21704763254f610cf274740c
