Crawler/SEO companion for Dash apps: /llms.txt, /robots.txt, /sitemap.xml, bot detection, static-HTML prerender, and an MCP bridge. Works with Flask, FastAPI, and Quart backends (Dash 4.1+).
dash-improve-my-llms
Crawler / SEO companion for Dash apps, with a thin MCP bridge for Dash 4.3+.
What 2.0 is
A small package that handles the parts of "making your Dash app AI-friendly" that Dash itself doesn't:
- /robots.txt: bot-class access policies (block AI training, allow AI search, etc.)
- /sitemap.xml: generated from dash.page_registry minus hidden pages
- /<page>/llms.txt: each page's hand-written prose at a predictable URL
- Static-HTML prerender: bot middleware serves crawlers the prose view instead of an empty Dash JS shell
- MCP bridge: registers each page's prose as a dash.mcp resource on Dash 4.3+
It does not try to introspect your layouts or callbacks. Dash 4.3's MCP server does that natively and better.
The three audiences
| Audience | How they reach the app | What 2.0 serves them |
|---|---|---|
| MCP clients | JSON-RPC over Streamable HTTP | LLMS_DOC registered as dash.mcp resource |
| Web crawlers | Plain HTTPS, often no JS | /robots.txt, /sitemap.xml, static HTML |
| Paste-into-chat users | One-shot HTTP fetch | /llms.txt, /<page>/llms.txt as markdown |
FastAPI's /docs describes HTTP routes, not callbacks or layouts. MCP
fills that gap for audience #1 (MCP clients). Neither one helps
audiences #2 and #3, which have no native Dash story. That's the gap
this package fills.
Install
Pick the extra that matches your Dash backend:
pip install "dash-improve-my-llms[flask]" # Dash 3.x (default)
pip install "dash-improve-my-llms[fastapi]" # Dash 4.1+
pip install "dash-improve-my-llms[quart]" # Dash 4.1+ async
pip install "dash-improve-my-llms[all]" # all three
The package detects which backend app.server is using and dispatches
to the right adapter. Your code looks the same regardless.
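How the package performs this detection is not documented here. As a rough illustration only, one way to tell the three backends apart is to inspect the class of the server object. The `detect_backend` helper below is hypothetical and is not part of the package's API.

```python
def detect_backend(server) -> str:
    """Guess which web framework a Dash app's server object belongs to.

    Hypothetical helper: it walks the class hierarchy and returns the first
    top-level module name that matches a supported framework.
    """
    for cls in type(server).__mro__:
        root = cls.__module__.split(".")[0]
        if root in ("flask", "fastapi", "quart"):
            return root
    raise TypeError(f"unsupported backend: {type(server).__name__}")
```

Walking the MRO means a project-specific subclass of a framework's app class still resolves to the framework it extends.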
Quick start
from dash import Dash, register_page
from dash_improve_my_llms import add_llms_routes, RobotsConfig, mark_hidden

app = Dash(__name__, use_pages=True)
app._base_url = "https://myapp.com"
app._robots_config = RobotsConfig(
    block_ai_training=True,   # GPTBot, CCBot, anthropic-ai → 403
    allow_ai_search=True,     # ClaudeBot, ChatGPT-User → allowed
    allow_traditional=True,   # Googlebot, Bingbot → allowed
)

mark_hidden("/admin")
add_llms_routes(app)

if __name__ == "__main__":
    app.run(debug=True)
Every page module then exports the prose for its own /llms.txt:
# pages/equipment.py
from dash import html, register_page

register_page(__name__, path="/equipment", name="Equipment Catalog")

LLMS_DOC = """\
# Equipment Catalog

Browse the equipment library with text search and a category dropdown.

## What this page does
...
"""

def layout():
    return html.Div([...])
That's the whole pattern. The LLMS_DOC string IS the body of
/equipment/llms.txt, byte-for-byte.
If a page has no LLMS_DOC, you'll see a single UserWarning at
add_llms_routes() naming the missing pages, and the endpoint returns
a small placeholder stub so bots still get a 200.
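The fallback behaviour described above can be pictured with a small sketch. It assumes pages are available as a plain dict keyed by path. The `collect_docs` name, the `llms_doc` key, and the placeholder wording are all hypothetical; the package's real stub text may differ.

```python
import warnings

# Hypothetical placeholder text; the package's actual stub is not shown in this README.
PLACEHOLDER = "# {name}\n\nNo LLMS_DOC has been written for this page yet.\n"

def collect_docs(pages: dict) -> dict:
    """Map each page path to its prose and warn once about pages without any."""
    docs, missing = {}, []
    for path, page in pages.items():
        doc = page.get("llms_doc")
        if doc is None:
            missing.append(path)
            doc = PLACEHOLDER.format(name=page.get("name", path))
        docs[path] = doc
    if missing:
        # One warning that names every missing page, rather than one per page.
        warnings.warn(
            "pages without LLMS_DOC: " + ", ".join(sorted(missing)),
            UserWarning,
            stacklevel=2,
        )
    return docs
```

Batching the missing paths into a single warning keeps startup output short even when many pages still need prose.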
What gets served
| Route | What it returns | For |
|---|---|---|
| /llms.txt | Home page's LLMS_DOC | Paste-into-chat |
| /<page>/llms.txt | That page's LLMS_DOC | Paste-into-chat, AI-aware crawlers |
| /robots.txt | Bot policy generated from RobotsConfig | Crawlers |
| /sitemap.xml | Non-hidden pages from page_registry | Crawlers, search engines |
| (any URL with crawler UA) | Static HTML with the page's LLMS_DOC rendered | Crawlers that can't run JS |
Plus, on Dash 4.3+:
| Surface | What | For |
|---|---|---|
| llms:///<page-path> | MCP resource carrying that page's LLMS_DOC | Claude Desktop, agentic IDEs, MCP clients |
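To make the /sitemap.xml entry more concrete, here is a minimal sketch of building a sitemap from a list of page paths while skipping hidden ones. The `build_sitemap` function is hypothetical, and it assumes hidden paths are matched exactly; the package may treat sub-paths differently.

```python
import xml.etree.ElementTree as ET

def build_sitemap(base_url: str, paths: list, hidden: set) -> str:
    """Render a sitemap for every path that is not in the hidden set."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for path in paths:
        if path in hidden:  # assumption: exact-match hiding only
            continue
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + path
    # Prepend the XML declaration by hand so the declared encoding is explicit.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(urlset, encoding="unicode")
```

For example, `build_sitemap("https://myapp.com", ["/", "/equipment", "/admin"], {"/admin"})` yields entries for the home page and /equipment only.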
Public API
from dash_improve_my_llms import (
    add_llms_routes,         # main entry point
    LLMSConfig,              # opt-out flags
    RobotsConfig,            # bot-class policies
    register_page_metadata,  # name, description, llms_doc, schema.org fields
    mark_hidden,             # exclude path from sitemap/robots/MCP
    is_hidden,               # query
)
LLMSConfig
LLMSConfig(
    enabled=True,                 # set False to no-op the package
    warn_missing_llms_doc=True,   # the startup UserWarning
    register_mcp_resources=True,  # set False to skip MCP bridge
)
RobotsConfig
RobotsConfig(
    block_ai_training=True,
    allow_ai_search=True,
    allow_traditional=True,
    crawl_delay=10,              # seconds between requests
    custom_rules=[],             # extra robots.txt lines
    disallowed_paths=["/admin"],
)
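As a rough picture of how options like these could translate into a robots.txt body, the sketch below renders a simplified policy. `RobotsPolicy` and `render_robots` are hypothetical stand-ins, not the package's classes, the bot list is abbreviated, and the exact output the package generates may be laid out differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Abbreviated, illustrative list of AI-training crawlers.
AI_TRAINING_BOTS = ["GPTBot", "anthropic-ai", "Claude-Web", "CCBot", "Google-Extended"]

@dataclass
class RobotsPolicy:
    """Hypothetical, simplified stand-in for the options shown above."""
    block_ai_training: bool = True
    crawl_delay: Optional[int] = 10
    disallowed_paths: List[str] = field(default_factory=list)
    custom_rules: List[str] = field(default_factory=list)

def render_robots(policy: RobotsPolicy) -> str:
    """Render the policy as robots.txt text."""
    lines: List[str] = []
    if policy.block_ai_training:
        # One group: several User-agent lines sharing a single Disallow rule.
        lines += [f"User-agent: {bot}" for bot in AI_TRAINING_BOTS]
        lines += ["Disallow: /", ""]
    lines.append("User-agent: *")
    if policy.crawl_delay is not None:
        lines.append(f"Crawl-delay: {policy.crawl_delay}")
    lines += [f"Disallow: {path}" for path in policy.disallowed_paths]
    lines += policy.custom_rules
    return "\n".join(lines) + "\n"
```

Note that robots.txt is advisory. Crawlers that ignore it are only stopped by the middleware's User-Agent checks described in the next section.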
Bot classes
The middleware classifies User-Agents into three buckets:
- AI Training (default: blocked) — GPTBot, anthropic-ai, Claude-Web, CCBot, Google-Extended, FacebookBot, Omgili, ByteSpider
- AI Search (default: allowed) — ChatGPT-User, ClaudeBot, PerplexityBot, OAI-SearchBot
- Traditional (default: allowed) — Googlebot, Bingbot, DuckDuckBot, Yandex, plus generic patterns
Verify with curl:
# Training bot — 403 when block_ai_training=True
curl -A "Mozilla/5.0 (compatible; GPTBot/1.0)" https://myapp.com/
# Traditional crawler (Googlebot): prerendered static HTML
curl -A "Mozilla/5.0 (compatible; Googlebot/2.1)" https://myapp.com/
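The three buckets listed above can be approximated with simple substring matching on the User-Agent header. This is only a sketch: `classify_user_agent` is a hypothetical name, the token lists copy the bots named in this section, and the "generic patterns" the package applies to traditional crawlers are not reproduced.

```python
from typing import Optional

# Lower-cased tokens taken from the bot lists in this section.
BOT_CLASSES = {
    "training": ("gptbot", "anthropic-ai", "claude-web", "ccbot",
                 "google-extended", "facebookbot", "omgili", "bytespider"),
    "search": ("chatgpt-user", "claudebot", "perplexitybot", "oai-searchbot"),
    "traditional": ("googlebot", "bingbot", "duckduckbot", "yandex"),
}

def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the bot class for a User-Agent string, or None for regular browsers."""
    ua = user_agent.lower()
    for bot_class, tokens in BOT_CLASSES.items():
        if any(token in ua for token in tokens):
            return bot_class
    return None
```

With this sketch, the GPTBot string from the first curl command classifies as "training" and the Googlebot string from the second as "traditional".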
The MCP bridge
When dash.mcp is available (Dash 4.3+ RC and later), 2.0 registers
each non-hidden page's LLMS_DOC as an MCP resource:
- URI: llms:///<page-path> (e.g. llms:///audiences/mcp-clients)
- mimeType: text/markdown
- content: the page's LLMS_DOC, byte-for-byte identical to /<page>/llms.txt
MCP-aware clients (Claude Desktop, agentic IDEs) can resources/list
to discover what's available and resources/read by URI to fetch.
On Dash 3.x or 4.1/4.2 stable, the bridge is a silent no-op — only the HTTP surfaces serve docs.
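The mapping from page paths to resource URIs can be sketched without touching dash.mcp at all. The function below only builds plain dictionaries with the three fields listed above. `build_llms_resources` is a hypothetical name, the registration call itself is deliberately left out because the dash.mcp API is not shown in this README, and exact-match hiding is an assumption.

```python
def build_llms_resources(docs: dict, hidden: set) -> list:
    """Describe each non-hidden page's prose as an MCP-style resource record."""
    resources = []
    for path, doc in sorted(docs.items()):
        if path in hidden:  # assumption: exact-match hiding only
            continue
        resources.append({
            "uri": "llms:///" + path.lstrip("/"),  # /equipment -> llms:///equipment
            "mimeType": "text/markdown",
            "text": doc,  # passed through unchanged
        })
    return resources
```

Because the text is passed through untouched, a resource read and an HTTP fetch of the matching /<page>/llms.txt would return the same bytes.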
Migrating from 1.x
Most of the change is removal. Run the package against your app and
the startup UserWarning will tell you which pages need attention.
- Add LLMS_DOC at module scope on each page module:

      LLMS_DOC = """\
      # Page Title

      Short description.

      ## What this page does
      ...
      """

  Or pass it via register_page_metadata(path, llms_doc="...").
- Remove mark_important() and mark_component_hidden() calls. They're deprecation no-ops in 2.0 and will be deleted in 2.1.
- Remove links to dropped routes: /page.json, /architecture.txt, /architecture.toon, /llms.toon (and their per-page variants) all return 404 now.
- Install the matching backend extra: [flask], [fastapi], or [quart]. The bare dash-improve-my-llms install no longer pulls Flask automatically.
The HTTP surfaces that survived (/llms.txt, /robots.txt,
/sitemap.xml) and the RobotsConfig, mark_hidden, and
register_page_metadata APIs are backward-compatible with 1.x.
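If you would rather audit page modules before running the app, a static check is one option. The sketch below uses the standard-library ast module to test whether a module's source assigns LLMS_DOC at module scope. `defines_llms_doc` is a hypothetical helper, not something the package ships, and it does not see docs supplied through register_page_metadata.

```python
import ast

def defines_llms_doc(source: str) -> bool:
    """Return True if the source assigns LLMS_DOC at module scope."""
    tree = ast.parse(source)
    for node in tree.body:  # top-level statements only, not function bodies
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, ast.AnnAssign) and node.value is not None:
            targets = [node.target]
        else:
            continue
        if any(isinstance(t, ast.Name) and t.id == "LLMS_DOC" for t in targets):
            return True
    return False
```

Reading each file under pages/ with pathlib and passing its text to this function gives a quick list of modules that would otherwise trigger the startup warning.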
Example app
This repository's app.py is a working demo. From a clone:
pip install -e ".[all,dev]"
python app.py
# Browse http://localhost:8959/
The /audiences/* pages walk through each of the three audiences in
the running app, including a copy-to-clipboard demo for grabbing any
page's prose.
Documentation
- docs/SKILLS.md — practical guide for using and configuring the package (also written as a reference for AI coding assistants)
- CHANGELOG.md — full release history including 1.x
License
MIT. See LICENSE.
Built by Pip Install Python LLC.