Context CLI — LLM Readiness Linter for token efficiency and RAG readiness
Context CLI
Lint any URL for LLM readiness. Get a 0-100 score for token efficiency, RAG readiness, and LLM extraction quality.
What is Context CLI?
Context CLI is an LLM Readiness Linter that checks how well a URL is structured for AI consumption. As LLM-powered search engines, RAG pipelines, and AI agents become primary consumers of web content, your pages need to be optimized for token efficiency, structured data extraction, and machine-readable formatting.
Context CLI analyzes your content across four pillars and returns a structured score from 0 to 100.
Features
- Robots.txt AI bot access -- checks 13 AI crawlers (GPTBot, ClaudeBot, DeepSeek-AI, Grok, and more)
- llms.txt & llms-full.txt -- detects both standard and extended LLM instruction files
- Schema.org JSON-LD -- extracts and evaluates structured data with high-value type weighting (Product, Article, FAQ, HowTo)
- Content density -- measures useful content vs. boilerplate with readability scoring, heading structure analysis, and answer-first detection
- Batch mode -- lint multiple URLs from a file with --file and configurable --concurrency
- Custom bot list -- override default bots with --bots for targeted checks
- Verbose output -- detailed per-pillar breakdown with scoring explanations and recommendations
- Rich CLI output -- formatted tables and scores via Rich
- JSON / CSV / Markdown output -- machine-readable results for pipelines
- MCP server -- expose the linter as a tool for AI agents via FastMCP
- Context Compiler -- LLM-powered llms.txt and schema.jsonld generation, with batch mode for multiple URLs
- CI/CD integration -- --fail-under threshold, --fail-on-blocked-bots, per-pillar thresholds, baseline regression detection, GitHub Step Summary
- GitHub Action -- composite action for CI pipelines with baseline support
- Citation Radar -- query AI models to see what they cite and recommend, with brand tracking and domain classification
- Share-of-Recommendation Benchmark -- track how often AI models mention and recommend your brand vs competitors, with LLM-as-judge analysis
Installation
pip install context-linter
Context CLI uses a headless browser for content extraction. After installing, run:
crawl4ai-setup
Development install
git clone https://github.com/your-org/context-cli.git
cd context-cli
pip install -e ".[dev]"
crawl4ai-setup
Quick Start
context-cli lint example.com
This runs a full lint and prints a Rich-formatted report with your LLM readiness score.
CLI Usage
Single Page Lint
Lint only the specified URL (skip multi-page discovery):
context-cli lint example.com --single
Multi-Page Site Lint (default)
Discover pages via sitemap/spider and lint up to 10 pages:
context-cli lint example.com
Limit Pages
context-cli lint example.com --max-pages 5
JSON Output
Get structured JSON for CI pipelines, dashboards, or scripting:
context-cli lint example.com --json
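For scripting, the JSON report can be captured from a subprocess. The sketch below assumes the report is printed to stdout as a single JSON document; the report's field names are not documented in this README, so none are used. The helper names build_lint_command and lint_as_json are hypothetical.

```python
import json
import subprocess
from typing import Any


def build_lint_command(url: str, single: bool = False) -> list[str]:
    """Assemble the documented CLI invocation for JSON output."""
    cmd = ["context-cli", "lint", url, "--json"]
    if single:
        cmd.append("--single")
    return cmd


def lint_as_json(url: str, single: bool = False, run=subprocess.run) -> Any:
    """Run the CLI and decode its stdout as JSON.

    `run` is injectable so the function can be exercised without the CLI
    installed. Assumption: the report is a single JSON document on stdout.
    """
    # check=False so a non-zero exit status does not raise before the
    # output has been read.
    proc = run(build_lint_command(url, single), capture_output=True, text=True, check=False)
    return json.loads(proc.stdout)
```

Calling lint_as_json("example.com", single=True) would then return the decoded report for further processing.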
CSV / Markdown Output
context-cli lint example.com --format csv
context-cli lint example.com --format markdown
Verbose Mode
Show detailed per-pillar breakdown with scoring explanations:
context-cli lint example.com --single --verbose
Timeout
Set the HTTP timeout (default: 15 seconds):
context-cli lint example.com --timeout 30
Custom Bot List
Override the default 13 bots with a custom list:
context-cli lint example.com --bots "GPTBot,ClaudeBot,PerplexityBot"
Batch Mode
Lint multiple URLs from a file (one URL per line, .txt or .csv):
context-cli lint --file urls.txt
context-cli lint --file urls.txt --concurrency 5
context-cli lint --file urls.txt --format csv
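To illustrate what a concurrency limit means for a batch run, here is a minimal sketch of bounded parallelism with asyncio. It is not the tool's implementation: read_urls, lint_all, and fake_lint are hypothetical names, and fake_lint stands in for a real lint call.

```python
import asyncio
from typing import Awaitable, Callable


def read_urls(text: str) -> list[str]:
    """Return the non-blank lines of a one-URL-per-line file body."""
    return [line.strip() for line in text.splitlines() if line.strip()]


async def lint_all(
    urls: list[str],
    lint_one: Callable[[str], Awaitable[dict]],
    concurrency: int,
) -> list[dict]:
    """Run lint_one over urls with at most `concurrency` calls in flight."""
    gate = asyncio.Semaphore(concurrency)

    async def guarded(url: str) -> dict:
        async with gate:
            return await lint_one(url)

    # gather preserves input order, so results line up with urls.
    return await asyncio.gather(*(guarded(u) for u in urls))


async def fake_lint(url: str) -> dict:
    """Hypothetical stand-in for a real lint call; returns a dummy record."""
    await asyncio.sleep(0)
    return {"url": url, "score": 0}


urls = read_urls("example.com\n\nexample.org\n")
results = asyncio.run(lint_all(urls, fake_lint, concurrency=2))
```

A higher limit finishes sooner but puts more simultaneous load on the target sites and the headless browser.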
CI Mode
Fail the build if the score is below a threshold:
context-cli lint example.com --fail-under 60
Fail if any AI bot is blocked:
context-cli lint example.com --fail-on-blocked-bots
Per-Pillar Thresholds
Gate CI on individual pillar scores:
context-cli lint example.com --robots-min 20 --content-min 30 --overall-min 60
Available: --robots-min, --schema-min, --content-min, --llms-min, --overall-min.
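Conceptually, each minimum is a floor on one score, and the gate fails if any floor is missed. A small sketch of that check follows; the dictionary keys are hypothetical labels that mirror the flag names, not the tool's internal identifiers.

```python
def failed_thresholds(scores: dict[str, float], minimums: dict[str, float]) -> list[str]:
    """Return the names whose score is below its configured minimum."""
    return [name for name, floor in minimums.items() if scores.get(name, 0) < floor]


# Mirrors: --robots-min 20 --content-min 30 --overall-min 60
minimums = {"robots": 20, "content": 30, "overall": 60}
scores = {"robots": 25, "content": 28, "overall": 71}  # made-up example values
failures = failed_thresholds(scores, minimums)
```

Here only the content floor is missed, so failures contains "content" even though the overall score clears its own minimum.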
Baseline Regression Detection
Save a baseline and detect score regressions in future lints:
# Save current scores as baseline
context-cli lint example.com --single --save-baseline .context-baseline.json
# Compare against baseline (exit 1 if any pillar drops > 5 points)
context-cli lint example.com --single --baseline .context-baseline.json
# Custom regression threshold
context-cli lint example.com --single --baseline .context-baseline.json --regression-threshold 10
Exit codes: 0 = pass, 1 = score below threshold or regression detected, 2 = bots blocked.
When running in GitHub Actions, a markdown summary is automatically written to $GITHUB_STEP_SUMMARY.
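The baseline comparison can be pictured as a per-pillar difference against the saved scores. The sketch below assumes a flat pillar-to-score mapping; the actual layout of .context-baseline.json is not documented here, and find_regressions is a hypothetical helper.

```python
def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = 5,
) -> dict[str, float]:
    """Map each pillar that dropped by more than `threshold` points to its drop."""
    drops = {}
    for pillar, old in baseline.items():
        new = current.get(pillar)
        if new is None:
            continue  # pillar missing from the current run; nothing to compare
        drop = old - new
        if drop > threshold:
            drops[pillar] = drop
    return drops


# Made-up example scores, keyed by hypothetical pillar labels.
baseline = {"content": 34, "robots": 25, "schema": 15, "llms": 10}
current = {"content": 27, "robots": 25, "schema": 12, "llms": 10}

regressions = find_regressions(baseline, current)
exit_code = 1 if regressions else 0  # 1 = regression detected, per the exit codes above
```

With the default threshold of 5, the 7-point content drop counts as a regression while the 3-point schema drop does not.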
Quiet Mode
Suppress all output and report the result through the exit code only (0 if the score is at least 50, 1 otherwise):
context-cli lint example.com --quiet
Use --fail-under with --quiet to override the default threshold:
context-cli lint example.com --quiet --fail-under 70
Start MCP server
context-cli mcp
Launches a FastMCP stdio server exposing the linter as a tool for AI agents.
MCP Integration
To use Context CLI as a tool in Claude Desktop, add this to your Claude Desktop config (claude_desktop_config.json):
{
  "mcpServers": {
    "context-cli": {
      "command": "context-cli",
      "args": ["mcp"]
    }
  }
}
Once configured, Claude can call the audit_url tool directly to check any URL's LLM readiness.
Context Compiler (Generate)
Generate llms.txt and schema.jsonld files from any URL using LLM analysis:
pip install "context-linter[generate]"
context-cli generate example.com
This crawls the URL, sends the content to an LLM, and writes optimized files to ./context-output/.
Batch Generate
Generate assets for multiple URLs from a file:
context-cli generate-batch urls.txt
context-cli generate-batch urls.txt --concurrency 5 --profile ecommerce
context-cli generate-batch urls.txt --json
Each URL's output goes to a subdirectory under --output-dir.
BYOK (Bring Your Own Key)
The generate command auto-detects your LLM provider from environment variables:
| Priority | Env Variable | Model Used |
|---|---|---|
| 1 | OPENAI_API_KEY | gpt-4o-mini |
| 2 | ANTHROPIC_API_KEY | claude-3-haiku-20240307 |
| 3 | Ollama running locally | ollama/llama3.2 |
Override with --model:
context-cli generate example.com --model gpt-4o
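The priority order in the table amounts to a first-match lookup. The sketch below is one way to express it, not the tool's code: pick_model is a hypothetical name, and since this README does not say how a local Ollama instance is detected, that check is reduced to a boolean the caller supplies.

```python
from typing import Mapping, Optional


def pick_model(
    env: Mapping[str, str],
    ollama_running: bool = False,
    override: Optional[str] = None,
) -> Optional[str]:
    """Resolve the model name using the documented priority order."""
    if override:  # corresponds to passing --model
        return override
    if env.get("OPENAI_API_KEY"):
        return "gpt-4o-mini"
    if env.get("ANTHROPIC_API_KEY"):
        return "claude-3-haiku-20240307"
    if ollama_running:  # assumption: detection happens elsewhere
        return "ollama/llama3.2"
    return None  # no provider available


both_keys = pick_model({"OPENAI_API_KEY": "sk-x", "ANTHROPIC_API_KEY": "sk-y"})
```

With both keys set, the OpenAI entry wins because it has priority 1; in real use the mapping passed in would be os.environ.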
Industry Profiles
Tailor the output with --profile:
context-cli generate example.com --profile saas
context-cli generate example.com --profile ecommerce
Available: generic, cpg, saas, ecommerce, blog.
Citation Radar
Query AI models to see what they cite and recommend for any search prompt:
pip install "context-linter[generate]"
context-cli radar "best project management tools" --brand Asana --brand Monday --model gpt-4o-mini
Options:
- --brand/-b: Brand name to track (repeatable)
- --model/-m: LLM model to query (repeatable, default: gpt-4o-mini)
- --runs/-r: Runs per model for statistical significance
- --json: Output as JSON
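As a rough picture of brand tracking, the sketch below counts how many model responses mention each brand, using a case-insensitive whole-word match. This is a naive stand-in rather than the tool's analysis; count_brand_mentions is a hypothetical helper and the responses are made-up strings.

```python
import re


def count_brand_mentions(responses: list[str], brands: list[str]) -> dict[str, int]:
    """Count the responses in which each brand appears as a whole word."""
    counts = {brand: 0 for brand in brands}
    for text in responses:
        for brand in brands:
            pattern = rf"\b{re.escape(brand)}\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[brand] += 1
    return counts


responses = [
    "Asana and Trello are popular.",
    "Try monday or Asana.",
]
mentions = count_brand_mentions(responses, ["Asana", "Monday"])
```

Repeating the query with --runs gives more responses to count over, which smooths out run-to-run variation in what a model recommends.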
Share-of-Recommendation Benchmark
Track how AI models mention and recommend your brand across multiple prompts:
pip install "context-linter[generate]"
context-cli benchmark prompts.txt -b "YourBrand" -c "Competitor1" -c "Competitor2"
Options:
- prompts.txt: CSV (with prompt,category,intent columns) or plain text (one prompt per line)
- --brand/-b: Target brand to track (required)
- --competitor/-c: Competitor brand (repeatable)
- --model/-m: LLM model to query (repeatable, default: gpt-4o-mini)
- --runs/-r: Runs per model per prompt (default: 3)
- --yes/-y: Skip cost confirmation prompt
- --json: Output as JSON
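The two accepted prompt-file shapes can be told apart by the first line. The sketch below treats the file as CSV when that line contains a prompt column and as plain text otherwise. That detection rule is an assumption, and load_prompts is a hypothetical helper, not the tool's parser.

```python
import csv

FIELDS = ("prompt", "category", "intent")


def load_prompts(text: str) -> list[dict[str, str]]:
    """Parse prompt-file contents as CSV (prompt,category,intent) or plain text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = [cell.strip().lower() for cell in next(csv.reader([lines[0]]))]
    if "prompt" not in header:
        # Plain text: every non-blank line is one prompt.
        return [{"prompt": ln.strip(), "category": "", "intent": ""} for ln in lines]
    records = []
    for row in csv.reader(lines[1:]):
        cells = dict(zip(header, (cell.strip() for cell in row)))
        records.append({key: cells.get(key, "") for key in FIELDS})
    return records


csv_prompts = load_prompts('prompt,category,intent\n"best CRM, for startups",crm,commercial\n')
plain_prompts = load_prompts("best crm\nbest erp\n")
```

Quoting a prompt that contains a comma, as in the CSV example, keeps it in a single column.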
GitHub Action
Use Context CLI in your CI pipeline:
- name: Run Context Lint
  uses: hanselhansel/context-cli@main
  with:
    url: 'https://your-site.com'
    fail-under: '60'
With baseline regression detection:
- name: Run Context Lint
  uses: hanselhansel/context-cli@main
  with:
    url: 'https://your-site.com'
    baseline-file: '.context-baseline.json'
    save-baseline: '.context-baseline.json'
    regression-threshold: '5'
The action sets up Python, installs context-cli, and runs the lint. It exposes score and report-json as outputs for downstream steps. See docs/ci-integration.md for full documentation.
Score Breakdown
Context CLI returns a score from 0 to 100, composed of four pillars:
| Pillar | Max Points | What it measures |
|---|---|---|
| Content density | 40 | Quality and depth of extractable text content |
| Robots.txt AI bot access | 25 | Whether AI crawlers are allowed in robots.txt |
| Schema.org JSON-LD | 25 | Structured data markup (Product, Article, FAQ, etc.) |
| llms.txt presence | 10 | Whether a /llms.txt file exists for LLM guidance |
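The four maxima in the table add up to 100, so the overall score can be read as a capped sum of pillar scores. A minimal sketch of that arithmetic follows, with hypothetical pillar labels and made-up example values. How the tool aggregates scores across multiple pages is not covered here.

```python
# Maximum points per pillar, taken from the table above.
PILLAR_MAX = {"content": 40, "robots": 25, "schema": 25, "llms": 10}


def overall_score(pillars: dict[str, float]) -> float:
    """Sum pillar scores, clamping each to the range [0, pillar maximum]."""
    total = 0.0
    for name, cap in PILLAR_MAX.items():
        total += min(max(pillars.get(name, 0.0), 0.0), cap)
    return total


example = overall_score({"content": 31, "robots": 25, "schema": 12, "llms": 0})
```

In this example a page with full robots.txt access but no llms.txt and partial structured data lands at 68 out of 100.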
Scoring rationale (2026-02-18)
The weights reflect how AI search engines (ChatGPT, Perplexity, Claude) actually consume web content:
- Content density (40 pts) is weighted highest because it's what LLMs extract and cite when answering questions. Rich, well-structured content with headings and lists gives AI better material to work with.
- Robots.txt (25 pts) is the gatekeeper -- if a bot is blocked, it literally cannot crawl. It's critical but largely binary (either you're blocking or you're not).
- Schema.org (25 pts) provides structured "cheat sheets" that help AI understand entities. High-value types (Product, Article, FAQ, HowTo, Recipe) receive bonus weighting. Valuable but not required for citation.
- llms.txt (10 pts) is an emerging standard. Both /llms.txt and /llms-full.txt are checked. No major AI search engine heavily weights it yet, but it signals forward-thinking AI readiness.
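Because both files are addressed by root-relative paths, checking for them reduces to building two URLs from the site origin and probing each. In the sketch below, llms_txt_urls and detect_llms_txt are hypothetical helpers, fetch_status is a caller-supplied function returning an HTTP status code, and defaulting bare hosts to HTTPS is an assumption.

```python
from typing import Callable
from urllib.parse import urlsplit

LLMS_PATHS = ("/llms.txt", "/llms-full.txt")


def llms_txt_urls(page_url: str) -> list[str]:
    """Return the site-root URLs where the two files would live."""
    if "://" not in page_url:
        page_url = "https://" + page_url  # assumption: bare hosts default to HTTPS
    parts = urlsplit(page_url)
    root = f"{parts.scheme}://{parts.netloc}"
    return [root + path for path in LLMS_PATHS]


def detect_llms_txt(page_url: str, fetch_status: Callable[[str], int]) -> dict[str, bool]:
    """Map each candidate URL to whether it answered with HTTP 200."""
    return {url: fetch_status(url) == 200 for url in llms_txt_urls(page_url)}


candidates = llms_txt_urls("example.com/blog/post")
```

Passing the status lookup in as a function keeps the sketch independent of any particular HTTP client.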
AI Bots Checked
Context CLI checks access rules for 13 AI crawlers:
- GPTBot
- ChatGPT-User
- Google-Extended
- ClaudeBot
- PerplexityBot
- Amazonbot
- OAI-SearchBot
- DeepSeek-AI
- Grok
- Meta-ExternalAgent
- cohere-ai
- AI2Bot
- ByteSpider
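A comparable check can be approximated with Python's standard urllib.robotparser, as sketched below. This is not the tool's implementation, and the standard-library parser's rule matching may differ from how individual crawlers interpret robots.txt. The sample robots.txt and the blocked_bots helper are illustrative.

```python
from urllib.robotparser import RobotFileParser

# The 13 user agents listed above.
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "Google-Extended", "ClaudeBot", "PerplexityBot",
    "Amazonbot", "OAI-SearchBot", "DeepSeek-AI", "Grok", "Meta-ExternalAgent",
    "cohere-ai", "AI2Bot", "ByteSpider",
]


def blocked_bots(robots_txt: str, url: str, bots: list[str] = AI_BOTS) -> list[str]:
    """Return the bots that the given robots.txt body forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_bots(SAMPLE_ROBOTS, "https://example.com/")
```

With this sample file only GPTBot is reported as blocked; every other listed bot falls through to the wildcard group, which allows everything.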
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Lint
ruff check src/ tests/
License
MIT