Skip to main content

Find wrong information in technical docs online

Project description

pantsonfire ๐Ÿ”ฅ

Find wrong information in technical documentation online. A tool for detecting outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing against official documentation.

โœจ Key Features

  • ๐Ÿง  Natural Language Analysis: Use simple English commands like "find outdated API info on tech blogs"
  • ๐Ÿ•ท๏ธ Intelligent Web Crawling: Automatically discover similar issues across entire websites
  • ๐Ÿ“š Oxen AI Integration: Versioned, traceable storage with complete audit trails
  • ๐Ÿ” Multi-Level Detection: Pattern matching + AI-powered analysis for comprehensive coverage
  • ๐ŸŒ Universal Sources: Websites, GitHub repos, documentation sites, local files
  • ๐Ÿ“Š Rich Reporting: Browser-integrated reports with JSON/CSV export
  • ๐Ÿš€ Dual Analysis Modes: Basic pattern matching or full LLM analysis via OpenRouter
  • ๐Ÿ”— Automatic Report Opening: Direct links to versioned analysis results

Installation

pip install -e .

Environment Setup

Create a .env file or set environment variables:

# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here

# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here

๐Ÿš€ Quick Start

Natural Language Analysis

# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report

Traditional Analysis

# Check specific content
pantsonfire --mode external check 
    "https://blog.example.com/outdated-tutorial" 
    "https://docs.example.com/current-api" 
    --crawl --open-report

๐Ÿ“š Oxen AI Integration

Pantsonfire uses Oxen AI for versioned, traceable data storage:

  • Automatic Repository Creation: Each analysis gets its own Oxen repository
  • Versioned Branches: Findings stored in timestamped branches
  • Complete Traceability: All prompts, content, and metadata preserved
  • Web Interface: Direct links to browse analysis results
  • Collaborative: Multiple analysts can contribute to findings

Storage Structure

your-namespace/
โ”œโ”€โ”€ analysis_check_20241023_143052/
โ”‚   โ”œโ”€โ”€ data/
โ”‚   โ”‚   โ”œโ”€โ”€ findings.json
โ”‚   โ”‚   โ””โ”€โ”€ findings.csv
โ”‚   โ”œโ”€โ”€ reports/
โ”‚   โ”‚   โ””โ”€โ”€ findings.txt
โ”‚   โ”œโ”€โ”€ sources/
โ”‚   โ”‚   โ”œโ”€โ”€ extracted_content.txt
โ”‚   โ””โ”€โ”€ metadata/
โ”‚       โ””โ”€โ”€ analysis_metadata.json

Configuration

  1. Get an OpenRouter API key from openrouter.ai/keys
  2. Set your API key:
export OPENROUTER_API_KEY="your_key_here"

Or create a .env file:

cp .env.example .env
# Edit .env with your API key

Usage

Basic Check

Check a blog post against official documentation:

# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md

# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current

View Results

# View recent detections
pantsonfire logs

# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv

Configuration

# Test LLM connection
pantsonfire config --test

# View current config
pantsonfire config

Real-World Example: Oxen AI Documentation Analysis

pantsonfire successfully identified outdated "Get Early Access" references across Oxen AI's website. See oxen-ai-example.md for a complete demonstration.

Contextual Hints

Provide natural language hints to guide the LLM analysis:

pantsonfire check "blog-url" "docs-url" --hints "the beta program ended in 2024 and docs now show the production API"

This helps the LLM focus on specific types of changes you're looking for.

Natural Language Analysis

pantsonfire analyze "the oxen website has outdated get early access buttons for fine tuning, find all similar issues on their site" --openrouter --crawl --open-report

Direct URL Analysis

pantsonfire check "https://www.oxen.ai/entry/fine-tuning-a-with-oxen-ai" \
  "https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune" \
  "https://github.com/Oxen-AI/Oxen" \
  --hints "the early access program is done and the api docs show the ground truth today" \
  --openrouter --open-report

Example Output

๐Ÿ”ฅ ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35

Architecture

  • Factory Pattern: Simple app creation with mode switching
  • Modular Extractors: Separate handling for local vs web content
  • LLM Integration: Structured prompts for factual verification
  • Storage Backends: Extensible result storage (JSON default)

Development

Run tests:

python tests/test_sample.py

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pantsonfire-0.2.9.tar.gz (59.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pantsonfire-0.2.9-py3-none-any.whl (60.6 kB view details)

Uploaded Python 3

File details

Details for the file pantsonfire-0.2.9.tar.gz.

File metadata

  • Download URL: pantsonfire-0.2.9.tar.gz
  • Upload date:
  • Size: 59.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.2.9.tar.gz
Algorithm Hash digest
SHA256 8e374a0902de87a7733e28ac0915db3793a1fa2181629735cadf83f33a7fc0d4
MD5 21770426bde90f408744ee952cc0fb32
BLAKE2b-256 1e681a9d5c6aef7ed94c17ac0794fd950172d164f517c535f68556bd4075df9a

See more details on using hashes here.

File details

Details for the file pantsonfire-0.2.9-py3-none-any.whl.

File metadata

  • Download URL: pantsonfire-0.2.9-py3-none-any.whl
  • Upload date:
  • Size: 60.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.2.9-py3-none-any.whl
Algorithm Hash digest
SHA256 0a0e497407c18f4cf0c68cc64da842fa75e3bd820114b0ec7b6ccedbc01ce54f
MD5 607f89dc7be85d65eac9859047a9b4b2
BLAKE2b-256 ab0d60305d562fa1417f574a3b7e649198615050dece75500c18205e8513843a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page