Find wrong information in technical docs online

pantsonfire 🔥

Find wrong information in technical documentation online. A tool for detecting outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing against official documentation.

✨ Key Features

  • 🧠 Natural Language Analysis: Use simple English commands like "find outdated API info on tech blogs"
  • 🕷️ Intelligent Web Crawling: Automatically discover similar issues across entire websites
  • 📚 Oxen AI Integration: Versioned, traceable storage with complete audit trails
  • 🔍 Multi-Level Detection: Pattern matching + AI-powered analysis for comprehensive coverage
  • 🌐 Universal Sources: Websites, GitHub repos, documentation sites, local files
  • 📊 Rich Reporting: Browser-integrated reports with JSON/CSV export
  • 🚀 Dual Analysis Modes: Basic pattern matching or full LLM analysis via OpenRouter
  • 🔗 Automatic Report Opening: Direct links to versioned analysis results
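
The basic pattern-matching mode is not specified in detail here. As a rough illustration only, one simple form of it is to flag phrases that still appear in a blog post but no longer appear in the official docs. The helper name `find_stale_phrases` and the sample texts are hypothetical and are not part of pantsonfire's API.

```python
import re


def find_stale_phrases(blog_text, docs_text, phrases):
    """Return phrases found in the blog text but absent from the docs text.

    Hypothetical helper for illustration; not pantsonfire's actual detector.
    """
    stale = []
    docs_lower = docs_text.lower()
    for phrase in phrases:
        # Whole-phrase, case-insensitive match in the blog text.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        if pattern.search(blog_text) and phrase.lower() not in docs_lower:
            stale.append(phrase)
    return stale


blog = "Click Get Early Access to try fine-tuning. The v1 API is in beta."
docs = "Fine-tuning is generally available. See the v1 API reference."
print(find_stale_phrases(blog, docs, ["Get Early Access", "v1 API"]))
# ['Get Early Access']
```

A check like this is cheap but literal: it cannot tell whether a phrase was renamed or genuinely retired, which is presumably where the LLM analysis mode adds value.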

Installation

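# Install the released package from PyPI
pip install pantsonfire

# Or, from a source checkout, install in editable mode
pip install -e .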

Environment Setup

Create a .env file or set environment variables:

# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here

# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here
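
The fallback behaviour described in the comments above can be sketched as follows. This is an illustration under assumptions, not pantsonfire's code: the function `resolve_modes` and the mode labels it returns are made up for the example; only the two environment variable names come from this README.

```python
import os


def resolve_modes(env=None):
    """Pick analysis and storage modes from environment variables.

    Illustrative only: the returned labels are assumptions.
    """
    env = os.environ if env is None else env
    # No OpenRouter key: fall back to pattern matching.
    analysis = "llm" if env.get("OPENROUTER_API_KEY") else "pattern"
    # No Oxen key: fall back to local storage.
    storage = "oxen" if env.get("OXEN_API_KEY") else "local"
    return analysis, storage


print(resolve_modes({}))
# ('pattern', 'local')
print(resolve_modes({"OPENROUTER_API_KEY": "your_openrouter_key_here"}))
# ('llm', 'local')
```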

🚀 Quick Start

Natural Language Analysis

# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report

Traditional Analysis

# Check specific content
pantsonfire --mode external check \
    "https://blog.example.com/outdated-tutorial" \
    "https://docs.example.com/current-api" \
    --crawl --open-report

📚 Oxen AI Integration

Pantsonfire uses Oxen AI for versioned, traceable data storage:

  • Automatic Repository Creation: Each analysis gets its own Oxen repository
  • Versioned Branches: Findings stored in timestamped branches
  • Complete Traceability: All prompts, content, and metadata preserved
  • Web Interface: Direct links to browse analysis results
  • Collaborative: Multiple analysts can contribute to findings

Storage Structure

your-namespace/
├── analysis_check_20241023_143052/
│   ├── data/
│   │   ├── findings.json
│   │   └── findings.csv
│   ├── reports/
│   │   └── findings.txt
│   ├── sources/
│   │   ├── extracted_content.txt
│   └── metadata/
│       └── analysis_metadata.json
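
To locate files from a given run programmatically, the layout above could be reproduced along these lines. The naming scheme `analysis_<command>_<YYYYMMDD_HHMMSS>` is inferred from the single example directory name and may not match what pantsonfire actually generates; `analysis_paths` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import PurePosixPath


def analysis_paths(namespace, command, when):
    """Build the expected file paths for one analysis run.

    The directory naming is inferred from the example tree, not from
    pantsonfire's source.
    """
    root = PurePosixPath(namespace) / f"analysis_{command}_{when:%Y%m%d_%H%M%S}"
    return {
        "findings_json": root / "data" / "findings.json",
        "findings_csv": root / "data" / "findings.csv",
        "report": root / "reports" / "findings.txt",
        "content": root / "sources" / "extracted_content.txt",
        "metadata": root / "metadata" / "analysis_metadata.json",
    }


paths = analysis_paths("your-namespace", "check", datetime(2024, 10, 23, 14, 30, 52))
print(paths["findings_json"])
# your-namespace/analysis_check_20241023_143052/data/findings.json
```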

Configuration

  1. Get an OpenRouter API key from openrouter.ai/keys
  2. Set your API key:
export OPENROUTER_API_KEY="your_key_here"

Or create a .env file:

cp .env.example .env
# Edit .env with your API key

Usage

Basic Check

Check a blog post against official documentation:

# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md

# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current
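
If you want to drive these checks from a script, a thin wrapper around the CLI is one option. The sketch below uses only the `check` subcommand and `--mode external` flag shown in this README; the function names are hypothetical, and nothing is assumed about exit codes or output format.

```python
import subprocess


def build_check_command(blog, docs, external=False):
    """Assemble the argument list for `pantsonfire check`.

    Uses only the subcommand and flag shown in this README.
    """
    cmd = ["pantsonfire"]
    if external:
        cmd += ["--mode", "external"]
    cmd += ["check", blog, docs]
    return cmd


def run_check(blog, docs, external=False):
    # Requires pantsonfire to be installed and on PATH.
    return subprocess.run(
        build_check_command(blog, docs, external),
        capture_output=True,
        text=True,
    )


print(build_check_command("blog_post.md", "official_docs.md"))
# ['pantsonfire', 'check', 'blog_post.md', 'official_docs.md']
```

Passing arguments as a list rather than a shell string avoids quoting problems with URLs that contain `#` or `&`.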

View Results

# View recent detections
pantsonfire logs

# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv
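
Once exported, the JSON file can be post-processed with ordinary tooling. The export schema is not documented here, so the sketch below assumes a list of objects with a numeric `confidence` key and a `problem` key, mirroring the fields in the example report; adjust the key names to whatever your export actually contains.

```python
import json


def high_confidence(findings, threshold=0.8):
    """Keep findings whose confidence meets the threshold.

    Assumes each finding is a dict with a numeric "confidence" key;
    the real export schema may differ.
    """
    return [f for f in findings if float(f.get("confidence", 0)) >= threshold]


# Stand-in for loading results.json from disk; this sample data is invented.
sample = json.loads(
    '[{"problem": "outdated button", "confidence": 0.9},'
    ' {"problem": "minor wording", "confidence": 0.4}]'
)
for finding in high_confidence(sample):
    print(finding["problem"], finding["confidence"])
# outdated button 0.9
```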

Configuration

# Test LLM connection
pantsonfire config --test

# View current config
pantsonfire config

Real-World Example: Oxen AI Documentation Analysis

pantsonfire successfully identified outdated "Get Early Access" references across Oxen AI's website. See oxen-ai-example.md for a complete demonstration.

Contextual Hints

Provide natural language hints to guide the LLM analysis:

pantsonfire check "blog-url" "docs-url" --hints "the beta program ended in 2024 and docs now show the production API"

This helps the LLM focus on specific types of changes you're looking for.
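
How the hint reaches the model is not described in this README. One plausible shape, shown purely as a sketch, is to splice the hint into the prompt alongside the two texts. The function `build_prompt` and all of its wording are invented for illustration.

```python
def build_prompt(blog_text, docs_text, hints=None):
    """Compose a verification prompt; the wording is a made-up example."""
    parts = [
        "Compare the BLOG text against the OFFICIAL DOCS.",
        "List claims in the blog that the docs contradict or no longer support.",
    ]
    if hints:
        # The hint narrows what kind of discrepancy the model should look for.
        parts.append(f"Context from the user: {hints}")
    parts.append(f"BLOG:\n{blog_text}")
    parts.append(f"OFFICIAL DOCS:\n{docs_text}")
    return "\n\n".join(parts)


print(build_prompt("Sign up for the beta.", "The API is in production.",
                   hints="the beta program ended in 2024"))
```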

Natural Language Analysis

pantsonfire analyze "the oxen website has outdated get early access buttons for fine tuning, find all similar issues on their site" --openrouter --crawl --open-report

Direct URL Analysis

pantsonfire check "https://www.oxen.ai/entry/fine-tuning-a-with-oxen-ai" \
  "https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune" \
  "https://github.com/Oxen-AI/Oxen" \
  --hints "the early access program is done and the api docs show the ground truth today" \
  --openrouter --open-report

Example Output

🔥 ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35
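
For quick scripting against the text report, each issue block can be read as `Key: value` lines. The parser below is a sketch that assumes the layout shown above stays stable; `parse_issue` is a hypothetical helper, and the JSON export is likely a more robust source if you need structured data.

```python
def parse_issue(block):
    """Parse 'Key: value' lines from one issue block into a dict."""
    fields = {}
    for line in block.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines without a 'Key: value' shape, such as the header
            fields[key.strip().lower()] = value.strip()
    if "confidence" in fields:
        fields["confidence"] = float(fields["confidence"])
    return fields


example = """ISSUE #1
Blog: unknown
Confidence: 0.90
Time: 2025-10-23T22:52:35"""
issue = parse_issue(example)
print(issue["confidence"], issue["time"])
# 0.9 2025-10-23T22:52:35
```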

Architecture

  • Factory Pattern: Simple app creation with mode switching
  • Modular Extractors: Separate handling for local vs web content
  • LLM Integration: Structured prompts for factual verification
  • Storage Backends: Extensible result storage (JSON default)
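
The factory and extractor points above can be pictured with a small sketch. The class and function names (`LocalExtractor`, `WebExtractor`, `App`, `create_app`) are hypothetical and are not taken from pantsonfire's source; only the mode names `internal` and `external` come from this README.

```python
from pathlib import Path
from urllib.request import urlopen


class LocalExtractor:
    """Reads content from a local file."""

    def extract(self, source):
        return Path(source).read_text(encoding="utf-8")


class WebExtractor:
    """Fetches content from a URL."""

    def extract(self, source):
        with urlopen(source) as resp:
            return resp.read().decode("utf-8", errors="replace")


class App:
    def __init__(self, extractor):
        self.extractor = extractor


def create_app(mode="internal"):
    """Factory: choose the extractor that matches the requested mode."""
    extractors = {"internal": LocalExtractor, "external": WebExtractor}
    if mode not in extractors:
        raise ValueError(f"unknown mode: {mode!r}")
    return App(extractors[mode]())


app = create_app("external")
print(type(app.extractor).__name__)
# WebExtractor
```

Keeping the mode-to-extractor mapping in one place means a new source type only needs a new class and one dictionary entry.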

Development

Run tests:

python tests/test_sample.py

License

MIT

Download files

Source Distribution

pantsonfire-0.1.9.tar.gz (34.5 kB)

Built Distribution

pantsonfire-0.1.9-py3-none-any.whl (36.5 kB)

File details

Details for the file pantsonfire-0.1.9.tar.gz.

File metadata

  • Download URL: pantsonfire-0.1.9.tar.gz
  • Upload date:
  • Size: 34.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.1.9.tar.gz
Algorithm    Hash digest
SHA256       6ddcde2e7de35bf5377b340d873d608546dc83c36cad5037748493be522217df
MD5          4a3efd4377f335cca9aac024319227df
BLAKE2b-256  7413deff37e78ce96bc325f3a7f6493a31e582f6984ce8f73a65222c7d7703c1
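
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a generic sketch: the helper names are arbitrary, and the file path you pass in is wherever you saved the download.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest published for pantsonfire-0.1.9.tar.gz.
EXPECTED = "6ddcde2e7de35bf5377b340d873d608546dc83c36cad5037748493be522217df"


def verify(path, expected=EXPECTED):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("pantsonfire-0.1.9.tar.gz")` should return `True` for an intact download of the source distribution.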

File details

Details for the file pantsonfire-0.1.9-py3-none-any.whl.

File metadata

  • Download URL: pantsonfire-0.1.9-py3-none-any.whl
  • Upload date:
  • Size: 36.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.1.9-py3-none-any.whl
Algorithm    Hash digest
SHA256       6b1ef1f5ee1f847dcf22e7ee6220fae867e80593f9c0544010b3b8d6318ecc07
MD5          59a977ea54f2b852dd33c2cc1baa2400
BLAKE2b-256  98242129df86d5eb107c223ea30745754c88b537fdcde96b59bb3b4928a30c07
