pantsonfire 🔥

Find wrong information in technical documentation online. pantsonfire detects outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing them against official documentation.

✨ Key Features

  • 🧠 Natural Language Analysis: Use simple English commands like "find outdated API info on tech blogs"
  • 🕷️ Intelligent Web Crawling: Automatically discover similar issues across entire websites
  • 📚 Oxen AI Integration: Versioned, traceable storage with complete audit trails
  • 🔍 Multi-Level Detection: Pattern matching + AI-powered analysis for comprehensive coverage
  • 🌐 Universal Sources: Websites, GitHub repos, documentation sites, local files
  • 📊 Rich Reporting: Browser-integrated reports with JSON/CSV export
  • 🚀 Dual Analysis Modes: Basic pattern matching or full LLM analysis via OpenRouter
  • 🔗 Automatic Report Opening: Direct links to versioned analysis results

Installation

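# Install the released package from PyPI
pip install pantsonfire

# Or, from a checkout of the source tree (editable install)
pip install -e .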

Environment Setup

Create a .env file or set environment variables:

# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here

# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here
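
The two fallbacks noted in the comments above can be pictured with a short Python sketch. Only the environment variable names come from this README; the function name `select_backends` and the returned labels are hypothetical and are not part of pantsonfire's API.

```python
import os


def select_backends(env=None):
    """Pick analysis and storage backends from environment variables.

    Hypothetical helper: it only illustrates the documented fallbacks.
    """
    env = os.environ if env is None else env
    # LLM analysis needs an OpenRouter key; otherwise use pattern matching.
    analysis = "llm" if env.get("OPENROUTER_API_KEY") else "pattern"
    # Oxen storage needs an Oxen key; otherwise keep results locally.
    storage = "oxen" if env.get("OXEN_API_KEY") else "local"
    return {"analysis": analysis, "storage": storage}


print(select_backends({}))
print(select_backends({"OPENROUTER_API_KEY": "your_openrouter_key_here"}))
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the decision easy to check without touching the real environment.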

🚀 Quick Start

Natural Language Analysis

# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report

Traditional Analysis

# Check specific content
pantsonfire --mode external check \
    "https://blog.example.com/outdated-tutorial" \
    "https://docs.example.com/current-api" \
    --crawl --open-report

📚 Oxen AI Integration

Pantsonfire uses Oxen AI for versioned, traceable data storage:

  • Automatic Repository Creation: Each analysis gets its own Oxen repository
  • Versioned Branches: Findings stored in timestamped branches
  • Complete Traceability: All prompts, content, and metadata preserved
  • Web Interface: Direct links to browse analysis results
  • Collaborative: Multiple analysts can contribute to findings

Storage Structure

your-namespace/
├── analysis_check_20241023_143052/
│   ├── data/
│   │   ├── findings.json
│   │   └── findings.csv
│   ├── reports/
│   │   └── findings.txt
│   ├── sources/
│   │   ├── extracted_content.txt
│   └── metadata/
│       └── analysis_metadata.json
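
The directory name in the layout above looks like a command name followed by a `YYYYMMDD_HHMMSS` timestamp. The sketch below rebuilds that name and the file paths under it on that assumption. The helper names `analysis_dir_name` and `expected_files` are hypothetical, and the naming rule is inferred from the single example shown here, not confirmed by the tool.

```python
from datetime import datetime
from pathlib import PurePosixPath


def analysis_dir_name(command: str, when: datetime) -> str:
    """Build a name like analysis_check_20241023_143052 (assumed format)."""
    return f"analysis_{command}_{when:%Y%m%d_%H%M%S}"


def expected_files(root: str) -> list[str]:
    """List the files shown in the storage layout, relative to root."""
    base = PurePosixPath(root)
    relative = [
        "data/findings.json",
        "data/findings.csv",
        "reports/findings.txt",
        "sources/extracted_content.txt",
        "metadata/analysis_metadata.json",
    ]
    return [str(base / item) for item in relative]


name = analysis_dir_name("check", datetime(2024, 10, 23, 14, 30, 52))
print(name)
for path in expected_files(name):
    print(path)
```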

Configuration

  1. Get an OpenRouter API key from openrouter.ai/keys
  2. Set your API key:
export OPENROUTER_API_KEY="your_key_here"

Or create a .env file:

cp .env.example .env
# Edit .env with your API key

Usage

Basic Check

Check a blog post against official documentation:

# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md

# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current

View Results

# View recent detections
pantsonfire logs

# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv
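
An exported JSON file can be post-processed with a few lines of Python. The sketch below assumes the export is a list of objects with `confidence` and `problem` keys, named after the labels in the example output; the real schema may differ, so check a file you exported before relying on these names. The function `high_confidence` is hypothetical and the sample records are illustrative stand-ins.

```python
import json


def high_confidence(findings, threshold=0.8):
    """Keep findings whose confidence meets the threshold."""
    return [f for f in findings if f.get("confidence", 0.0) >= threshold]


# In practice the list would come from the exported file, for example:
#   with open("results.json", encoding="utf-8") as fh:
#       findings = json.load(fh)
# The records below use an assumed schema and exist only for illustration.
sample_json = """
[
  {"confidence": 0.90, "problem": "References 'Get Early Access' which appears to be outdated"},
  {"confidence": 0.40, "problem": "Made-up low-confidence finding for illustration"}
]
"""
findings = json.loads(sample_json)
for finding in high_confidence(findings):
    print(f"{finding['confidence']:.2f}  {finding['problem']}")
```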

Configuration

# Test LLM connection
pantsonfire config --test

# View current config
pantsonfire config

Real-World Example: Oxen AI Documentation Analysis

pantsonfire successfully identified outdated "Get Early Access" references across Oxen AI's website. See oxen-ai-example.md for a complete demonstration.

Contextual Hints

Provide natural language hints to guide the LLM analysis:

pantsonfire check "blog-url" "docs-url" --hints "the beta program ended in 2024 and docs now show the production API"

This helps the LLM focus on specific types of changes you're looking for.
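
One way to picture how a hint could reach the model is as an extra section appended to the comparison prompt. This is a sketch under that assumption only: the function `build_prompt`, the section headings, and the instruction wording are hypothetical and do not reflect pantsonfire's actual prompts. The two short input texts are made up for the demonstration.

```python
def build_prompt(blog_text, docs_text, hints=None):
    """Assemble a comparison prompt, appending hints when provided."""
    sections = [
        "Compare the blog content against the official documentation "
        "and list statements in the blog that are outdated or incorrect.",
        "## Blog content\n" + blog_text.strip(),
        "## Official documentation\n" + docs_text.strip(),
    ]
    if hints:
        # Hints narrow the model's attention to a known kind of change.
        sections.append("## Reviewer hints\n" + hints.strip())
    return "\n\n".join(sections)


prompt = build_prompt(
    "Sign up for the beta to get an API key.",
    "API keys are available to all accounts.",
    hints="the beta program ended in 2024 and docs now show the production API",
)
print(prompt)
```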

Natural Language Analysis

pantsonfire analyze "the oxen website has outdated get early access buttons for fine tuning, find all similar issues on their site" --openrouter --crawl --open-report

Direct URL Analysis

pantsonfire check "https://www.oxen.ai/entry/fine-tuning-a-with-oxen-ai" \
  "https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune" \
  "https://github.com/Oxen-AI/Oxen" \
  --hints "the early access program is done and the api docs show the ground truth today" \
  --openrouter --open-report

Example Output

🔥 ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35
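
A finding like the one above can be approximated by a very simple pattern check: flag a phrase that appears in the blog text but not in the official documentation. The sketch below is an illustration of that idea, not pantsonfire's detection code. The function `find_stale_phrases` is hypothetical, the two sample texts are made up, and the confidence score is left out because how the tool computes it is not documented here.

```python
def find_stale_phrases(blog_text, docs_text, phrases):
    """Report phrases present in the blog but absent from the docs."""
    blog_lower = blog_text.lower()
    docs_lower = docs_text.lower()
    findings = []
    for phrase in phrases:
        needle = phrase.lower()
        # Case-insensitive substring match on both sides.
        if needle in blog_lower and needle not in docs_lower:
            findings.append({
                "problem": f"References '{phrase}' which appears to be outdated",
                "evidence": f"Official documentation no longer mentions '{phrase}'",
            })
    return findings


blog = "Click Get Early Access to try fine-tuning."
docs = "Kick off a fine-tune from the model page."
results = find_stale_phrases(blog, docs, ["Get Early Access"])
for finding in results:
    print(finding["problem"])
    print(finding["evidence"])
```

A check like this only catches exact phrases, which is why a second, LLM-based pass is useful for rewording and context-dependent claims.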

Architecture

  • Factory Pattern: Simple app creation with mode switching
  • Modular Extractors: Separate handling for local vs web content
  • LLM Integration: Structured prompts for factual verification
  • Storage Backends: Extensible result storage (JSON default)
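
The first two points can be illustrated with a small factory that returns a different extractor per mode. This is a minimal sketch, not the project's code: the class and function names are hypothetical, and only the mode names `internal` (local files) and `external` (web URLs) are taken from the usage section.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


class LocalExtractor:
    """Read content from a file on disk."""

    def extract(self, source: str) -> str:
        return Path(source).read_text(encoding="utf-8")


class WebExtractor:
    """Fetch content from a URL."""

    def extract(self, source: str) -> str:
        with urlopen(source, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")


EXTRACTORS = {"internal": LocalExtractor, "external": WebExtractor}


def create_extractor(mode: str = "internal"):
    """Factory: map a mode name to a ready-to-use extractor."""
    try:
        return EXTRACTORS[mode]()
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None


# Demonstrate the local path only, so the example needs no network access.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "blog_post.md"
    sample.write_text("Get Early Access", encoding="utf-8")
    text = create_extractor("internal").extract(str(sample))
print(text)
```

Keeping the mode-to-class mapping in one dictionary means a new source type only needs a new class and one extra entry.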

Development

Run tests:

python tests/test_sample.py

License

MIT
