Find wrong information in technical docs online

pantsonfire 🔥

pantsonfire detects outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing them against official documentation.

✨ Key Features

  • 🧠 Natural Language Analysis: Use simple English commands like "find outdated API info on tech blogs"
  • 🕷️ Intelligent Web Crawling: Automatically discover similar issues across entire websites
  • 📚 Oxen AI Integration: Versioned, traceable storage with complete audit trails
  • 🔍 Multi-Level Detection: Pattern matching + AI-powered analysis for comprehensive coverage
  • 🌐 Universal Sources: Websites, GitHub repos, documentation sites, local files
  • 📊 Rich Reporting: Browser-integrated reports with JSON/CSV export
  • 🚀 Dual Analysis Modes: Basic pattern matching or full LLM analysis via OpenRouter
  • 🔗 Automatic Report Opening: Direct links to versioned analysis results

Installation

# From PyPI
pip install pantsonfire

# From a source checkout (editable install)
pip install -e .

Environment Setup

Create a .env file or set environment variables:

# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here

# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here
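
Because both keys are optional, something has to decide at startup which analysis and storage path to use. The following is a minimal sketch of that fallback, assuming the choice depends only on the two environment variables above. The function name `resolve_backends` and the returned labels are hypothetical and are not pantsonfire's actual internals.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_backends(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Pick analysis and storage backends from environment variables.

    Hypothetical helper for illustration; the labels are assumptions.
    """
    env = os.environ if env is None else env
    # An empty or whitespace-only value is treated as "not set".
    has_llm_key = bool(env.get("OPENROUTER_API_KEY", "").strip())
    has_oxen_key = bool(env.get("OXEN_API_KEY", "").strip())
    return {
        # Without an OpenRouter key, fall back to pattern matching.
        "analysis": "llm" if has_llm_key else "pattern",
        # Without an Oxen key, fall back to local storage.
        "storage": "oxen" if has_oxen_key else "local",
    }


if __name__ == "__main__":
    print(resolve_backends())
```

Passing a plain dict instead of relying on `os.environ` keeps the decision easy to test without touching the real environment.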

🚀 Quick Start

Natural Language Analysis

# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report

Traditional Analysis

# Check specific content
pantsonfire --mode external check \
    "https://blog.example.com/outdated-tutorial" \
    "https://docs.example.com/current-api" \
    --crawl --open-report

📚 Oxen AI Integration

Pantsonfire uses Oxen AI for versioned, traceable data storage:

  • Automatic Repository Creation: Each analysis gets its own Oxen repository
  • Versioned Branches: Findings stored in timestamped branches
  • Complete Traceability: All prompts, content, and metadata preserved
  • Web Interface: Direct links to browse analysis results
  • Collaborative: Multiple analysts can contribute to findings

Storage Structure

your-namespace/
├── analysis_check_20241023_143052/
│   ├── data/
│   │   ├── findings.json
│   │   └── findings.csv
│   ├── reports/
│   │   └── findings.txt
│   ├── sources/
│   │   ├── extracted_content.txt
│   └── metadata/
│       └── analysis_metadata.json
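
The directory name in the tree looks like `analysis_<command>_<YYYYMMDD>_<HHMMSS>`. That pattern is inferred from the single example above and is not a documented format. Under that assumption, a short sketch of how such names and paths could be generated follows; the helper names are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import List


def analysis_dir_name(command: str, when: datetime) -> str:
    """Build a name such as 'analysis_check_20241023_143052' (assumed format)."""
    return f"analysis_{command}_{when.strftime('%Y%m%d_%H%M%S')}"


def analysis_layout(namespace: str, command: str, when: datetime) -> List[str]:
    """Return the file paths listed in the storage tree for one analysis."""
    root = PurePosixPath(namespace) / analysis_dir_name(command, when)
    files = [
        "data/findings.json",
        "data/findings.csv",
        "reports/findings.txt",
        "sources/extracted_content.txt",
        "metadata/analysis_metadata.json",
    ]
    return [str(root / name) for name in files]


if __name__ == "__main__":
    for path in analysis_layout("your-namespace", "check", datetime.now()):
        print(path)
```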

Configuration

  1. Get an OpenRouter API key from openrouter.ai/keys
  2. Set your API key:
export OPENROUTER_API_KEY="your_key_here"

Or create a .env file:

cp .env.example .env
# Edit .env with your API key

Usage

Basic Check

Check a blog post against official documentation:

# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md

# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current
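
When the check is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only flags that appear in this README (`--mode external`, `--hints`, `--crawl`, `--open-report`). The function `build_check_command` is a hypothetical wrapper and not part of pantsonfire.

```python
import shlex
from typing import List, Optional


def build_check_command(
    blog: str,
    docs: str,
    external: bool = False,
    hints: Optional[str] = None,
    crawl: bool = False,
    open_report: bool = False,
) -> List[str]:
    """Assemble an argv list for `pantsonfire check` (hypothetical helper)."""
    cmd = ["pantsonfire"]
    if external:
        # The README places --mode before the subcommand.
        cmd += ["--mode", "external"]
    cmd += ["check", blog, docs]
    if hints:
        cmd += ["--hints", hints]
    if crawl:
        cmd.append("--crawl")
    if open_report:
        cmd.append("--open-report")
    return cmd


if __name__ == "__main__":
    argv = build_check_command(
        "https://blog.example.com/old-post",
        "https://docs.example.com/current",
        external=True,
    )
    # Print a shell-safe rendering; pass `argv` to subprocess.run to execute it.
    print(shlex.join(argv))
```

Building a list rather than a string avoids quoting problems when hints contain spaces or quotes.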

View Results

# View recent detections
pantsonfire logs

# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv

Configuration Check

# Test LLM connection
pantsonfire config --test

# View current config
pantsonfire config

Real-World Example: Oxen AI Documentation Analysis

pantsonfire successfully identified outdated "Get Early Access" references across Oxen AI's website. See oxen-ai-example.md for a complete demonstration.
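
The basic pattern-matching idea behind a finding like this can be illustrated in a few lines: flag a phrase that still appears in the blog text but no longer appears in the official docs. This is a simplified sketch under that assumption, not pantsonfire's actual detection code, and `stale_phrases` is a hypothetical name.

```python
import re
from typing import Iterable, List


def _mentions(text: str, phrase: str) -> bool:
    """Case-insensitive search that tolerates extra whitespace between words."""
    pattern = r"\s+".join(re.escape(word) for word in phrase.split())
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def stale_phrases(blog_text: str, docs_text: str, phrases: Iterable[str]) -> List[str]:
    """Return the phrases found in the blog text but absent from the docs text."""
    return [
        phrase
        for phrase in phrases
        if _mentions(blog_text, phrase) and not _mentions(docs_text, phrase)
    ]


if __name__ == "__main__":
    blog = "Click Get Early Access to try fine-tuning."
    docs = "Kick off a fine-tune from the dashboard."
    print(stale_phrases(blog, docs, ["Get Early Access"]))
```

A phrase-level check like this is cheap but literal, which is why the README pairs it with LLM analysis for anything that needs context.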

Contextual Hints

Provide natural language hints to guide the LLM analysis:

pantsonfire check "blog-url" "docs-url" --hints "the beta program ended in 2024 and docs now show the production API"

This helps the LLM focus on specific types of changes you're looking for.

Natural Language Analysis

pantsonfire analyze "the oxen website has outdated get early access buttons for fine tuning, find all similar issues on their site" --openrouter --crawl --open-report

Direct URL Analysis

pantsonfire check "https://www.oxen.ai/entry/fine-tuning-a-with-oxen-ai" \
  "https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune" \
  "https://github.com/Oxen-AI/Oxen" \
  --hints "the early access program is done and the api docs show the ground truth today" \
  --openrouter --open-report

Example Output

🔥 ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35
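
If you want to post-process this text output, the `Key: value` layout is easy to parse. The sketch below assumes every field sits on its own line in exactly the form shown above. `parse_issue` is a hypothetical helper, and the JSON export is likely a better source for anything serious.

```python
from typing import Dict, Union


def parse_issue(block: str) -> Dict[str, Union[str, float]]:
    """Parse one issue block of 'Key: value' lines into a dict.

    Lines without ': ' (such as the 'ISSUE #1' header) are skipped.
    """
    issue: Dict[str, Union[str, float]] = {}
    for line in block.splitlines():
        # Split on the first ': ' only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(": ")
        if not sep:
            continue
        issue[key.strip().lower()] = value.strip()
    if "confidence" in issue:
        issue["confidence"] = float(issue["confidence"])
    return issue


if __name__ == "__main__":
    sample = "\n".join(
        [
            "ISSUE #1",
            "Blog: unknown",
            "Confidence: 0.90",
            "Problem: References 'Get Early Access' which appears to be outdated",
            "Time: 2025-10-23T22:52:35",
        ]
    )
    print(parse_issue(sample))
```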

Architecture

  • Factory Pattern: Simple app creation with mode switching
  • Modular Extractors: Separate handling for local vs web content
  • LLM Integration: Structured prompts for factual verification
  • Storage Backends: Extensible result storage (JSON default)
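
As an illustration of how the factory and the extractors could fit together, here is a small sketch. The class and function names (`LocalExtractor`, `WebExtractor`, `App`, `create_app`) are hypothetical and do not reflect pantsonfire's real module layout. Only the `internal` and `external` mode names come from the usage examples.

```python
from pathlib import Path
from typing import Protocol, Tuple
from urllib.request import urlopen


class Extractor(Protocol):
    """Anything that can turn a source reference into text."""

    def extract(self, source: str) -> str: ...


class LocalExtractor:
    """Reads content from a local file (internal mode)."""

    def extract(self, source: str) -> str:
        return Path(source).read_text(encoding="utf-8")


class WebExtractor:
    """Fetches raw content from a URL (external mode)."""

    def extract(self, source: str) -> str:
        with urlopen(source, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")


class App:
    """Holds the extractor chosen for the current mode."""

    def __init__(self, extractor: Extractor) -> None:
        self.extractor = extractor

    def load(self, blog: str, docs: str) -> Tuple[str, str]:
        return self.extractor.extract(blog), self.extractor.extract(docs)


def create_app(mode: str = "internal") -> App:
    """Factory: map a mode name to an App with the matching extractor."""
    extractors = {"internal": LocalExtractor, "external": WebExtractor}
    if mode not in extractors:
        raise ValueError(f"unknown mode: {mode!r}")
    return App(extractors[mode]())
```

Keeping the mode lookup in one factory means a new source type only needs a new extractor class and one extra dictionary entry.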

Development

Run tests:

python tests/test_sample.py

License

MIT

Download files

Source Distribution

pantsonfire-0.1.3.tar.gz (33.6 kB)

Uploaded Source

Built Distribution

pantsonfire-0.1.3-py3-none-any.whl (35.6 kB)

Uploaded Python 3

File details

Details for the file pantsonfire-0.1.3.tar.gz.

File metadata

  • Download URL: pantsonfire-0.1.3.tar.gz
  • Size: 33.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.1.3.tar.gz
Algorithm    Hash digest
SHA256       47690425a8f50eadd8c08afd5d9a87c313ef45f5cc4e6037f75ad8dc898b68fc
MD5          283c033dba307f9892c0ed1dafad09e9
BLAKE2b-256  e6161db7f5a86a86331b1eea06e7eae9049bf99528c1851c67cca406ccc4793f
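
To check a downloaded file against the SHA256 digest listed above, a short standard-library sketch is enough. The helper names `sha256_of` and `matches` are illustrative, and the file path in the usage block is an assumption about where you saved the download.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


if __name__ == "__main__":
    expected = "47690425a8f50eadd8c08afd5d9a87c313ef45f5cc4e6037f75ad8dc898b68fc"
    archive = "pantsonfire-0.1.3.tar.gz"  # assumed local download path
    if Path(archive).exists():
        print("OK" if matches(archive, expected) else "MISMATCH")
```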

File details

Details for the file pantsonfire-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: pantsonfire-0.1.3-py3-none-any.whl
  • Size: 35.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.1.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       39d1764d0c461680119be2b2472ca617e39a32f0d6d30dc1d35863a83009ab0e
MD5          5379d6e1898f28a9cccbd230afdba101
BLAKE2b-256  869f9d53833f493690fda6dc9f0fea603b3998adf493008ba4c37e4a6a4ffab8
