Skip to main content

Find wrong information in technical docs online

Project description

pantsonfire ๐Ÿ”ฅ

Find wrong information in technical documentation online. A tool for detecting outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing against official documentation.

โœจ Key Features

  • ๐Ÿง  Natural Language Analysis: Use simple English commands like "find outdated API info on tech blogs"
  • ๐Ÿ•ท๏ธ Intelligent Web Crawling: Automatically discover similar issues across entire websites
  • ๐Ÿ“š Oxen AI Integration: Versioned, traceable storage with complete audit trails
  • ๐Ÿ” Multi-Level Detection: Pattern matching + AI-powered analysis for comprehensive coverage
  • ๐ŸŒ Universal Sources: Websites, GitHub repos, documentation sites, local files
  • ๐Ÿ“Š Rich Reporting: Browser-integrated reports with JSON/CSV export
  • ๐Ÿš€ Dual Analysis Modes: Basic pattern matching or full LLM analysis via OpenRouter
  • ๐Ÿ”— Automatic Report Opening: Direct links to versioned analysis results

Installation

pip install -e .

Environment Setup

Create a .env file or set environment variables:

# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here

# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here

๐Ÿš€ Quick Start

Natural Language Analysis

# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report

Traditional Analysis

# Check specific content
pantsonfire --mode external check 
    "https://blog.example.com/outdated-tutorial" 
    "https://docs.example.com/current-api" 
    --crawl --open-report

๐Ÿ“š Oxen AI Integration

Pantsonfire uses Oxen AI for versioned, traceable data storage:

  • Automatic Repository Creation: Each analysis gets its own Oxen repository
  • Versioned Branches: Findings stored in timestamped branches
  • Complete Traceability: All prompts, content, and metadata preserved
  • Web Interface: Direct links to browse analysis results
  • Collaborative: Multiple analysts can contribute to findings

Storage Structure

your-namespace/
โ”œโ”€โ”€ analysis_check_20241023_143052/
โ”‚   โ”œโ”€โ”€ data/
โ”‚   โ”‚   โ”œโ”€โ”€ findings.json
โ”‚   โ”‚   โ””โ”€โ”€ findings.csv
โ”‚   โ”œโ”€โ”€ reports/
โ”‚   โ”‚   โ””โ”€โ”€ findings.txt
โ”‚   โ”œโ”€โ”€ sources/
โ”‚   โ”‚   โ”œโ”€โ”€ extracted_content.txt
โ”‚   โ””โ”€โ”€ metadata/
โ”‚       โ””โ”€โ”€ analysis_metadata.json

Configuration

  1. Get an OpenRouter API key from openrouter.ai/keys
  2. Set your API key:
export OPENROUTER_API_KEY="your_key_here"

Or create a .env file:

cp .env.example .env
# Edit .env with your API key

Usage

Basic Check

Check a blog post against official documentation:

# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md

# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current

View Results

# View recent detections
pantsonfire logs

# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv

Configuration

# Test LLM connection
pantsonfire config --test

# View current config
pantsonfire config

Example Output

๐Ÿ”ฅ ISSUE #1
Blog: https://blog.example.com/requests-tutorial
Truth: https://docs.python-requests.org/current/
Confidence: 0.89
Problem: Blog mentions deprecated /v1/ API endpoints
Evidence: Official docs state /v1/ endpoints deprecated since v2.28.0, use /v2/ instead
Time: 2025-10-23T10:30:00

Architecture

  • Factory Pattern: Simple app creation with mode switching
  • Modular Extractors: Separate handling for local vs web content
  • LLM Integration: Structured prompts for factual verification
  • Storage Backends: Extensible result storage (JSON default)

Development

Run tests:

python tests/test_sample.py

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pantsonfire-0.1.1.tar.gz (32.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pantsonfire-0.1.1-py3-none-any.whl (34.4 kB view details)

Uploaded Python 3

File details

Details for the file pantsonfire-0.1.1.tar.gz.

File metadata

  • Download URL: pantsonfire-0.1.1.tar.gz
  • Upload date:
  • Size: 32.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.1.1.tar.gz
Algorithm Hash digest
SHA256 0becd36873c0c2d2e6dc4996e95050c4927fa7d6fc825bb50d70763ecbae027c
MD5 f4276780a5d8f1890bc5710a7bb4f77e
BLAKE2b-256 ec5492a7c882b4100e7d685adeaaf75330648a755c146f134c43774388759909

See more details on using hashes here.

File details

Details for the file pantsonfire-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pantsonfire-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 34.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for pantsonfire-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ce5343cbfbd9c302df855e2a30e218be505397a45096153d4fa1c558fd67a376
MD5 92cc0588405e62953161ee5809b68899
BLAKE2b-256 4e042daf603639f45e0cb15fa775701ea75ba30612fb47b18c65ab21c6738b50

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page