Find wrong information in technical docs online
pantsonfire 🔥
Find wrong information in technical documentation online. pantsonfire is a tool for detecting outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing them against official documentation.
✨ Key Features
- Natural Language Analysis: Use simple English commands like "find outdated API info on tech blogs"
- Intelligent Web Crawling: Automatically discover similar issues across entire websites
- Oxen AI Integration: Versioned, traceable storage with complete audit trails
- Multi-Level Detection: Pattern matching + AI-powered analysis for comprehensive coverage
- Universal Sources: Websites, GitHub repos, documentation sites, local files
- Rich Reporting: Browser-integrated reports with JSON/CSV export
- Dual Analysis Modes: Basic pattern matching or full LLM analysis via OpenRouter
- Automatic Report Opening: Direct links to versioned analysis results
Installation
pip install -e .
Environment Setup
Create a .env file or set environment variables:
# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here
# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here
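The fallback behaviour described in the comments above can be sketched in Python. This is a minimal illustration, not pantsonfire's actual code: the helper name `resolve_backends` and the returned labels are assumptions, and only the two environment variable names come from this README.

```python
import os
from typing import Mapping, Optional


def resolve_backends(env: Optional[Mapping[str, str]] = None) -> dict:
    """Pick analysis and storage backends from environment variables.

    Hypothetical helper for illustration; not part of pantsonfire's API.
    """
    env = os.environ if env is None else env
    # A missing or blank key is treated as "not set".
    has_llm = bool(env.get("OPENROUTER_API_KEY", "").strip())
    has_oxen = bool(env.get("OXEN_API_KEY", "").strip())
    return {
        "analysis": "llm" if has_llm else "pattern",
        "storage": "oxen" if has_oxen else "local",
    }
```

Calling `resolve_backends()` with no arguments reads the real process environment; passing a plain dict makes the decision easy to check in isolation.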
Quick Start
Natural Language Analysis
# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report
Traditional Analysis
# Check specific content
pantsonfire --mode external check \
  "https://blog.example.com/outdated-tutorial" \
  "https://docs.example.com/current-api" \
  --crawl --open-report
Oxen AI Integration
pantsonfire uses Oxen AI for versioned, traceable data storage:
- Automatic Repository Creation: Each analysis gets its own Oxen repository
- Versioned Branches: Findings stored in timestamped branches
- Complete Traceability: All prompts, content, and metadata preserved
- Web Interface: Direct links to browse analysis results
- Collaborative: Multiple analysts can contribute to findings
Storage Structure
your-namespace/
├── analysis_check_20241023_143052/
│   ├── data/
│   │   ├── findings.json
│   │   └── findings.csv
│   ├── reports/
│   │   └── findings.txt
│   ├── sources/
│   │   └── extracted_content.txt
│   └── metadata/
│       └── analysis_metadata.json
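As a rough sketch of how the layout above could be derived for a given run, the snippet below builds the expected paths. The function name `analysis_paths` is hypothetical, and the directory naming pattern (`analysis_<command>_<YYYYMMDD>_<HHMMSS>`) is inferred from the single example shown, so treat it as an assumption.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Folder and file names copied from the tree above; the helper is hypothetical.
LAYOUT = {
    "data": ["findings.json", "findings.csv"],
    "reports": ["findings.txt"],
    "sources": ["extracted_content.txt"],
    "metadata": ["analysis_metadata.json"],
}


def analysis_paths(namespace: str, command: str, when: datetime) -> list:
    """Return the expected file paths for one analysis run."""
    # Assumed naming pattern, inferred from "analysis_check_20241023_143052".
    run_dir = f"analysis_{command}_{when:%Y%m%d_%H%M%S}"
    root = PurePosixPath(namespace) / run_dir
    return [
        str(root / folder / name)
        for folder, names in LAYOUT.items()
        for name in names
    ]
```

For example, `analysis_paths("your-namespace", "check", datetime(2024, 10, 23, 14, 30, 52))` yields five paths, starting with the two files under `data/`.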
Configuration
- Get an OpenRouter API key from openrouter.ai/keys
- Set your API key:
export OPENROUTER_API_KEY="your_key_here"
Or create a .env file:
cp .env.example .env
# Edit .env with your API key
Usage
Basic Check
Check a blog post against official documentation:
# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md
# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current
View Results
# View recent detections
pantsonfire logs
# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv
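To make the two export formats concrete, here is a small sketch that serializes a list of findings to JSON and CSV. The field names are borrowed from the example output in this README; the real export schema is not documented here and may differ, and `to_json` / `to_csv` are illustrative names rather than pantsonfire functions.

```python
import csv
import io
import json

# Assumed field names, taken from the example output in this README.
FIELDS = ["blog", "truth", "confidence", "problem", "evidence", "time"]


def to_json(findings: list) -> str:
    """Serialize findings as an indented JSON array."""
    return json.dumps(findings, indent=2)


def to_csv(findings: list) -> str:
    """Serialize findings as CSV with a header row; unknown keys are ignored."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(findings)
    return buffer.getvalue()
```

JSON keeps value types such as the numeric confidence, while CSV flattens everything to text, which is usually the deciding factor between the two.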
Configuration
# Test LLM connection
pantsonfire config --test
# View current config
pantsonfire config
Real-World Example: Oxen AI Documentation Analysis
pantsonfire successfully identified outdated "Get Early Access" references across Oxen AI's website. See oxen-ai-example.md for a complete demonstration.
Contextual Hints
Provide natural language hints to guide the LLM analysis:
pantsonfire check "blog-url" "docs-url" --hints "the beta program ended in 2024 and docs now show the production API"
This helps the LLM focus on specific types of changes you're looking for.
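One plausible way hints could be folded into the LLM request is sketched below. The prompt wording and the `build_prompt` function are assumptions made for illustration; the README does not show the prompts pantsonfire actually sends.

```python
from typing import Optional


def build_prompt(blog_text: str, docs_text: str, hints: Optional[str] = None) -> str:
    """Assemble a verification prompt; hints are included only when provided."""
    sections = [
        "Compare the blog content against the official documentation.",
        "List claims in the blog that the documentation contradicts or no longer supports.",
    ]
    if hints and hints.strip():
        # Hints go before the content so the model reads them first.
        sections.append(f"Reviewer hints: {hints.strip()}")
    sections.append(f"--- BLOG ---\n{blog_text}")
    sections.append(f"--- OFFICIAL DOCS ---\n{docs_text}")
    return "\n\n".join(sections)
```

Keeping the hints in their own labeled section makes it easy to see, when reviewing stored prompts, which guidance came from the user and which from the tool.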
Natural Language Analysis
pantsonfire analyze "the oxen website has outdated get early access buttons for fine tuning, find all similar issues on their site" --openrouter --crawl --open-report
Direct URL Analysis
pantsonfire check "https://www.oxen.ai/entry/fine-tuning-a-with-oxen-ai" \
"https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune" \
"https://github.com/Oxen-AI/Oxen" \
--hints "the early access program is done and the api docs show the ground truth today" \
--openrouter --open-report
Example Output
🔥 ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35
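A finding like the one above can be approximated with a basic pattern check: flag a phrase that appears in the blog but not in the official docs. The sketch below assumes a caller-supplied phrase list; `Finding` and `find_stale_phrases` are hypothetical names, and this is not necessarily how pantsonfire's own pattern matching works.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    phrase: str
    problem: str
    evidence: str


def find_stale_phrases(blog_text: str, docs_text: str, phrases: list) -> list:
    """Flag phrases that appear in the blog but not in the official docs."""
    findings = []
    for phrase in phrases:
        # Case-insensitive literal match; re.escape guards special characters.
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        if pattern.search(blog_text) and not pattern.search(docs_text):
            findings.append(Finding(
                phrase=phrase,
                problem=f"References '{phrase}' which appears to be outdated",
                evidence=f"Official documentation no longer mentions '{phrase}'",
            ))
    return findings
```

Absence from the docs is only weak evidence that a phrase is outdated, which is one reason to pair a check like this with LLM analysis and a confidence score.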
Architecture
- Factory Pattern: Simple app creation with mode switching
- Modular Extractors: Separate handling for local vs web content
- LLM Integration: Structured prompts for factual verification
- Storage Backends: Extensible result storage (JSON default)
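The factory and extractor bullets above might look roughly like this in code. All class and function names here (`LocalExtractor`, `WebExtractor`, `App`, `create_app`) are hypothetical; only the `internal` and `external` mode names come from the usage section of this README.

```python
from pathlib import Path
from urllib.request import urlopen


class LocalExtractor:
    """Reads content from a file on disk (internal mode)."""

    def extract(self, source: str) -> str:
        return Path(source).read_text(encoding="utf-8")


class WebExtractor:
    """Fetches content over HTTP (external mode)."""

    def extract(self, source: str) -> str:
        with urlopen(source, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")


class App:
    def __init__(self, mode: str, extractor) -> None:
        self.mode = mode
        self.extractor = extractor


_EXTRACTORS = {"internal": LocalExtractor, "external": WebExtractor}


def create_app(mode: str = "internal") -> App:
    """Factory: build an App wired to the extractor for the given mode."""
    try:
        extractor_cls = _EXTRACTORS[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
    return App(mode, extractor_cls())
```

Because both extractors expose the same `extract` method, the rest of the application does not need to know whether content came from disk or the web.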
Development
Run tests:
python tests/test_sample.py
License
MIT
Download files
File details
Details for the file pantsonfire-0.3.7.tar.gz.
File metadata
- Download URL: pantsonfire-0.3.7.tar.gz
- Upload date:
- Size: 60.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d4ddfc2c73e59bba75723cdfdefec9394bc40336c99b2d418a6588c4f5cbe44b |
| MD5 | 59301363eb7042bda8575c23e92c32bb |
| BLAKE2b-256 | 105cf12dd8319c8f361535176dfec06607cdae4d993bee0a5ad545ba51c95f3d |
File details
Details for the file pantsonfire-0.3.7-py3-none-any.whl.
File metadata
- Download URL: pantsonfire-0.3.7-py3-none-any.whl
- Upload date:
- Size: 61.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9a3d54ed2e889ae5ab57a7b890847371762467ba45ba7ebd1104965f046b16ba |
| MD5 | 7f4719a38ae74763bf62d6b2046da58b |
| BLAKE2b-256 | b68b7314d2ffd62183d2d1bf7b39260b149e96758e154f2ea6bc3ab42edc3105 |