Detect AI-generated text using GPT-2 token prediction analysis

findllm

Detect AI-generated text by analyzing token prediction patterns using GPT-2.

Installation

# Run directly with uv
uvx findllm document.md

# or install into your environment
pip install findllm

Usage

# Analyze a file (sentence mode by default)
findllm document.md

# Use token-level analysis
findllm document.md --mode token

# Chunk sentences into smaller batches (e.g., 10 tokens per chunk)
findllm document.md --max-sentence-tokens 10

# Use different aggregation methods (mean, max, l2, rmse, median)
findllm document.md --aggregation max

# JSON output for programmatic use
findllm document.md --json
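
The chunking and aggregation options above can be illustrated with a minimal sketch. It is not findllm's actual implementation: the helper names `chunk` and `aggregate` are hypothetical, the scores are made-up per-token probabilities, and the definitions of `l2` (square root of the sum of squares) and `rmse` (square root of the mean of squares) are assumptions about what those option names mean.

```python
import math
import statistics


def chunk(scores, size):
    """Split per-token scores into consecutive chunks of at most `size` items."""
    return [scores[i:i + size] for i in range(0, len(scores), size)]


def aggregate(scores, method="mean"):
    """Reduce a list of scores to one number (assumed definitions, see lead-in)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    if method == "mean":
        return statistics.fmean(scores)
    if method == "max":
        return max(scores)
    if method == "median":
        return statistics.median(scores)
    if method == "l2":
        # Assumed: Euclidean norm of the score vector.
        return math.sqrt(sum(s * s for s in scores))
    if method == "rmse":
        # Assumed: root of the mean squared score.
        return math.sqrt(sum(s * s for s in scores) / len(scores))
    raise ValueError(f"unknown aggregation method: {method}")


# Hypothetical per-token probabilities for one sentence.
token_probs = [0.9, 0.1, 0.5, 0.7, 0.3]
for part in chunk(token_probs, 2):
    print(part, aggregate(part, "max"))
```

Under these assumptions, `max` flags a chunk as soon as one token is highly predictable, while `mean` and `median` smooth over isolated spikes.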

How It Works

The tool analyzes how predictable each token in the text is according to GPT-2:

  • Green = Low probability (unpredictable, human-like)
  • Yellow = Moderate probability
  • Orange = Higher probability
  • Red = High probability (predictable, AI-like)

AI-generated text tends to follow predictable patterns that language models can easily anticipate. Human writing is more variable and surprising.
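
As a rough illustration of the color scale, the sketch below maps a token's predicted probability to one of the four colors. The threshold values (0.1, 0.4, 0.7) and the function name `color_for` are assumptions chosen for the example, not findllm's actual cutoffs, and the probabilities are invented rather than produced by GPT-2.

```python
def color_for(prob, thresholds=(0.1, 0.4, 0.7)):
    """Map a token probability to a color bucket using assumed cutoffs."""
    low, mid, high = thresholds
    if prob < low:
        return "green"   # unpredictable, human-like
    if prob < mid:
        return "yellow"  # moderate probability
    if prob < high:
        return "orange"  # higher probability
    return "red"         # predictable, AI-like


# Hypothetical tokens with invented probabilities.
tokens = ["The", " cat", " sat", " quietly"]
probs = [0.82, 0.05, 0.45, 0.2]
labelled = [(tok, color_for(p)) for tok, p in zip(tokens, probs)]
print(labelled)
```

In a real run the probabilities would come from the model's next-token distribution at each position; only the bucketing step is shown here.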

Metrics

  • Perplexity: How "surprised" the model is overall (lower = more predictable)
  • Burstiness: Variation in complexity (humans tend to write with more variation)
  • Color Distribution: Percentage of green/yellow/orange/red tokens
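
The metrics above can be sketched in a few lines. This is an illustration under stated assumptions rather than findllm's code: perplexity uses the standard definition (the exponential of the mean negative log-probability), burstiness is assumed here to be the standard deviation of per-sentence perplexity, and all function names are hypothetical.

```python
import math
import statistics
from collections import Counter


def perplexity(probs):
    """exp of the average negative log-probability of the tokens."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def burstiness(sentence_probs):
    """Assumed definition: population std dev of per-sentence perplexity."""
    return statistics.pstdev(perplexity(p) for p in sentence_probs)


def color_distribution(colors):
    """Percentage of tokens falling in each color bucket."""
    counts = Counter(colors)
    total = len(colors)
    return {c: 100.0 * counts[c] / total
            for c in ("green", "yellow", "orange", "red")}


# Invented token probabilities for two sentences.
sentences = [[0.5, 0.5], [0.9, 0.1, 0.3]]
print(perplexity(sentences[0]))
print(burstiness(sentences))
print(color_distribution(["green", "red", "red", "red"]))
```

With this definition, text whose sentences are all about equally predictable scores a burstiness near zero, which matches the claim that human writing tends to show more variation.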

Download files

Source Distribution

findllm-0.1.0.tar.gz (8.8 kB)

Uploaded Source

Built Distribution

findllm-0.1.0-py3-none-any.whl (9.8 kB)

Uploaded Python 3

File details

Details for the file findllm-0.1.0.tar.gz.

File metadata

  • Download URL: findllm-0.1.0.tar.gz
  • Upload date:
  • Size: 8.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.2

File hashes

Hashes for findllm-0.1.0.tar.gz
  • SHA256: 33c494a7d3a8068dd49627c2e336ce17ce7c2751afe3449ab3ad91e91ec4d650
  • MD5: 41d2a5062e98b8b98519e65bbdc1cfb4
  • BLAKE2b-256: c1bc18adc33a115524cbee8a0d6aa030a523beecd848ce1751e8615ffe8ad2aa

File details

Details for the file findllm-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: findllm-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.2

File hashes

Hashes for findllm-0.1.0-py3-none-any.whl
  • SHA256: 11522941e288b8f190acfd250cea2ef20dd9118d4e03ad700f08273f10d94b4e
  • MD5: 91ad2664b45970b8cd162bbf3534e0cc
  • BLAKE2b-256: c1f99bd5317a2b2aef0c884dfdf50ba4f7cbbe3b37aba4c68b538e3087d8f5c9
