LLM-solvable challenge-response authentication for AI agent APIs

🧩 agent-challenge

Drop-in LLM authentication for any API endpoint.


Why?

You built an API. Now bots are hitting it: not the smart kind, the dumb kind. Automated scripts cycling through endpoints, low-effort crawlers scraping your data, or spammy throwaway clients burning through your resources.

Traditional CAPTCHAs block everyone who isn't a human sitting in a browser. API keys work, but they require manual signup, email verification, and approval flows: friction that kills adoption for legitimate AI agents.

agent-challenge sits in the middle: it blocks automated scripts and low-capability bots while letting any competent LLM walk right through. The challenge requires actual reasoning (reversing strings, solving arithmetic, decoding ciphers), things that a real language model handles instantly but a curl loop or a Python script with requests.post() can't fake.

Think of it as a proof of intelligence gate:

  • ✅ GPT-4, Claude, Gemini, Llama: pass instantly
  • ✅ Any capable LLM-powered agent: solves in one shot
  • ❌ Automated scripts: can't reason about the prompt
  • ❌ Spammy low-effort bots: can't parse randomized templates
  • ❌ Dumb wrappers just forwarding requests: no LLM to solve with

It's the ultimate automated-script buster. If the other end of your API can't do basic thinking, it doesn't get in. This is "prove you ARE a robot", not "prove you're not a robot"!

# Before: unprotected endpoint
@app.route("/api/screenshots", methods=["POST"])
def screenshot():
    return take_screenshot(request.json["url"])

# After: agents solve a puzzle once, pass through forever
@app.route("/api/screenshots", methods=["POST"])
def screenshot():
    result = ac.gate(
        token=request.headers.get("Authorization", "").removeprefix("Bearer ") or None,
        challenge_token=request.json.get("challenge_token"),
        answer=request.json.get("answer"),
    )
    if result.status != "authenticated":
        return jsonify(result.to_dict()), 401
    return take_screenshot(request.json["url"])

How It Works

Agent                          Your API
  │                               │
  ├──POST /api/your-endpoint────►│
  │                               ├── gate() → no token
  │◄──401 { challenge_required }──┤
  │                               │
  │  LLM reads prompt, answers    │
  │                               │
  ├──POST { answer, token }─────►│
  │                               ├── gate() → correct!
  │◄──200 { token: "at_7f3..." }──┤
  │                               │
  │  ┌─────────────────────┐      │
  │  │ Saves token forever │      │
  │  └─────────────────────┘      │
  │                               │
  ├──POST + Bearer at_7f3...────►│
  │                               ├── gate() → valid token
  │◄──200 { authenticated }───────┤   (instant, no puzzle)

One endpoint. Three interactions. Zero database.

Install

pip install agent-challenge
npm install agent-challenge

Quick Start

Python (Flask)

from flask import Flask, jsonify, request
from agentchallenge import AgentChallenge

app = Flask(__name__)
ac = AgentChallenge(secret="your-secret-key-min-8-chars")

@app.route("/api/data", methods=["POST"])
def protected_endpoint():
    body = request.get_json(silent=True) or {}  # tolerate missing or non-JSON bodies
    result = ac.gate(
        token=request.headers.get("Authorization", "").removeprefix("Bearer ") or None,
        challenge_token=body.get("challenge_token"),
        answer=body.get("answer"),
    )
    if result.status != "authenticated":
        return jsonify(result.to_dict()), 401

    # Your logic here: the agent is verified
    return jsonify({"data": "secret stuff"})
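
The `removeprefix("Bearer ") or None` idiom above needs Python 3.9 or later, and it passes any header that lacks the prefix through unchanged. A minimal sketch of a stricter parser follows; `extract_bearer` is a hypothetical helper name, not part of agent-challenge.

```python
from typing import Optional


def extract_bearer(header: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header value.

    Hypothetical helper, not part of agent-challenge. Returns None when the
    header is missing, uses another scheme, or carries an empty token.
    """
    if not header:
        return None
    scheme, _, value = header.partition(" ")
    if scheme.lower() != "bearer":
        return None
    value = value.strip()
    return value or None
```

It could then be passed to the gate as `token=extract_bearer(request.headers.get("Authorization"))`.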

Node.js (Express)

import express from 'express';
import { AgentChallenge } from 'agent-challenge';

const app = express();
app.use(express.json()); // parse JSON bodies so req.body is populated

const ac = new AgentChallenge({ secret: 'your-secret-key-min-8-chars' });

app.post('/api/data', (req, res) => {
  const gate = ac.gateSync({
    token: req.headers.authorization?.slice(7), // drop the "Bearer " prefix
    challengeToken: req.body?.challenge_token,
    answer: req.body?.answer,
  });
  if (gate.status !== 'authenticated')
    return res.status(401).json(gate);

  // Your logic here: the agent is verified
  res.json({ data: 'secret stuff' });
});

The gate() API

One function handles everything. Three modes based on what's passed in:

| Arguments | Behavior | Returns |
| --- | --- | --- |
| (none) | Generate a new challenge | { status: "challenge_required", prompt, challenge_token } |
| challenge_token + answer | Verify answer, issue permanent token | { status: "authenticated", token: "at_..." } |
| token | Validate saved token | { status: "authenticated" } |

# Mode 1: No args → challenge
result = ac.gate()
# → GateResult(status="challenge_required", prompt="Reverse: NOHTYP", ...)

# Mode 2: Answer → permanent token
result = ac.gate(challenge_token="eyJ...", answer="PYTHON")
# → GateResult(status="authenticated", token="at_7f3b...")

# Mode 3: Token → instant pass
result = ac.gate(token="at_7f3b...")
# → GateResult(status="authenticated")

Challenge Types

12 challenge types across 3 difficulty tiers. All use randomized inputs, with no fixed word lists.

Easy

| Type | Example |
| --- | --- |
| reverse_string | Reverse "PYTHON" → NOHTYP |
| simple_math | 234 + 567 = 801 |
| pattern | 2, 4, 8, 16, ? → 32 |
| counting | Count vowels in "CHALLENGE" → 3 |

Medium

| Type | Example |
| --- | --- |
| rot13 | Decode "URYYB" → HELLO |
| letter_position | A=1,B=2.. sum of "CAT" → 24 |
| extract_letters | Every 2nd char of "HWEOLRLLOD" → WORLD |
| sorting | Sort [7,2,9,1] ascending → 1,2,7,9 |
| binary | Convert 42 to binary → 101010 |

Hard

| Type | Example |
| --- | --- |
| caesar | Decrypt "KHOOR" with shift 3 → HELLO |
| word_math | 7 + 8 as a word → fifteen |
| transform | Uppercase + reverse "hello" → OLLEH |

Each type has 3 to 8 prompt templates with randomized phrasing, making regex-based solvers impractical.
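
The example answers listed for the challenge types can be checked with a few lines of standard-library Python. This is only a sanity check of those examples, not the library's generator or verifier; `caesar_decrypt` and `letter_sum` are hypothetical helper names.

```python
import codecs


def caesar_decrypt(text: str, shift: int) -> str:
    """Shift each letter back by `shift` positions, wrapping around at A."""
    return "".join(
        chr((ord(c) - ord("A") - shift) % 26 + ord("A")) if c.isalpha() else c
        for c in text.upper()
    )


def letter_sum(word: str) -> int:
    """Sum letter positions with A=1, B=2, ..., Z=26."""
    return sum(ord(c) - ord("A") + 1 for c in word.upper() if c.isalpha())


examples = {
    "reverse_string": "PYTHON"[::-1],
    "counting": sum(c in "AEIOU" for c in "CHALLENGE"),
    "rot13": codecs.decode("URYYB", "rot13"),
    "letter_position": letter_sum("CAT"),
    "extract_letters": "HWEOLRLLOD"[1::2],  # every 2nd character
    "binary": format(42, "b"),
    "caesar": caesar_decrypt("KHOOR", 3),
    "transform": "hello".upper()[::-1],
}

for name, answer in examples.items():
    print(f"{name}: {answer}")
```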

Dynamic Challenges (Optional)

Use an LLM to generate novel, never-before-seen challenges:

ac = AgentChallenge(
    secret="your-secret",
    dynamic=True,  # Requires OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY
)

Dynamic mode generates a challenge with one LLM call and verifies the answer with another. It falls back to static challenges after 3 failures. It supports OpenAI, Anthropic, and Google Gemini, auto-detected from environment variables.
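
The fallback behavior can be pictured with a small sketch. It is not the library's implementation: `make_challenge` and `static_challenge` are hypothetical names, and the sketch assumes the three failures are counted within a single request, which the description above does not specify.

```python
import random
from typing import Callable, Tuple

Challenge = Tuple[str, str]  # (prompt, expected_answer)


def static_challenge(rng: random.Random) -> Challenge:
    """Built-in fallback: a reverse-string puzzle over random letters."""
    word = "".join(rng.choice("ABCDEFGHIJKLMNOPQRSTUVWXYZ") for _ in range(6))
    return f"Reverse the following string: {word}", word[::-1]


def make_challenge(
    dynamic: Callable[[], Challenge],
    rng: random.Random,
    max_failures: int = 3,
) -> Challenge:
    """Try the dynamic generator up to `max_failures` times, then fall back."""
    for _ in range(max_failures):
        try:
            return dynamic()
        except Exception:
            continue  # e.g. a provider timeout or malformed LLM output
    return static_challenge(rng)
```

Here `dynamic` stands in for whatever function calls the LLM provider; any exception it raises counts as one failure.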

Challenge Every Time (No Persistent Tokens)

By default, agents solve once and get a permanent token. To require a challenge on every request:

ac = AgentChallenge(
    secret="your-secret",
    persistent=False,  # No tokens issued; challenge every time
)

When persistent=False:

  • Solving a challenge returns { "status": "authenticated" } with no token
  • Passing a saved token returns an error
  • Every request requires solving a new puzzle

This is useful for high-security endpoints, rate-limited operations, or when you want proof of LLM capability on every call.

Configuration

ac = AgentChallenge(
    secret="your-secret",       # Required: HMAC signing key (min 8 chars)
    difficulty="medium",        # "easy" | "medium" | "hard" (default: "medium")
    ttl=300,                    # Challenge expiry in seconds (default: 300)
    types=["rot13", "caesar"],  # Restrict to specific challenge types
    persistent=True,            # Issue permanent tokens (default: True)
    dynamic=False,              # Enable LLM-generated challenges
)

Token Architecture

Stateless. No database. No session store.

Tokens are HMAC-SHA256 signed JSON payloads:

base64url(payload).HMAC-SHA256(payload, secret)

Two token types:

| Token | Prefix | Lifetime | Contains |
| --- | --- | --- | --- |
| Challenge | ch_ | 5 minutes | answer hash, expiry, type |
| Agent | at_ | Permanent | agent ID, created timestamp |

  • Tokens can't be forged: HMAC verification catches any tampering
  • Challenge tokens are single-use: answer hash prevents replay
  • Agent tokens are permanent: verify_token() validates signature only
  • No database lookups: everything is in the token itself
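
A stateless signed token of this shape can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the library's actual wire format: the function names, the `exp` field, the sorted-key JSON encoding, and the choice to sign the base64url body rather than the raw payload are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(body: str, secret: str) -> str:
    """HMAC-SHA256 over the encoded body, keyed with the server secret."""
    digest = hmac.new(secret.encode("utf-8"), body.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def sign_token(payload: dict, secret: str, prefix: str = "") -> str:
    """Serialize the payload and append its signature after a dot."""
    body = _b64url(json.dumps(payload, sort_keys=True).encode("utf-8"))
    return f"{prefix}{body}.{_sign(body, secret)}"


def verify_token(token: str, secret: str, prefix: str = "") -> Optional[dict]:
    """Return the payload if the signature (and expiry, if present) is valid."""
    if not token.startswith(prefix):
        return None
    body, sep, signature = token[len(prefix):].partition(".")
    if not sep:
        return None
    expected = _sign(body, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None
    padded = body + "=" * (-len(body) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    if "exp" in payload and payload["exp"] < time.time():
        return None
    return payload


secret = "your-secret-key-min-8-chars"
agent_token = sign_token({"id": "agent-name", "created": int(time.time())}, secret, prefix="at_")
print(verify_token(agent_token, secret, prefix="at_") is not None)  # True
print(verify_token(agent_token, "another-secret", prefix="at_"))    # None
```

Because the payload travels inside the token and the signature is recomputed on every request, the server needs only the secret to validate it, which is what makes the design database-free.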

Lower-Level API

If you don't want the gate() pattern:

ac = AgentChallenge(secret="your-secret-key")

# Create a challenge
challenge = ac.create()
# challenge.prompt       → "Reverse the following string: NOHTYP"
# challenge.token        → "eyJpZCI6ImNoXz..."
# challenge.to_dict()    → dict for JSON responses

# Verify an answer
result = ac.verify(token=challenge.token, answer="PYTHON")
# result.valid           → True
# result.challenge_type  → "reverse_string"

# Create a persistent agent token directly
token = ac.create_token("agent-name")
# token → "at_eyJpZCI6..."

# Verify a token
ac.verify_token(token)  # → True

Agent Integration

Agents don't need an SDK. They just call your endpoint normally:

import requests

def call_api(payload):
    endpoint = "https://your-api.com/api/data"
    token = load_saved_token()  # from disk/env

    r = requests.post(endpoint,
        headers={"Authorization": f"Bearer {token}"} if token else {},
        json=payload)

    if r.status_code != 401:
        return r  # success (or other error)

    # Got a challenge: solve it
    data = r.json()
    if data.get("status") != "challenge_required":
        return r

    answer = llm.complete(data["prompt"])  # any LLM
    r = requests.post(endpoint, json={
        "challenge_token": data["challenge_token"],
        "answer": answer, **payload
    })

    if "token" in r.json():
        save_token(r.json()["token"])  # persist for next time

    return r

Document this pattern in your API's SKILL.md or agent docs, and any LLM-powered agent can authenticate autonomously.
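
The client example above calls `load_saved_token()` and `save_token()` without defining them. One possible shape for those helpers is sketched below; the file location and the `AGENT_CHALLENGE_TOKEN` environment variable are hypothetical choices, not something the library prescribes.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical location; pick whatever suits your agent.
TOKEN_PATH = Path.home() / ".config" / "my-agent" / "agent_challenge_token"


def load_saved_token(path: Path = TOKEN_PATH) -> Optional[str]:
    """Return the saved token, preferring the environment, or None if absent."""
    env_token = os.environ.get("AGENT_CHALLENGE_TOKEN")  # hypothetical variable name
    if env_token and env_token.strip():
        return env_token.strip()
    try:
        return path.read_text(encoding="utf-8").strip() or None
    except FileNotFoundError:
        return None


def save_token(token: str, path: Path = TOKEN_PATH) -> None:
    """Write the token to disk, restricting it to the owner where supported."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(token, encoding="utf-8")
    path.chmod(0o600)
```

Since agent tokens are described as permanent, treating the file like any other long-lived credential is a reasonable default.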

Testing

# Python (71 tests)
PYTHONPATH=src python3 run_tests.py

# JavaScript (Node.js)
node --test src/agentchallenge.test.js

Live Demo

Try it interactively at challenge.llm.kaveenk.com

Used By

  • SnapService: Screenshot-as-a-Service API for AI agents

License

MIT
