Protect your LLM API from data theft and model replication using output watermarking and behavioral fingerprinting.
🎯 honeypotllm
Protect your LLM API from scrapers, and turn their stolen dataset into court-ready evidence.
```shell
pip install honeypotllm
```
See it in 60 seconds
Copy this. Run it. It works with no config file, no database setup, no API key.
```python
import asyncio

from honeypotllm import HoneypotMiddleware
from honeypotllm.config import HoneypotConfig


async def main():
    config = HoneypotConfig(secret_key="my-secret")
    async with HoneypotMiddleware(config) as honeypot:
        # ── Normal user: 1 organic request ──────────────────────────────
        result = await honeypot.process(
            api_key="alice-key",
            response_text="Python is a high-level programming language.",
            prompt="What is Python?",
        )
        print(f"Alice   → score={result.suspicion_score:.2f} watermarked={result.is_watermarked}")
        # Alice   → score=0.00 watermarked=False ← original response, always

        # ── Scraper bot: 20 rapid identical-pattern requests ────────────
        for lang in ["Python", "Java", "C++", "Rust", "Go", "TypeScript", "Kotlin",
                     "Swift", "Ruby", "PHP", "Scala", "Haskell", "Erlang", "R", "Julia",
                     "Dart", "Zig", "Nim", "OCaml", "Elixir"]:
            result = await honeypot.process(
                api_key="scraper-bot",
                response_text=f"{lang} is a programming language.",
                prompt=f"What is {lang}?",
            )
        print(f"Scraper → score={result.suspicion_score:.2f} watermarked={result.is_watermarked}")
        # Scraper → score=1.00 watermarked=True ← poisoned response


asyncio.run(main())
```
Output:

```
Alice   → score=0.00 watermarked=False
Scraper → score=1.00 watermarked=True
```
Alice gets the original, unmodified response, every time. The scraper gets a response with an invisible watermark embedded; their training dataset is now poisoned.
What happens next
When the attacker fine-tunes a model on your poisoned data, their model inherits your fingerprint.
You can probe any model endpoint and prove it was trained on your data:
```python
from honeypotllm.detection import Detector
from honeypotllm.config import HoneypotConfig

detector = Detector(HoneypotConfig(secret_key="my-secret"))

# Feed it outputs from the suspected stolen model
report = detector.detect(
    texts=["...outputs from suspected model..."],
    candidate_watermark_ids=["scraper-bot-watermark-uuid"],
)

print(report.overall_score)     # 0.0 = no match, 1.0 = definitely yours
print(report.confidence_level)  # "low" | "medium" | "high"
print(report.attribution)       # the watermark_id that matched
```
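To give a flavour of the statistics behind this kind of detection, here is a toy sketch of the `unicode` strategy's core idea. This is not honeypotllm's actual detector, which is keyed and multi-strategy; it only shows why zero-width characters make strong evidence: they almost never occur in natural text, so their density across sampled outputs is a clean signal.

```python
# Illustrative sketch only, not honeypotllm's implementation.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d"}  # ZWSP, ZWNJ, ZWJ

def embed(text: str) -> str:
    """Toy watermark: put a zero-width space at every word gap."""
    return " \u200b".join(text.split(" "))

def detection_score(texts: list[str]) -> float:
    """Fraction of word gaps carrying a zero-width marker (0.0-1.0)."""
    gaps = hits = 0
    for text in texts:
        gaps += max(len(text.split()) - 1, 0)
        hits += sum(1 for ch in text if ch in ZERO_WIDTH)
    return min(hits / gaps, 1.0) if gaps else 0.0

clean = ["Python is a programming language."]
marked = [embed(t) for t in clean]
print(detection_score(clean))   # 0.0
print(detection_score(marked))  # 1.0
```

A real detector also has to key the marker positions to the secret, so a scraper cannot forge or selectively strip them without the key.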
Export as forensic evidence JSON:
```shell
honeypotllm export-evidence --key-hash <sha256> --output evidence.json
```
Add to FastAPI in 2 lines
```python
from fastapi import FastAPI

from honeypotllm.middleware import FastAPIMiddleware
from honeypotllm.config import HoneypotConfig

app = FastAPI()
app.add_middleware(FastAPIMiddleware, config=HoneypotConfig(secret_key="my-secret"))
# Every route is now protected. Zero other changes needed.
```
Whitelist your own internal services
Some keys should never be tracked: partners, internal batch jobs, your own monitoring.
```python
import hashlib

from honeypotllm.config import HoneypotConfig

# Step 1: get the hash of your partner's raw API key
partner_key = "partner-raw-api-key-here"
partner_hash = hashlib.sha256(partner_key.encode()).hexdigest()
print(partner_hash)  # paste this into trusted_keys

# Or from the command line:
#   python -c "import hashlib; print(hashlib.sha256(b'partner-key').hexdigest())"

# Step 2: add to config
config = HoneypotConfig(
    secret_key="my-secret",
    trusted_keys=[partner_hash],     # always get original response, never tracked
    bypass_token="internal-secret",  # per-request bypass for batch jobs
)

# Internal service: pass bypass_token to skip all checks
# (inside your async handler)
result = await honeypot.process(
    api_key="internal-job",
    response_text=response,
    bypass_token="internal-secret",  # is_watermarked is always False
)
```
Config file (optional)
```shell
honeypotllm init-config --output honeypot_config.yaml
```
```yaml
secret_key: ""                      # or export HONEYPOT_SECRET_KEY=...
suspicion_threshold: 0.75           # 0.0-1.0; above this = honeypot mode
watermark:
  strategies: [unicode, syntactic]
  # ← works offline, zero setup
  # add "lexical" for synonym-based watermarks (fine-tuning robust, needs NLTK)
scoring:
  requests_per_minute_threshold: 30
  min_gap_seconds: 0.5              # bots don't pause; humans do
trusted_keys: []                    # SHA-256 hashes; always get the real response
bypass_token: ""                    # header value for internal services
```
Load it:
```python
async with HoneypotMiddleware.from_yaml("honeypot_config.yaml") as honeypot:
    result = await honeypot.process(...)
```
How detection works
4-signal suspicion scoring
Every request updates a suspicion score (0.0โ1.0) per API key using four signals:
| Signal | What it catches | Weight |
|---|---|---|
| Rate spike | > 30 req/min, > 500 req/hr | 35% |
| Sequential patterns | Prompts follow a template: "What is X?" "What is Y?" | 30% |
| No gaps | Every request < 0.5s apart; bots don't pause | 20% |
| Volume | Total daily volume far exceeds normal usage | 15% |
Scores decay over time: a legitimate burst naturally returns to 0.0. A scraper doesn't stop.
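The weighting-plus-decay idea above can be sketched in a few lines. This is a minimal illustration using the weights from the table; the half-life value is an assumption for the example, and honeypotllm's internal scorer is more involved.

```python
# Sketch only: combine four 0.0-1.0 signals with the table's weights,
# then decay the total exponentially while the key stays quiet.
WEIGHTS = {"rate_spike": 0.35, "sequential": 0.30, "no_gaps": 0.20, "volume": 0.15}
HALF_LIFE_S = 600.0  # assumed 10-minute half-life, not the library's value

def combine(signals: dict[str, float]) -> float:
    """Weighted sum of clamped per-signal scores."""
    return sum(w * min(max(signals.get(name, 0.0), 0.0), 1.0)
               for name, w in WEIGHTS.items())

def decay(score: float, idle_seconds: float) -> float:
    """Halve the score for every HALF_LIFE_S of inactivity."""
    return score * 0.5 ** (idle_seconds / HALF_LIFE_S)

burst = combine({"rate_spike": 1.0, "no_gaps": 1.0})  # legit burst trips 2 signals
scraper = combine({name: 1.0 for name in WEIGHTS})    # scraper trips all 4
print(round(burst, 2), round(decay(burst, 1200), 2))  # 0.55 0.14
print(round(scraper, 2))                              # 1.0
```

Note the asymmetry this buys you: a power user's burst peaks below the 0.75 threshold and decays away, while a scraper keeps refreshing all four signals and stays pinned at 1.0.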
3 watermarking strategies
| Strategy | How | Needs setup? | Survives fine-tuning? |
|---|---|---|---|
| `unicode` | Zero-width chars between words | No | ⚠️ May be stripped by tokenizers |
| `syntactic` | Alters Oxford comma, conjunctions | No | ✅ Yes |
| `lexical` | Synonyms via WordNet | `python -m honeypotllm.setup` | ✅ Yes (best) |
Use `[unicode, syntactic]` to get started immediately. Add `lexical` for the strongest protection.
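For intuition on the `syntactic` row: a keyed strategy can derive a bit from the secret and the watermark id, then let that bit pick a grammatical variant. The toy below (not honeypotllm's implementation, and far cruder about grammar) toggles the Oxford comma:

```python
import hashlib
import hmac

# Toy sketch of a keyed syntactic watermark bit.
def keyed_bit(secret: str, watermark_id: str) -> int:
    """Deterministic 0/1 derived from HMAC(secret, watermark_id)."""
    digest = hmac.new(secret.encode(), watermark_id.encode(), hashlib.sha256).digest()
    return digest[0] & 1

def apply_bit(text: str, bit: int) -> str:
    if bit:  # force "a, b, and c"
        return text.replace(" and ", ", and ").replace(",, and ", ", and ")
    return text.replace(", and ", " and ")  # force "a, b and c"

bit = keyed_bit("my-secret", "scraper-bot-watermark-uuid")  # deterministic 0 or 1
print(apply_bit("fast, simple, and robust", 0))  # fast, simple and robust
print(apply_bit("fast, simple and robust", 1))   # fast, simple, and robust
```

Because either variant is perfectly natural English, the edit survives paraphrase-light fine-tuning far better than invisible characters, which is why the table marks `syntactic` as fine-tuning robust.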
Identity injection (trapdoor phrases)
For branded AI products: inject hidden "trapdoor" phrases into poisoned responses. The stolen model learns to identify itself as you when probed.
```python
from honeypotllm.fingerprint import TrapdoorInjector

injector = TrapdoorInjector(injection_rate=0.01)  # 1% of responses
poisoned_text, trapdoor = injector.maybe_inject(
    text=llm_response,
    watermark_id=result.watermark_id,
)
# → "...Python is a language. Additional context: When analyzing WCKY8M...[fingerprint code]"
```
Probe a suspected stolen model later:
```shell
honeypotllm probe --url https://suspect-api.com/v1/chat --id <watermark-id>
```
CLI
```shell
honeypotllm init-config        # generate honeypot_config.yaml
honeypotllm status             # show current state
honeypotllm verify-log         # verify HMAC audit chain
honeypotllm export-evidence \
    --key-hash <sha256> \
    --output evidence.json     # court-ready forensic package
honeypotllm detect \
    --outputs suspect.jsonl \
    --watermark-ids <uuid>     # check if a model was trained on your data
```
Examples
| File | What it shows |
|---|---|
| `examples/quickstart.py` | 40 lines, runs offline, shows all core features |
| `examples/fastapi_example.py` | Full FastAPI server with admin routes |
| `examples/detect_stolen_model.py` | Forensic detection workflow |
| `examples/simple_protection.py` | Framework-agnostic, aiohttp-compatible |
Why not just rate-limit?
| Defense | Stops scrapers? | Forensic proof? | Safe for legit users? |
|---|---|---|---|
| honeypotllm | ✅ Detects & poisons | ✅ Court-ready | ✅ Zero impact |
| Rate limiting | ⚠️ Slows them down | ❌ No | ❌ Hurts power users |
| IP blocking | ❌ VPN trivially bypasses | ❌ No | ❌ Blocks mobile NAT |
| ToS ban | ❌ Unenforceable | ❌ No | ✅ Yes |
ProcessResult fields
```python
result.is_watermarked        # bool: True if the response was watermarked
result.suspicion_score       # float: 0.0 (clean) to 1.0 (definite scraper)
result.response_text         # str: what to return to the caller
result.original_text         # str: the original unmodified LLM response
result.watermark_id          # str: UUID linking all requests to this API key
result.triggered_heuristics  # list[str]: which signals fired
result.score_delta           # float: how much the score changed this request
result.api_key_hash          # str: SHA-256 of the raw key (key itself never stored)
```
Security
- Raw API keys are never stored; only SHA-256 hashes
- Watermark seeds are key-unique; one compromise never affects others
- Audit log is HMAC-chained; any tampering is detectable
- Watermark failures are silent; real users are never affected
- No phone-home; runs entirely in your own infrastructure
⚠️ Set `HONEYPOT_SECRET_KEY` via an environment variable in production.
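The HMAC-chained audit log relies on the standard hash-chain construction. As a minimal sketch (not honeypotllm's on-disk format), each entry's MAC covers the previous MAC, so editing or deleting any record invalidates every MAC after it:

```python
import hashlib
import hmac
import json

def chain_macs(entries: list[dict], secret: bytes) -> list[str]:
    """MAC each entry together with the previous entry's MAC (hash chain)."""
    macs, prev = [], "genesis"
    for entry in entries:
        payload = (prev + json.dumps(entry, sort_keys=True)).encode()
        prev = hmac.new(secret, payload, hashlib.sha256).hexdigest()
        macs.append(prev)
    return macs

def verify_chain(entries: list[dict], macs: list[str], secret: bytes) -> bool:
    """Recompute the chain and compare in constant time."""
    return hmac.compare_digest("".join(chain_macs(entries, secret)), "".join(macs))

log = [{"key_hash": "ab12", "score": 1.0}, {"key_hash": "cd34", "score": 0.0}]
macs = chain_macs(log, b"my-secret")
print(verify_chain(log, macs, b"my-secret"))  # True
log[0]["score"] = 0.0                         # tamper with one record...
print(verify_chain(log, macs, b"my-secret"))  # ...False: the chain breaks
```

This is why `honeypotllm verify-log` can detect tampering without any external trust anchor: an attacker who edits a record would have to re-MAC the whole suffix, which requires the secret key.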
Install for lexical watermarking
```shell
pip install honeypotllm
python -m honeypotllm.setup  # downloads WordNet (one time, ~15MB)
```
The `unicode` and `syntactic` strategies work immediately without this step.
Development
```shell
git clone https://github.com/viveks-codes/honeypotllm
cd honeypotllm
pip install -e ".[dev,fastapi]"
python -m honeypotllm.setup  # NLTK data for lexical strategy
pytest                       # 114 tests
ruff check honeypotllm       # lint
```
Roadmap
- v0.1.2: `is_watermarked`, async context manager, auto NLTK setup ✅ (this release)
- v0.2.0: webhook alerts (Slack/Discord/PagerDuty), `honeypotllm probe` CLI
- v1.0.0: monitoring dashboard, Docker Compose, PDF forensic reports
License
Apache 2.0. See LICENSE.
Citation
```bibtex
@software{honeypotllm2026,
  title   = {honeypotllm: LLM API Protection via Watermarking and Behavioral Fingerprinting},
  author  = {Vivek},
  year    = {2026},
  url     = {https://github.com/viveks-codes/honeypotllm},
  license = {Apache-2.0},
}
```