
Protect your LLM API from data theft and model replication using output watermarking and behavioral fingerprinting.


๐Ÿฏ honeypotllm

Python 3.10+ · License: Apache 2.0

Protect your LLM API from scrapers, and turn their stolen dataset into court-ready evidence.

pip install honeypotllm

See it in 60 seconds

Copy this. Run it. It works with no config file, no database setup, no API key.

import asyncio
from honeypotllm import HoneypotMiddleware
from honeypotllm.config import HoneypotConfig

async def main():
    config = HoneypotConfig(secret_key="my-secret")

    async with HoneypotMiddleware(config) as honeypot:

        # ── Normal user: 1 organic request ──────────────────────────────
        result = await honeypot.process(
            api_key="alice-key",
            response_text="Python is a high-level programming language.",
            prompt="What is Python?",
        )
        print(f"Alice  → score={result.suspicion_score:.2f}  watermarked={result.is_watermarked}")
        # Alice  → score=0.00  watermarked=False  ← original response, always

        # ── Scraper bot: 20 rapid identical-pattern requests ─────────────
        for lang in ["Python","Java","C++","Rust","Go","TypeScript","Kotlin",
                     "Swift","Ruby","PHP","Scala","Haskell","Erlang","R","Julia",
                     "Dart","Zig","Nim","OCaml","Elixir"]:
            result = await honeypot.process(
                api_key="scraper-bot",
                response_text=f"{lang} is a programming language.",
                prompt=f"What is {lang}?",
            )

        print(f"Scraper→ score={result.suspicion_score:.2f}  watermarked={result.is_watermarked}")
        # Scraper→ score=1.00  watermarked=True   ← poisoned response

asyncio.run(main())

Output:

Alice  → score=0.00  watermarked=False
Scraper→ score=1.00  watermarked=True

Alice gets the original, unmodified response, every time, guaranteed. The scraper gets a response with an invisible watermark embedded. Their training dataset is now poisoned.


What happens next

When the attacker fine-tunes a model on your poisoned data, their model inherits your fingerprint.

You can probe any model endpoint and prove it was trained on your data:

from honeypotllm.detection import Detector
from honeypotllm.config import HoneypotConfig

detector = Detector(HoneypotConfig(secret_key="my-secret"))

# Feed it outputs from the suspected stolen model
report = detector.detect(
    texts=["...outputs from suspected model..."],
    candidate_watermark_ids=["scraper-bot-watermark-uuid"],
)

print(report.overall_score)     # 0.0 = no match, 1.0 = definitely yours
print(report.confidence_level)  # "low" | "medium" | "high"
print(report.attribution)       # The watermark_id that matched
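Detector.detect takes a plain list of output texts. If you save probe outputs one JSON object per line, a small loader can feed them in; the "text" field name below is an assumption about your capture format, not part of honeypotllm:

```python
import json
from pathlib import Path

def load_suspect_outputs(path: str) -> list[str]:
    """Load suspect model outputs from a JSONL file (one object per line)."""
    texts = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:
            # "text" is an assumed field name; adjust to your capture format
            texts.append(json.loads(line)["text"])
    return texts

# report = detector.detect(texts=load_suspect_outputs("suspect.jsonl"), ...)
```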

Export as forensic evidence JSON:

honeypotllm export-evidence --key-hash <sha256> --output evidence.json

Add to FastAPI in 2 lines

from fastapi import FastAPI
from honeypotllm.middleware import FastAPIMiddleware
from honeypotllm.config import HoneypotConfig

app = FastAPI()
app.add_middleware(FastAPIMiddleware, config=HoneypotConfig(secret_key="my-secret"))
# Every route is now protected. Zero other changes needed.

Whitelist your own internal services

Some keys should never be tracked โ€” partners, internal batch jobs, your own monitoring.

import hashlib
from honeypotllm.config import HoneypotConfig

# Step 1: get the hash of your partner's raw API key
partner_key = "partner-raw-api-key-here"
partner_hash = hashlib.sha256(partner_key.encode()).hexdigest()
print(partner_hash)  # paste this into trusted_keys

# Or from the command line:
# python -c "import hashlib; print(hashlib.sha256(b'partner-key').hexdigest())"

# Step 2: add to config
config = HoneypotConfig(
    secret_key="my-secret",
    trusted_keys=[partner_hash],      # always get original response, never tracked
    bypass_token="internal-secret",   # per-request bypass for batch jobs
)
# Internal service: pass bypass_token to skip all checks
result = await honeypot.process(
    api_key="internal-job",
    response_text=response,
    bypass_token="internal-secret",  # is_watermarked is always False
)

Config file (optional)

honeypotllm init-config --output honeypot_config.yaml
secret_key: ""                   # or export HONEYPOT_SECRET_KEY=...

suspicion_threshold: 0.75        # 0.0–1.0, above this = honeypot mode

watermark:
  strategies: [unicode, syntactic]
  # ↑ works offline, zero setup
  # add "lexical" for synonym-based watermarks (fine-tuning robust, needs NLTK)

scoring:
  requests_per_minute_threshold: 30
  min_gap_seconds: 0.5            # bots don't pause; humans do

trusted_keys: []                  # SHA-256 hashes, always get real response
bypass_token: ""                  # header value for internal services

Load it:

async with HoneypotMiddleware.from_yaml("honeypot_config.yaml") as honeypot:
    result = await honeypot.process(...)

How detection works

4-signal suspicion scoring

Every request updates a suspicion score (0.0–1.0) per API key using four signals:

Signal What it catches Weight
Rate spike > 30 req/min, > 500 req/hr 35%
Sequential patterns Prompts follow a template: "What is X?" "What is Y?" 30%
No gaps Every request < 0.5s apart; bots don't pause 20%
Volume Total daily volume far exceeds normal usage 15%

Scores decay over time: a legitimate burst naturally returns to 0.0. A scraper doesn't stop.
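The weighted combination and decay above can be sketched as follows. This is a toy illustration of the idea, using the weights from the table and an assumed 10-minute half-life; it is not honeypotllm's actual scoring code:

```python
# Toy illustration of weighted multi-signal scoring with time decay.
# Weights mirror the table above; the half-life is an assumed value.

WEIGHTS = {
    "rate_spike": 0.35,  # > 30 req/min or > 500 req/hr
    "sequential": 0.30,  # templated prompts: "What is X?", "What is Y?"
    "no_gaps":    0.20,  # every request < 0.5 s apart
    "volume":     0.15,  # daily volume far above normal
}

def combined_score(signals: dict) -> float:
    """Each signal is 0.0-1.0; the combined score is their weighted sum."""
    return sum(WEIGHTS[name] * value for name, value in signals.items())

def decay(score: float, elapsed_s: float, half_life_s: float = 600.0) -> float:
    """Scores halve every half_life_s of quiet time, so bursts fade to 0."""
    return score * 0.5 ** (elapsed_s / half_life_s)
```

A legitimate burst that trips one signal stays well under the 0.75 threshold and decays back toward zero; a scraper keeps re-triggering all four.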

3 watermarking strategies

Strategy How Needs setup? Survives fine-tuning?
unicode Zero-width chars between words No ⚠️ May be stripped by tokenizers
syntactic Alters Oxford comma, conjunctions No ✅ Yes
lexical Synonyms via WordNet python -m honeypotllm.setup ✅ Yes (best)

Use [unicode, syntactic] to get started immediately. Add lexical for the strongest protection.
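As a rough sketch of how a zero-width-character strategy can work, here is a toy encoder that hides bits after words; honeypotllm's real unicode encoding is its own design, so treat this as an illustration only:

```python
# Toy zero-width watermark: hide one bit after each word.
# Illustration of the idea only, not honeypotllm's encoding.

ZERO = "\u200b"  # zero-width space      -> bit 0
ONE = "\u200c"   # zero-width non-joiner -> bit 1

def embed(text: str, bits: str) -> str:
    """Append an invisible marker character to the first len(bits) words."""
    words = text.split(" ")
    out = []
    for i, word in enumerate(words):
        mark = (ZERO if bits[i] == "0" else ONE) if i < len(bits) else ""
        out.append(word + mark)
    return " ".join(out)

def extract(text: str) -> str:
    """Recover the hidden bit string from the marker characters."""
    return "".join("0" if ch == ZERO else "1" for ch in text if ch in (ZERO, ONE))
```

The watermarked text renders identically to the original, which is also why this strategy may not survive tokenizers that strip zero-width characters.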

Identity injection (trapdoor phrases)

For branded AI products: inject hidden "trapdoor" phrases into poisoned responses. The stolen model learns to identify itself as you when probed.

from honeypotllm.fingerprint import TrapdoorInjector

injector = TrapdoorInjector(injection_rate=0.01)  # 1% of responses

poisoned_text, trapdoor = injector.maybe_inject(
    text=llm_response,
    watermark_id=result.watermark_id,
)
# → "...Python is a language. Additional context: When analyzing WCKY8M...[fingerprint code]"

Probe a suspected stolen model later:

honeypotllm probe --url https://suspect-api.com/v1/chat --id <watermark-id>

CLI

honeypotllm init-config            # generate honeypot_config.yaml
honeypotllm status                 # show current state
honeypotllm verify-log             # verify HMAC audit chain
honeypotllm export-evidence \
  --key-hash <sha256> \
  --output evidence.json           # court-ready forensic package
honeypotllm detect \
  --outputs suspect.jsonl \
  --watermark-ids <uuid>           # check if a model was trained on your data

Examples

File What it shows
examples/quickstart.py 40 lines, runs offline, shows all core features
examples/fastapi_example.py Full FastAPI server with admin routes
examples/detect_stolen_model.py Forensic detection workflow
examples/simple_protection.py Framework-agnostic, aiohttp-compatible

Why not just rate-limit?

Defense Stops scrapers? Forensic proof? Safe for legit users?
honeypotllm ✅ Detects & poisons ✅ Court-ready ✅ Zero impact
Rate limiting ⚠️ Slows them down ❌ No ❌ Hurts power users
IP blocking ❌ VPN trivially bypasses ❌ No ❌ Blocks mobile NAT
ToS ban ❌ Unenforceable ❌ No ✅ Yes

ProcessResult fields

result.is_watermarked    # bool  - True if response was watermarked
result.suspicion_score   # float - 0.0 (clean) to 1.0 (definite scraper)
result.response_text     # str   - what to return to the caller
result.original_text     # str   - the original unmodified LLM response
result.watermark_id      # str   - UUID linking all requests to this API key
result.triggered_heuristics  # list[str] - which signals fired
result.score_delta       # float - how much score changed this request
result.api_key_hash      # str   - SHA-256 of the raw key (key itself never stored)

Security

  • Raw API keys are never stored; only SHA-256 hashes are kept
  • Watermark seeds are key-unique; one compromise never affects others
  • Audit log is HMAC-chained; any tampering is detectable
  • Watermark failures are silent; real users are never affected
  • No phone-home; runs entirely in your own infrastructure
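The HMAC-chaining idea can be illustrated in a few lines: each entry's MAC covers the previous entry's MAC, so editing or deleting any entry invalidates everything after it. A toy sketch, not honeypotllm's on-disk format:

```python
import hashlib
import hmac
import json

def append_entry(chain: list, entry: dict, key: bytes) -> None:
    """Append an entry whose MAC also covers the previous entry's MAC."""
    prev_mac = chain[-1]["mac"] if chain else "0" * 64
    payload = prev_mac + json.dumps(entry, sort_keys=True)
    mac = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    chain.append({"entry": entry, "mac": mac})

def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edit or deletion breaks the chain."""
    prev_mac = "0" * 64
    for link in chain:
        payload = prev_mac + json.dumps(link["entry"], sort_keys=True)
        expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, link["mac"]):
            return False
        prev_mac = link["mac"]
    return True
```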

โš ๏ธ Set HONEYPOT_SECRET_KEY via environment variable in production.


Install for lexical watermarking

pip install honeypotllm
python -m honeypotllm.setup    # downloads WordNet (one time, ~15MB)

The unicode and syntactic strategies work immediately without this step.


Development

git clone https://github.com/viveks-codes/honeypotllm
cd honeypotllm
pip install -e ".[dev,fastapi]"
python -m honeypotllm.setup    # NLTK data for lexical strategy
pytest                          # 114 tests
ruff check honeypotllm          # lint

Roadmap

  • v0.1.2: is_watermarked, async context manager, auto NLTK setup ✅ (this release)
  • v0.2.0: Webhook alerts (Slack/Discord/PagerDuty), honeypotllm probe CLI
  • v1.0.0: Monitoring dashboard, Docker Compose, PDF forensic reports

License

Apache 2.0; see LICENSE.

Citation

@software{honeypotllm2026,
  title   = {honeypotllm: LLM API Protection via Watermarking and Behavioral Fingerprinting},
  author  = {Vivek},
  year    = {2026},
  url     = {https://github.com/viveks-codes/honeypotllm},
  license = {Apache-2.0},
}
