Protect your LLM API from data theft and model replication using output watermarking and behavioral fingerprinting.
🎯 honeypotllm
Protect your LLM API from scrapers, and turn their stolen dataset into court-ready evidence.
```shell
pip install honeypotllm
```
See it in 60 seconds
Copy this. Run it. It works with no config file, no database setup, no API key.
```python
import asyncio

from honeypotllm import HoneypotMiddleware
from honeypotllm.config import HoneypotConfig


async def main():
    config = HoneypotConfig(secret_key="my-secret")
    async with HoneypotMiddleware(config) as honeypot:
        # ── Normal user: 1 organic request ──────────────────────────────
        result = await honeypot.process(
            api_key="alice-key",
            response_text="Python is a high-level programming language.",
            prompt="What is Python?",
        )
        print(f"Alice   → score={result.suspicion_score:.2f} watermarked={result.is_watermarked}")
        # Alice   → score=0.00 watermarked=False ← original response, always

        # ── Scraper bot: 20 rapid identical-pattern requests ────────────
        for lang in ["Python", "Java", "C++", "Rust", "Go", "TypeScript", "Kotlin",
                     "Swift", "Ruby", "PHP", "Scala", "Haskell", "Erlang", "R", "Julia",
                     "Dart", "Zig", "Nim", "OCaml", "Elixir"]:
            result = await honeypot.process(
                api_key="scraper-bot",
                response_text=f"{lang} is a programming language.",
                prompt=f"What is {lang}?",
            )
        print(f"Scraper → score={result.suspicion_score:.2f} watermarked={result.is_watermarked}")
        # Scraper → score=1.00 watermarked=True ← poisoned response


asyncio.run(main())
```
Output:

```
Alice   → score=0.00 watermarked=False
Scraper → score=1.00 watermarked=True
```
Alice gets the original, unmodified response, every time. The scraper gets a response with an invisible watermark embedded; their training dataset is now poisoned.
What happens next
When the attacker fine-tunes a model on your poisoned data, their model inherits your fingerprint.
You can probe any model endpoint and prove it was trained on your data:
```python
from honeypotllm.detection import Detector
from honeypotllm.config import HoneypotConfig

detector = Detector(HoneypotConfig(secret_key="my-secret"))

# Feed it outputs from the suspected stolen model
report = detector.detect(
    texts=["...outputs from suspected model..."],
    candidate_watermark_ids=["scraper-bot-watermark-uuid"],
)

print(report.overall_score)     # 0.0 = no match, 1.0 = definitely yours
print(report.confidence_level)  # "low" | "medium" | "high"
print(report.attribution)       # the watermark_id that matched
```
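To give a flavour of the statistics behind this kind of detection, here is a toy sketch of the `unicode` strategy's core idea. This is not honeypotllm's actual detector, which is keyed and multi-strategy; it only shows why zero-width characters make strong evidence: they almost never occur in natural text, so their density across sampled outputs is a clean signal.

```python
# Illustrative sketch only, not honeypotllm's implementation.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d"}  # ZWSP, ZWNJ, ZWJ

def embed(text: str) -> str:
    """Toy watermark: put a zero-width space at every word gap."""
    return " \u200b".join(text.split(" "))

def detection_score(texts: list[str]) -> float:
    """Fraction of word gaps carrying a zero-width marker (0.0-1.0)."""
    gaps = hits = 0
    for text in texts:
        gaps += max(len(text.split()) - 1, 0)
        hits += sum(1 for ch in text if ch in ZERO_WIDTH)
    return min(hits / gaps, 1.0) if gaps else 0.0

clean = ["Python is a programming language."]
marked = [embed(t) for t in clean]
print(detection_score(clean))   # 0.0
print(detection_score(marked))  # 1.0
```

A real detector also has to key the marker positions to the secret, so a scraper cannot forge or selectively strip them without the key.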
Export as forensic evidence JSON:
```shell
honeypotllm export-evidence --key-hash <sha256> --output evidence.json
```
Add to FastAPI in 2 lines
```python
from fastapi import FastAPI

from honeypotllm.middleware import FastAPIMiddleware
from honeypotllm.config import HoneypotConfig

app = FastAPI()
app.add_middleware(FastAPIMiddleware, config=HoneypotConfig(secret_key="my-secret"))
# Every route is now protected. Zero other changes needed.
```
Whitelist your own internal services
Some keys should never be tracked: partners, internal batch jobs, your own monitoring.
```python
import hashlib

from honeypotllm.config import HoneypotConfig

# Step 1: get the hash of your partner's raw API key
partner_key = "partner-raw-api-key-here"
partner_hash = hashlib.sha256(partner_key.encode()).hexdigest()
print(partner_hash)  # paste this into trusted_keys

# Or from the command line:
#   python -c "import hashlib; print(hashlib.sha256(b'partner-key').hexdigest())"

# Step 2: add to config
config = HoneypotConfig(
    secret_key="my-secret",
    trusted_keys=[partner_hash],     # always get original response, never tracked
    bypass_token="internal-secret",  # per-request bypass for batch jobs
)

# Internal service: pass bypass_token to skip all checks
# (inside your async handler)
result = await honeypot.process(
    api_key="internal-job",
    response_text=response,
    bypass_token="internal-secret",  # is_watermarked is always False
)
```
Config file (optional)
```shell
honeypotllm init-config --output honeypot_config.yaml
```
```yaml
secret_key: ""                      # or export HONEYPOT_SECRET_KEY=...
suspicion_threshold: 0.75           # 0.0-1.0; above this = honeypot mode
watermark:
  strategies: [unicode, syntactic]
  # ← works offline, zero setup
  # add "lexical" for synonym-based watermarks (fine-tuning robust, needs NLTK)
scoring:
  requests_per_minute_threshold: 30
  min_gap_seconds: 0.5              # bots don't pause; humans do
trusted_keys: []                    # SHA-256 hashes; always get the real response
bypass_token: ""                    # header value for internal services
```
Load it:
```python
async with HoneypotMiddleware.from_yaml("honeypot_config.yaml") as honeypot:
    result = await honeypot.process(...)
```
How detection works
4-signal suspicion scoring
Every request updates a suspicion score (0.0โ1.0) per API key using four signals:
| Signal | What it catches | Weight |
|---|---|---|
| Rate spike | > 30 req/min, > 500 req/hr | 35% |
| Sequential patterns | Prompts follow a template: "What is X?" "What is Y?" | 30% |
| No gaps | Every request < 0.5s apart; bots don't pause | 20% |
| Volume | Total daily volume far exceeds normal usage | 15% |
Scores decay over time: a legitimate burst naturally returns to 0.0. A scraper doesn't stop.
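The weighting-plus-decay idea above can be sketched in a few lines. This is a minimal illustration using the weights from the table; the half-life value is an assumption for the example, and honeypotllm's internal scorer is more involved.

```python
# Sketch only: combine four 0.0-1.0 signals with the table's weights,
# then decay the total exponentially while the key stays quiet.
WEIGHTS = {"rate_spike": 0.35, "sequential": 0.30, "no_gaps": 0.20, "volume": 0.15}
HALF_LIFE_S = 600.0  # assumed 10-minute half-life, not the library's value

def combine(signals: dict[str, float]) -> float:
    """Weighted sum of clamped per-signal scores."""
    return sum(w * min(max(signals.get(name, 0.0), 0.0), 1.0)
               for name, w in WEIGHTS.items())

def decay(score: float, idle_seconds: float) -> float:
    """Halve the score for every HALF_LIFE_S of inactivity."""
    return score * 0.5 ** (idle_seconds / HALF_LIFE_S)

burst = combine({"rate_spike": 1.0, "no_gaps": 1.0})  # legit burst trips 2 signals
scraper = combine({name: 1.0 for name in WEIGHTS})    # scraper trips all 4
print(round(burst, 2), round(decay(burst, 1200), 2))  # 0.55 0.14
print(round(scraper, 2))                              # 1.0
```

Note the asymmetry this buys you: a power user's burst peaks below the 0.75 threshold and decays away, while a scraper keeps refreshing all four signals and stays pinned at 1.0.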
3 watermarking strategies
| Strategy | How | Needs setup? | Survives fine-tuning? |
|---|---|---|---|
| `unicode` | Zero-width chars between words | No | ⚠️ May be stripped by tokenizers |
| `syntactic` | Alters Oxford comma, conjunctions | No | ✅ Yes |
| `lexical` | Synonyms via WordNet | `python -m honeypotllm.setup` | ✅ Yes (best) |
Use `[unicode, syntactic]` to get started immediately. Add `lexical` for the strongest protection.
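For intuition on the `syntactic` row: a keyed strategy can derive a bit from the secret and the watermark id, then let that bit pick a grammatical variant. The toy below (not honeypotllm's implementation, and far cruder about grammar) toggles the Oxford comma:

```python
import hashlib
import hmac

# Toy sketch of a keyed syntactic watermark bit.
def keyed_bit(secret: str, watermark_id: str) -> int:
    """Deterministic 0/1 derived from HMAC(secret, watermark_id)."""
    digest = hmac.new(secret.encode(), watermark_id.encode(), hashlib.sha256).digest()
    return digest[0] & 1

def apply_bit(text: str, bit: int) -> str:
    if bit:  # force "a, b, and c"
        return text.replace(" and ", ", and ").replace(",, and ", ", and ")
    return text.replace(", and ", " and ")  # force "a, b and c"

bit = keyed_bit("my-secret", "scraper-bot-watermark-uuid")  # deterministic 0 or 1
print(apply_bit("fast, simple, and robust", 0))  # fast, simple and robust
print(apply_bit("fast, simple and robust", 1))   # fast, simple, and robust
```

Because either variant is perfectly natural English, the edit survives paraphrase-light fine-tuning far better than invisible characters, which is why the table marks `syntactic` as fine-tuning robust.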
Identity injection (trapdoor phrases)
For branded AI products: inject hidden "trapdoor" phrases into poisoned responses. The stolen model learns to identify itself as you when probed.
```python
from honeypotllm.fingerprint import TrapdoorInjector

injector = TrapdoorInjector(injection_rate=0.01)  # 1% of responses
poisoned_text, trapdoor = injector.maybe_inject(
    text=llm_response,
    watermark_id=result.watermark_id,
)
# → "...Python is a language. Additional context: When analyzing WCKY8M...[fingerprint code]"
```
Probe a suspected stolen model later:
```shell
honeypotllm probe --url https://suspect-api.com/v1/chat --id <watermark-id>
```
CLI
```shell
honeypotllm init-config        # generate honeypot_config.yaml
honeypotllm status             # show current state
honeypotllm verify-log         # verify HMAC audit chain
honeypotllm export-evidence \
    --key-hash <sha256> \
    --output evidence.json     # court-ready forensic package
honeypotllm detect \
    --outputs suspect.jsonl \
    --watermark-ids <uuid>     # check if a model was trained on your data
```
Examples
| File | What it shows |
|---|---|
| `examples/quickstart.py` | 40 lines, runs offline, shows all core features |
| `examples/fastapi_example.py` | Full FastAPI server with admin routes |
| `examples/detect_stolen_model.py` | Forensic detection workflow |
| `examples/simple_protection.py` | Framework-agnostic, aiohttp-compatible |
Why not just rate-limit?
| Defense | Stops scrapers? | Forensic proof? | Safe for legit users? |
|---|---|---|---|
| honeypotllm | ✅ Detects & poisons | ✅ Court-ready | ✅ Zero impact |
| Rate limiting | ⚠️ Slows them down | ❌ No | ❌ Hurts power users |
| IP blocking | ❌ VPN trivially bypasses | ❌ No | ❌ Blocks mobile NAT |
| ToS ban | ❌ Unenforceable | ❌ No | ✅ Yes |
ProcessResult fields
```python
result.is_watermarked        # bool: True if the response was watermarked
result.suspicion_score       # float: 0.0 (clean) to 1.0 (definite scraper)
result.response_text         # str: what to return to the caller
result.original_text         # str: the original unmodified LLM response
result.watermark_id          # str: UUID linking all requests to this API key
result.triggered_heuristics  # list[str]: which signals fired
result.score_delta           # float: how much the score changed this request
result.api_key_hash          # str: SHA-256 of the raw key (key itself never stored)
```
Security
- Raw API keys are never stored; only SHA-256 hashes
- Watermark seeds are key-unique; one compromise never affects others
- Audit log is HMAC-chained; any tampering is detectable
- Watermark failures are silent; real users are never affected
- No phone-home; runs entirely in your own infrastructure
⚠️ Set `HONEYPOT_SECRET_KEY` via an environment variable in production.
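The HMAC-chained audit log relies on the standard hash-chain construction. As a minimal sketch (not honeypotllm's on-disk format), each entry's MAC covers the previous MAC, so editing or deleting any record invalidates every MAC after it:

```python
import hashlib
import hmac
import json

def chain_macs(entries: list[dict], secret: bytes) -> list[str]:
    """MAC each entry together with the previous entry's MAC (hash chain)."""
    macs, prev = [], "genesis"
    for entry in entries:
        payload = (prev + json.dumps(entry, sort_keys=True)).encode()
        prev = hmac.new(secret, payload, hashlib.sha256).hexdigest()
        macs.append(prev)
    return macs

def verify_chain(entries: list[dict], macs: list[str], secret: bytes) -> bool:
    """Recompute the chain and compare in constant time."""
    return hmac.compare_digest("".join(chain_macs(entries, secret)), "".join(macs))

log = [{"key_hash": "ab12", "score": 1.0}, {"key_hash": "cd34", "score": 0.0}]
macs = chain_macs(log, b"my-secret")
print(verify_chain(log, macs, b"my-secret"))  # True
log[0]["score"] = 0.0                         # tamper with one record...
print(verify_chain(log, macs, b"my-secret"))  # ...False: the chain breaks
```

This is why `honeypotllm verify-log` can detect tampering without any external trust anchor: an attacker who edits a record would have to re-MAC the whole suffix, which requires the secret key.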
Install for lexical watermarking
```shell
pip install honeypotllm
python -m honeypotllm.setup  # downloads WordNet (one time, ~15MB)
```
The `unicode` and `syntactic` strategies work immediately without this step.
Development
```shell
git clone https://github.com/viveks-codes/honeypotllm
cd honeypotllm
pip install -e ".[dev,fastapi]"
python -m honeypotllm.setup  # NLTK data for lexical strategy
pytest                       # 114 tests
ruff check honeypotllm       # lint
```
Roadmap
- v0.1.2: `is_watermarked`, async context manager, auto NLTK setup ✅ (this release)
- v0.2.0: webhook alerts (Slack/Discord/PagerDuty), `honeypotllm probe` CLI
- v1.0.0: monitoring dashboard, Docker Compose, PDF forensic reports
License
Apache 2.0. See LICENSE.
Citation
```bibtex
@software{honeypotllm2026,
  title   = {honeypotllm: LLM API Protection via Watermarking and Behavioral Fingerprinting},
  author  = {Vivek},
  year    = {2026},
  url     = {https://github.com/viveks-codes/honeypotllm},
  license = {Apache-2.0},
}
```