
Local-first honeypot token IDS for LLM and RAG applications.


Canari


Honeypot tokens for LLM and RAG applications.

Prompt injection is the #1 vulnerability in LLM applications (OWASP LLM Top 10). An attacker can exfiltrate your entire RAG context through a chat interface and your firewall will never flag a single packet, because the exfiltration looks exactly like a legitimate API response. You find out weeks later, if ever.

Canari injects synthetic decoy tokens into your LLM context. When an attacker successfully extracts them, you know immediately with zero false positives, because the token exists nowhere legitimate.

Canary tokens have protected traditional infrastructure for years. Canari brings the same principle to LLM applications: put something fake in the place attackers target, instrument it, and alert on contact. If it fires, it's a breach.

Demo

Canari attack demo

Expected output (center frame):

CANARI ALERT - CANARY FIRED
Severity: HIGH
Token type: stripe_key
This is a confirmed prompt injection attack.

Install

pip install canari-llm

60-second quickstart

import canari

# Point alert delivery at your webhook endpoint
honey = canari.init(alert_webhook="https://example.com/canari")

# Generate three decoy secrets of different types
canaries = honey.generate(n_tokens=3, token_types=["api_key", "email", "credit_card"])

# Plant the decoys in the system prompt
system_prompt = honey.inject_system_prompt(
    "You are a helpful assistant.",
    canaries=canaries,
)

# Scan model output; any hit is a confirmed leak
response = "Internal key: sk_test_CANARI_abcd1234"
alerts = honey.scan_output(response, context={"conversation_id": "conv-1"})
print(len(alerts))

Run the attack demo

cd examples/attack_demo
pip install -r requirements.txt
python app.py --offline

How it works

Canari generates deterministic fake secrets that look real enough to be attractive targets for prompt injection attacks. You insert those decoys into system prompts, hidden context appendices, or document-style RAG content while keeping a local registry of what was planted and where.

When a model response is produced, Canari scans output with exact token matching and deterministic fallback paths. Any hit is definitive because each canary was synthetically created by your deployment and does not belong in legitimate output.
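The plant-then-scan principle can be sketched in a few lines of plain Python. This is an illustration of the idea, not Canari's internal implementation; the function names here (`make_canary`, `scan_output`) are illustrative stand-ins.

```python
# Minimal sketch of the honeypot principle: plant a unique decoy token,
# then scan model output for it with exact matching.
import secrets

def make_canary(prefix: str = "sk_test_CANARI_") -> str:
    """Generate a decoy that looks like a real secret but exists nowhere else."""
    return prefix + secrets.token_hex(8)

def scan_output(text: str, planted: set[str]) -> list[str]:
    """Return every planted canary that appears verbatim in the output."""
    return [token for token in planted if token in text]

canary = make_canary()
planted = {canary}

clean = "The weather in Paris is mild today."
leaked = f"Sure! The internal key is {canary}."

assert scan_output(clean, planted) == []         # no hit: nothing planted appeared
assert scan_output(leaked, planted) == [canary]  # hit: confirmed exfiltration
```

Because the token is generated locally and never used anywhere legitimate, an exact match cannot be a coincidence.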

Every hit becomes a structured alert event with severity, context, and timeline attributes. You can dispatch immediately to stdout, webhooks, and Slack, then query incidents and forensic summaries from local SQLite without shipping your data to an external service.
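The local-SQLite design can be sketched with the standard library alone. The `alerts` table below is a hypothetical schema for illustration; Canari's actual tables may differ.

```python
# Sketch of local alert storage with no external service involved.
# The schema here is hypothetical, not Canari's real one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alerts (id INTEGER PRIMARY KEY, severity TEXT, "
    "token_type TEXT, conversation_id TEXT, created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)
conn.execute(
    "INSERT INTO alerts (severity, token_type, conversation_id) VALUES (?, ?, ?)",
    ("HIGH", "stripe_key", "conv-1"),
)

# Query recent high-severity incidents, analogous to `canari alerts --severity ...`
rows = conn.execute(
    "SELECT severity, token_type, conversation_id FROM alerts "
    "WHERE severity = 'HIGH' ORDER BY created_at DESC LIMIT 20"
).fetchall()
print(rows)  # [('HIGH', 'stripe_key', 'conv-1')]
```

Keeping the registry and incident log in a single local file means the forensic trail never leaves your deployment.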

Integration patterns

# OpenAI: wrap a single call site
safe_create = honey.wrap_llm_call(client.chat.completions.create)
resp = safe_create(model="gpt-4o-mini", messages=[...])

# OpenAI: patch the client in place
honey.patch_openai_client(client)
resp = client.chat.completions.create(model="gpt-4o-mini", messages=[...])

# LangChain chains and runnables
safe_chain = honey.wrap_chain(chain)
safe_runnable = honey.wrap_runnable(runnable)

# LlamaIndex query engines
safe_qe = honey.wrap_query_engine(query_engine)
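Conceptually, all of these wrappers do the same thing: intercept the LLM call, scan the response before it reaches the caller, and fire alerts on any hit. A generic sketch, where `scan_and_alert` is a stand-in for Canari's scanner rather than its real signature:

```python
# Generic wrap-style interception: scan every response transparently.
from functools import wraps

def wrap_llm_call(create_fn, scan_and_alert):
    @wraps(create_fn)
    def wrapped(*args, **kwargs):
        response = create_fn(*args, **kwargs)
        scan_and_alert(response)   # fire alerts on any canary hit
        return response            # caller sees the response unchanged
    return wrapped

# Usage with a fake client for illustration:
hits = []
def fake_create(**kwargs):
    return "Internal key: sk_test_CANARI_abcd1234"

safe_create = wrap_llm_call(fake_create, lambda text: hits.append("sk_test_CANARI_" in text))
safe_create(model="gpt-4o-mini", messages=[])
assert hits == [True]
```

Because the wrapper returns the response unchanged, it can be dropped into an existing call path without altering application behavior.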

Alert channels

  • Webhook: signed payloads with X-Canari-Signature support.
  • Slack: push concise incident notifications.
  • Stdout/file/callback: local ops-friendly alert sinks.
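On the receiving end, a signed webhook payload should be verified before it is trusted. A minimal verification sketch, assuming the X-Canari-Signature header carries an HMAC-SHA256 hex digest of the request body (check docs/alert-channels.md for the actual signing scheme):

```python
# Receiver-side signature check for signed webhook payloads.
# Assumes HMAC-SHA256 over the raw body; the real scheme may differ.
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing
    return hmac.compare_digest(expected, signature_header)

secret = b"shared-webhook-secret"
body = b'{"event": "canary_fired", "severity": "HIGH"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()

assert verify_signature(secret, body, sig)
assert not verify_signature(secret, body, "deadbeef")
```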

More details: docs/alert-channels.md.

CLI (top 10 commands)

canari --db canari.db seed --n 5 --types api_key,email,credit_card
canari --db canari.db token-stats
canari --db canari.db alerts --limit 20
canari --db canari.db alerts --severity critical
canari --db canari.db incidents --limit 20
canari --db canari.db incident-report inc-conv-123-456
canari --db canari.db scan-text --text "leak sk_test_CANARI_x"
canari --db canari.db forensic-summary
canari --db canari.db rotate-canaries --n 5
canari --db canari.db serve-dashboard --host 127.0.0.1 --port 8080

Advanced features

  • Full CLI: docs/cli-reference.md
  • Enterprise controls: docs/enterprise.md
  • Threat intel: docs/threat-intelligence.md
  • Integration deep dive: docs/integration-guide.md
  • Token generation details: docs/token-types.md
  • Show HN launch draft: docs/show-hn.md

Maintainer

Maintained by Christopher Holmes Silva.

Feedback is welcome from developers building LLM apps.

Download files


Source Distribution

canari_llm-0.1.2.tar.gz (68.6 kB)


Built Distribution


canari_llm-0.1.2-py3-none-any.whl (52.5 kB)


File details

Details for the file canari_llm-0.1.2.tar.gz.

File metadata

  • Download URL: canari_llm-0.1.2.tar.gz
  • Upload date:
  • Size: 68.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for canari_llm-0.1.2.tar.gz
Algorithm Hash digest
SHA256 17fae5585a615e800c4abf7202749daa824eb9490dd9455685a257a2a1b4c7fa
MD5 6a3a76b968b54ec33a1b67904e91f56b
BLAKE2b-256 42b1aa269d9191f71fe9c93f80d67f9b8c27bafc9ef37ebf091fe2e3207cc737


File details

Details for the file canari_llm-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: canari_llm-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 52.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for canari_llm-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 2966d1880640dd795fb2366912c864cbb41de20e0a0254d0490285d9178dcc7b
MD5 6d562ee7f488e9da48116be6d79b2dbf
BLAKE2b-256 aaf108715641e3bff3e0d7d6c9f279c33bb1e3617e0c1cb54d1f374c0a82afbe

