Efficient defense against knowledge corruption attacks on RAG systems

RAGDefender

Efficient post-retrieval defense against knowledge corruption attacks on Retrieval-Augmented Generation (RAG) systems. It filters out adversarial passages injected by the PoisonedRAG, GARAG, and Tan et al. attacks before they reach your generator, without retraining or extra LLM calls.

Official artifact for "Rescuing the Unpoisoned: Efficient Defense against Knowledge Corruption Attacks on RAG Systems" — Kim, Lee, Koo (Sungkyunkwan University), ACSAC 2025. DOI: 10.1109/ACSAC67867.2025.00093.

Install

pip install ragdefender

Use

from ragdefender import RAGDefender

defender = RAGDefender(task_type="single_hop")  # or "multi_hop" for HotpotQA-style queries

safe_passages = defender.defend(
    query="What is the capital of France?",
    R=[
        "Paris is the capital of France, on the Seine.",
        "Lyon is the capital of France per 2024 records.",   # adversarial
        "Tourists visit Paris, the capital of France.",
        "The capital of France is Lyon, a major city.",      # adversarial
    ],
)

safe_passages contains the passages that remain after the two stages, Stage 1 (estimating $N_{adv}$, paper §4.1) and Stage 2 (pair-frequency TopK ranking, paper §4.2), have dropped the detected adversarial passages.
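
As a minimal sketch of where this call sits in a pipeline, the helper below passes retrieved passages through any object that exposes the defend(query=..., R=...) method shown above, then formats the survivors as a prompt. The names Defender and build_prompt, the prompt layout, and the assumption that defend returns a sequence of passage strings are illustrative and not part of the package.

```python
from typing import Protocol, Sequence


class Defender(Protocol):
    """Anything exposing the defend(query=..., R=...) call used in this README."""

    def defend(self, query: str, R: Sequence[str]) -> Sequence[str]: ...


def build_prompt(query: str, retrieved: Sequence[str], defender: Defender) -> str:
    """Filter retrieved passages, then format the survivors for a generator."""
    # Assumption: defend() returns the surviving passages as strings.
    safe = list(defender.defend(query=query, R=list(retrieved)))
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(safe, start=1))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In practice the defender argument would be the RAGDefender instance created above; keeping it as a parameter makes the helper easy to test with a stub.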

CLI

ragdefender info
ragdefender defend --query "..." --corpus passages.json --task-type single_hop
ragdefender evaluate --test-data test.json --attack poisonedrag --task-type single_hop
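
For scripted use, the defend subcommand can be driven from Python with subprocess. This sketch uses only the flags listed above; the helper name defend_argv is hypothetical, and nothing is assumed about the command's output format.

```python
import shutil
import subprocess
from typing import List


def defend_argv(query: str, corpus_path: str, task_type: str = "single_hop") -> List[str]:
    """Build the argument vector for `ragdefender defend` (flags as listed above)."""
    return [
        "ragdefender", "defend",
        "--query", query,
        "--corpus", corpus_path,
        "--task-type", task_type,
    ]


if __name__ == "__main__":
    argv = defend_argv("What is the capital of France?", "passages.json")
    if shutil.which(argv[0]) is None:
        print("ragdefender is not on PATH; install it with pip first")
    else:
        # Passing a list (no shell) avoids quoting problems with free-text queries.
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
```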

Migrating from v0.1.1

mode='multihop' → task_type='multi_hop', similarity_model= → embedder=, --attack blind → --attack tan-et-al. Old spellings still work but emit DeprecationWarning. Full rename table: https://github.com/SecAI-Lab/RAGDefender/blob/main/docs/migration-0.1-to-0.2.md.
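
Because old spellings only warn, they are easy to miss. One way to surface them is to escalate DeprecationWarning to an error while exercising your code. The sketch below uses only the standard warnings module; legacy_call and its warning message are hypothetical stand-ins for any call that still uses a v0.1.1 spelling, not part of ragdefender.

```python
import warnings


def legacy_call() -> str:
    """Hypothetical stand-in for code that still passes a v0.1.1 spelling."""
    # The message text is made up for illustration.
    warnings.warn("mode= is deprecated; use task_type=", DeprecationWarning, stacklevel=2)
    return "ok"


def uses_deprecated_api(func) -> bool:
    """Return True if calling func triggers a DeprecationWarning."""
    with warnings.catch_warnings():
        # Turn the warning into an exception inside this block only.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False
```

The same escalation is available without code changes by running the interpreter with python -W error::DeprecationWarning.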

Documentation

Citation

@inproceedings{kim2025ragdefender,
  title     = {Rescuing the Unpoisoned: Efficient Defense against
               Knowledge Corruption Attacks on RAG Systems},
  author    = {Kim, Minseok and Lee, Hankook and Koo, Hyungjoon},
  booktitle = {Annual Computer Security Applications Conference (ACSAC)},
  year      = {2025},
  doi       = {10.1109/ACSAC67867.2025.00093},
}

License

MIT. Intended for research and defensive use only.

Download files

Source Distribution

ragdefender-0.2.0.tar.gz (22.3 kB)

Uploaded Source

Built Distribution

ragdefender-0.2.0-py3-none-any.whl (25.6 kB)

Uploaded Python 3

File details

Details for the file ragdefender-0.2.0.tar.gz.

File metadata

  • Download URL: ragdefender-0.2.0.tar.gz
  • Upload date:
  • Size: 22.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for ragdefender-0.2.0.tar.gz
Algorithm Hash digest
SHA256 e77e387115492b1ad363ab948e832030c501ffcd2dda0698a8958a9cfc86cd88
MD5 1853a8c1a0dd4eb134d6bf60e7594822
BLAKE2b-256 89863a66c8802afcf0b72feb924c033a247f6ccb62503bdc6807f453dc5cb000

File details

Details for the file ragdefender-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: ragdefender-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 25.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for ragdefender-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 37559fdafb05a75261005ce3cca7c1d8f3a2cab4c7dde1f8302dd66f9480877b
MD5 4016dcec1f4f837ef8439872bfb55568
BLAKE2b-256 15cac672a589ed959b41768057395e7d4ceb7de8bb3ca8a704adaad2660a9f72
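
To check a downloaded file against the SHA256 digests listed above, a short standard-library sketch follows. The helper names sha256_of and matches_published are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
PUBLISHED_SHA256 = {
    "ragdefender-0.2.0.tar.gz":
        "e77e387115492b1ad363ab948e832030c501ffcd2dda0698a8958a9cfc86cd88",
    "ragdefender-0.2.0-py3-none-any.whl":
        "37559fdafb05a75261005ce3cca7c1d8f3a2cab4c7dde1f8302dd66f9480877b",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so it is never read into memory all at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path) -> bool:
    """Compare a local file with the digest published for its file name."""
    expected = PUBLISHED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, matches_published(Path("ragdefender-0.2.0.tar.gz")) returns True only when the file name is one of the two listed and its contents hash to the published digest.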
