Red-teaming large language models for train data leakage
Project description
Overview
pandora_llm is a red-teaming library for Large Language Models (LLMs) that assesses their vulnerability to training data leakage.
It provides a unified PyTorch API for evaluating membership inference attacks (MIAs).
Please refer to the documentation for the API reference as well as tutorials on how to use this codebase.
pandora_llm abides by the following core principles:
- Open Access — Ensuring that these tools are open-source for all.
- Reproducible — Committing to providing all necessary code details to ensure replicability.
- Self-Contained — Designing attacks that are self-contained, so that each method can be understood without reading the entire codebase or working through unnecessary levels of abstraction, and so that new code is easy to contribute.
- Model-Agnostic — Supporting any HuggingFace model and dataset, making it easy to apply to any situation.
- Usability — Prioritizing easy-to-use starter scripts and comprehensive documentation so anyone can effectively use pandora_llm regardless of prior background.
We hope that our package helps LLM providers safety-check their models before release and empowers the public to hold them accountable for their use of data.
Installation
From pip:
pip install pandora-llm
From source:
git clone https://github.com/safr-ai-lab/pandora-llm.git
cd pandora-llm
pip install -e .
Quickstart
We maintain a collection of starter scripts in our codebase under experiments/. If you are creating a new attack, we recommend copying a starter script to use as a template.
python experiments/mia/run_loss.py --model_name EleutherAI/pythia-70m-deduped --model_revision step98000 --num_samples 2000 --pack --seed 229
bash scripts/run_mia_baselines_olmo.sh
bash scripts/run_mia_baselines_pile.sh
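The name run_loss.py suggests a loss-based membership inference attack. The idea behind such an attack can be sketched in plain Python as follows. This is a minimal illustration, not pandora_llm's implementation: the function names sequence_loss and auc are hypothetical, and the token log-probabilities are made-up numbers standing in for what a real model would return. The sketch assumes that sequences seen during training tend to have lower loss, so the negated loss is used as the membership score.

```python
from typing import Sequence


def sequence_loss(token_logprobs: Sequence[float]) -> float:
    """Average negative log-likelihood per token for one sequence."""
    return -sum(token_logprobs) / len(token_logprobs)


def auc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Probability that a random member outscores a random non-member.

    Ties count as half a win. This pairwise form is quadratic in the
    number of samples, which is fine for a small illustration.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Made-up per-token log-probabilities (assumption: a real run would
# obtain these from the model under test).
train_logprobs = [[-0.2, -0.1, -0.3], [-0.5, -0.4, -0.6]]
heldout_logprobs = [[-1.2, -0.9, -1.5], [-0.3, -0.4, -0.35]]

# Lower loss suggests membership, so score each sequence by negated loss.
member_scores = [-sequence_loss(lp) for lp in train_logprobs]
nonmember_scores = [-sequence_loss(lp) for lp in heldout_logprobs]

attack_auc = auc(member_scores, nonmember_scores)
print(f"Attack AUC: {attack_auc:.2f}")
```

An AUC near 0.5 would mean the score cannot separate training samples from held-out samples; values closer to 1.0 indicate more leakage under this score.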
Contributing
We welcome contributions! Please submit pull requests on our GitHub repository.
Authors
This library was created by Jeffrey G. Wang, Jason Wang, Marvin Li, and Seth Neel.
File details
Details for the file pandora_llm-0.0.0.tar.gz.
File metadata
- Download URL: pandora_llm-0.0.0.tar.gz
- Upload date:
- Size: 64.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 27678fb3e3bde6a39bba6be40b9c2993e598f1287ce4332f78765b87c949fe3b |
| MD5 | fce1a3ac92aea11c415c16712df8da08 |
| BLAKE2b-256 | 6070fe436d76fca3712ed36b114dcdb530285e4d60d02859b17188dc81743bb8 |
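A downloaded file can be checked against the published SHA256 digest before installing it. The following is a minimal sketch using only the Python standard library; the helper names sha256_of_file and matches_published_digest are hypothetical and not part of pandora_llm.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling matches_published_digest("pandora_llm-0.0.0.tar.gz", "27678fb3e3bde6a39bba6be40b9c2993e598f1287ce4332f78765b87c949fe3b") should return True for an unmodified copy of the source distribution.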
Provenance
The following attestation bundles were made for pandora_llm-0.0.0.tar.gz:
- Publisher: python-publish.yml on safr-ai-lab/pandora_llm
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pandora_llm-0.0.0.tar.gz
- Subject digest: 27678fb3e3bde6a39bba6be40b9c2993e598f1287ce4332f78765b87c949fe3b
- Sigstore transparency entry: 257248810
- Sigstore integration time:
- Permalink: safr-ai-lab/pandora_llm@003ed6c12928e68e58c03a6eea2d15068c249f85
- Branch / Tag: refs/tags/v0.0.0
- Owner: https://github.com/safr-ai-lab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@003ed6c12928e68e58c03a6eea2d15068c249f85
- Trigger Event: release
File details
Details for the file pandora_llm-0.0.0-py3-none-any.whl.
File metadata
- Download URL: pandora_llm-0.0.0-py3-none-any.whl
- Upload date:
- Size: 82.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e0230d498021528f8599dc23b949926c8a633eddee96d12152d0680e14f14810 |
| MD5 | df0b1f21e6c6608bdc1b749e3394c858 |
| BLAKE2b-256 | ae42d7972d9fd7327f39b9fa4bcd3c3d606bee36ce8df8c441468610aac83c02 |
Provenance
The following attestation bundles were made for pandora_llm-0.0.0-py3-none-any.whl:
- Publisher: python-publish.yml on safr-ai-lab/pandora_llm
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pandora_llm-0.0.0-py3-none-any.whl
- Subject digest: e0230d498021528f8599dc23b949926c8a633eddee96d12152d0680e14f14810
- Sigstore transparency entry: 257248813
- Sigstore integration time:
- Permalink: safr-ai-lab/pandora_llm@003ed6c12928e68e58c03a6eea2d15068c249f85
- Branch / Tag: refs/tags/v0.0.0
- Owner: https://github.com/safr-ai-lab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@003ed6c12928e68e58c03a6eea2d15068c249f85
- Trigger Event: release