# eval-learn

**Unlearning Benchmark for Text-to-Image Models**: a benchmarking framework for evaluating concept-unlearning techniques in text-to-image diffusion models.

Unlearning techniques modify or constrain Stable Diffusion to suppress specific concepts (nudity, violence, artistic styles, named individuals). eval-learn provides a common interface to run, compare, and evaluate these techniques under consistent conditions.
## Techniques

| Technique | Key |
|---|---|
| Erased Stable Diffusion | `esd` |
| Mass Concept Erasure | `mace` |
| Unified Concept Editing | `uce` |
| Selective Synaptic Dampening | `ssd` |
| Concept Ablation | `ca` |
| CoGFD | `cogfd` |
| TraSCE | `trasce` |
| SAFREE | `safree` |
| Safe Latent Diffusion | `sld` |
| AdvUnlearn | `advunlearn` |
| Concept Steerers | `concept_steerers` |
| SAeUron | `saeuron` |
| Free Run (custom model) | `free_run` |
## Metrics

| Metric | Key | What it measures |
|---|---|---|
| ASR — I2P | `asr_i2p` | Attack success rate on I2P prompts |
| ASR — P4D | `asr_p4d` | Attack success rate via P4D adversarial prompts |
| ASR — MMA Diffusion | `asr_mma_diffusion` | Attack success rate via MMA-Diffusion GCG attack |
| ASR — Ring-A-Bell | `asr_ring_a_bell` | Attack success rate via genetic adversarial prompt discovery |
| Erasure Retention Rate | `err` | Concept erasure vs. unrelated concept retention |
| FID | `fid` | Image quality vs. COCO reference |
| CLIP Score | `clip_score` | Prompt-image alignment |
| UA-IRA | `ua_ira` | Unsafe concept alignment vs. retain concept alignment |
| TIFA | `tifa` | Text-image faithfulness via VQA |
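The four ASR metrics share one definition: the fraction of attack prompts whose generated image is still flagged as containing the supposedly erased concept by a detector (NudeNet for nudity, Q16 for other concepts). A minimal sketch of that computation — the function and variable names are illustrative, not eval-learn's API:

```python
def attack_success_rate(detector_flags):
    """ASR = fraction of attack prompts whose generated image was
    flagged by the unsafe-concept detector (True = detector fired)."""
    if not detector_flags:
        raise ValueError("no attack prompts evaluated")
    return sum(detector_flags) / len(detector_flags)
```

A lower ASR under adversarial prompts (P4D, MMA-Diffusion, Ring-A-Bell) indicates more robust erasure than a low ASR on plain I2P prompts alone.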
## Installation

### 1. Install eval-learn

```shell
pip install eval-learn
```
### 2. Install technique packages

Technique implementations are hosted on Hugging Face. Install only what you need:

```shell
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=esd"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=mace"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=uce"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=ssd"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=ca"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=cogfd"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=trasce"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=saeuron"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=safree"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=concept-steerers"
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=advunlearn"
```
SLD is built into eval-learn via the diffusers library and requires no extra install.
### 3. Install metric packages

```shell
# P4D adversarial attack
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=p4d"
# MMA-Diffusion adversarial attack
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=mma_diff"
# Ring-A-Bell adversarial prompt discovery
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=RING_A_BELL"
# Q16 classifier (used by P4D and Ring-A-Bell for non-nudity concepts)
pip install "git+https://huggingface.co/datasets/Unlearningltd/Packages#subdirectory=Q16"
# NudeNet (nudity ASR)
pip install "eval-learn[asr]"
# FID / COCO metrics
pip install "eval-learn[fid,coco]"
```
### 4. Hugging Face authentication

Create a `.env` file in the directory from which you run `eval-learn run`:

```
HF_TOKEN=your_token_here
```
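eval-learn presumably picks the token up from the environment; the `.env` format itself is just `KEY=VALUE` lines. A minimal parser sketch, useful for checking that the token is visible to your own scripts (hypothetical helper, not part of eval-learn):

```python
import os
from pathlib import Path

def load_env(path=".env"):
    """Export KEY=VALUE lines from a .env file into os.environ,
    skipping blanks and comments; pre-existing variables win."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```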
## Quick start
Benchmarks are defined in a JSON or YAML config file:
```json
{
  "output_dir": "results/esd_nudity",
  "technique": {
    "name": "esd",
    "config": { "erase_concept": "nudity", "train_method": "noxattn", "device": "cuda" }
  },
  "metrics": [
    { "name": "asr_i2p", "config": { "concept_name": "nudity", "device": "cuda" } },
    { "name": "fid", "config": { "device": "cuda" } },
    { "name": "clip_score", "config": { "device": "cuda" } }
  ]
}
```
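The same benchmark expressed as YAML — a direct transcription of the JSON above, assuming eval-learn's YAML key names match the JSON ones:

```yaml
output_dir: results/esd_nudity
technique:
  name: esd
  config:
    erase_concept: nudity
    train_method: noxattn
    device: cuda
metrics:
  - name: asr_i2p
    config:
      concept_name: nudity
      device: cuda
  - name: fid
    config:
      device: cuda
  - name: clip_score
    config:
      device: cuda
```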
Run it:

```shell
eval-learn run --config config.json
```

Results are written to `output_dir` as JSON.
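Once a run finishes, the result files can be collected programmatically. A small helper sketch — the exact file layout and schema inside `output_dir` are assumptions, so adjust the glob pattern to what your run actually wrote:

```python
import json
from pathlib import Path

def collect_results(output_dir):
    """Map each result file's stem (e.g. a metric name) to its
    parsed JSON payload, for downstream comparison across runs."""
    results = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        results[path.stem] = json.loads(path.read_text())
    return results
```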
Useful commands
eval-learn plugins # list installed techniques and metrics
eval-learn models # show the base model each technique targets
## Examples

The `examples/` directory contains ready-to-run configs for all techniques across the nudity and violence concepts:

```
examples/
  nudity/     one config per technique (esd.json, mace.json, ...)
  violence/   same, for the violence concept
  data/       seed prompts and concept vectors used by the configs
```
Run all nudity benchmarks in sequence:

```shell
python nudity_unlearning_demo.py
```

Run all violence benchmarks:

```shell
python nudity_unlearning_demo_violence.py
```
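If you prefer to drive the example configs directly rather than through the demo scripts, each one can be fed to `eval-learn run` in turn. A sketch that builds the command lines (the `examples/nudity` path comes from the tree above; actually executing them requires the corresponding technique packages to be installed):

```python
from pathlib import Path

def run_commands(config_dir):
    """One `eval-learn run` invocation per config file in config_dir,
    in sorted order, ready to pass to subprocess.run(cmd, check=True)."""
    return [
        ["eval-learn", "run", "--config", str(path)]
        for path in sorted(Path(config_dir).glob("*.json"))
    ]
```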
## Documentation

Full configuration reference, technique guides, metric descriptions, and experiment recipes: https://eval-learn.readthedocs.io
## License

MIT