MarkLLM: An Open-Source Toolkit for LLM Watermarking
Project description
MarkLLM: An Open-Source Toolkit for LLM Watermarking
Contents
Demo | Paper
- Demo: We utilize Google Colab as our platform to fully publicly demonstrate the capabilities of MarkLLM through a Jupyter Notebook.
- Website Demo: We have also developed a website to facilitate interaction. Due to resource limitations, we cannot offer live access to everyone. Instead, we provide a demonstration video.
- Paper๏ผ''MarkLLM: An Open-source toolkit for LLM Watermarking'' by Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King
Updates
- ๐ (2024.07.13) Add ITSEdit watermarking method. Thanks to Yiming Liu for his PR!
- ๐ (2024.07.09) Add more hashing schemes for KGW (skip, min, additive, selfhash). Thanks to Yichen Di for his PR!
- ๐ (2024.07.08) Add top-k filter for watermarking methods in Christ family. Thanks to Kai Shi for his PR!
- ๐ (2024.07.03) Updated Back-Translation Attack. Thanks to Zihan Tang for his PR!
- ๐ (2024.06.19) Updated Random Walk Attack from the impossibility results of strong watermarking paper at ICML, 2024. (Blog). Thanks to Hanlin Zhang for his PR!
- ๐ (2024.05.23) We're thrilled to announce the release of our website demo!
Introduction to MarkLLM
Overview
MarkLLM is an open-source toolkit developed to facilitate the research and application of watermarking technologies within large language models (LLMs). As the use of large language models (LLMs) expands, ensuring the authenticity and origin of machine-generated text becomes critical. MarkLLM simplifies the access, understanding, and assessment of watermarking technologies, making it accessible to both researchers and the broader community.
Key Features of MarkLLM
-
Implementation Framework: MarkLLM provides a unified and extensible platform for the implementation of various LLM watermarking algorithms. It currently supports nine specific algorithms from two prominent families, facilitating the integration and expansion of watermarking techniques.
Framework Design:
Currently Supported Algorithms:
-
Visualization Solutions: The toolkit includes custom visualization tools that enable clear and insightful views into how different watermarking algorithms operate under various scenarios. These visualizations help demystify the algorithms' mechanisms, making them more understandable for users.
-
Evaluation Module: With 12 evaluation tools that cover detectability, robustness, and impact on text quality, MarkLLM stands out in its comprehensive approach to assessing watermarking technologies. It also features customizable automated evaluation pipelines that cater to diverse needs and scenarios, enhancing the toolkit's practical utility.
Tools:
- Success Rate Calculator of Watermark Detection: FundamentalSuccessRateCalculator, DynamicThresholdSuccessRateCalculator
- Text Editor: WordDeletion, SynonymSubstitution, ContextAwareSynonymSubstitution, GPTParaphraser, DipperParaphraser, RandomWalkAttack
- Text Quality Analyzer: PPLCalculator, LogDiversityAnalyzer, BLEUCalculator, PassOrNotJudger, GPTDiscriminator
Pipelines:
- Watermark Detection Pipeline: WatermarkedTextDetectionPipeline, UnwatermarkedTextDetectionPipeline
- Text Quality Pipeline: DirectTextQualityAnalysisPipeline, ReferencedTextQualityAnalysisPipeline, ExternalDiscriminatorTextQualityAnalysisPipeline
Repo contents
Below is the directory structure of the MarkLLM project, which encapsulates its three core functionalities within the watermark/
, visualize/
, and evaluation/
directories. To facilitate user understanding and demonstrate the toolkit's ease of use, we provide a variety of test cases. The test code can be found in the test/
directory.
MarkLLM/
โโโ config/ # Configuration files for various watermark algorithms
โ โโโ EWD.json
โ โโโ EXPEdit.json
โ โโโ EXP.json
โ โโโ KGW.json
โ โโโ ITSEdit.json
โ โโโ SIR.json
โ โโโ SWEET.json
โ โโโ Unigram.json
โ โโโ UPV.json
โ โโโ XSIR.json
โโโ dataset/ # Datasets used in the project
โ โโโ c4/
โ โโโ human_eval/
โ โโโ wmt16_de_en/
โโโ evaluation/ # Evaluation module of MarkLLM, including tools and pipelines
โ โโโ dataset.py # Script for handling dataset operations within evaluations
โ โโโ examples/ # Scripts for automated evaluations using pipelines
โ โ โโโ assess_detectability.py
โ โ โโโ assess_quality.py
โ โ โโโ assess_robustness.py
โ โโโ pipelines/ # Pipelines for structured evaluation processes
โ โ โโโ detection.py
โ โ โโโ quality_analysis.py
โ โโโ tools/ # Evaluation tools
โ โโโ oracle.py
โ โโโ success_rate_calculator.py
โโโ text_editor.py
โ โโโ text_quality_analyzer.py
โโโ exceptions/ # Custom exception definitions for error handling
โ โโโ exceptions.py
โโโ font/ # Fonts needed for visualization purposes
โโโ MarkLLM_demo.ipynb # Jupyter Notebook
โโโ test/ # Test cases and examples for user testing
โ โโโ test_method.py
โ โโโ test_pipeline.py
โ โโโ test_visualize.py
โโโ utils/ # Helper classes and functions supporting various operations
โ โโโ openai_utils.py
โ โโโ transformers_config.py
โ โโโ utils.py
โโโ visualize/ # Visualization Solutions module of MarkLLM
โ โโโ color_scheme.py
โ โโโ data_for_visualization.py
โ โโโ font_settings.py
โ โโโ legend_settings.py
โ โโโ page_layout_settings.py
โ โโโ visualizer.py
โโโ watermark/ # Implementation framework for watermark algorithms
โ โโโ auto_watermark.py # AutoWatermark class
โ โโโ base.py # Base classes and functions for watermarking
โ โโโ ewd/
โ โโโ exp/
โ โโโ exp_edit/
โ โโโ kgw/
โ โโโ its_edit/
โ โโโ sir/
โ โโโ sweet/
โ โโโ unigram/
โ โโโ upv/
โ โโโ xsir/
โโโ README.md # Main project documentation
โโโ requirements.txt # Dependencies required for the project
User Examples
Invoking watermarking algorithms
import torch
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
# Device
device = "cuda" if torch.cuda.is_available() else "cpu"
# Transformers config
transformers_config = TransformersConfig(model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
vocab_size=50272,
device=device,
max_new_tokens=200,
min_length=230,
do_sample=True,
no_repeat_ngram_size=4)
# Load watermark algorithm
myWatermark = AutoWatermark.load('KGW', transformers_config=transformers_config)
# Prompt
prompt = 'Good Morning.'
# Generate and detect
watermarked_text = myWatermark.generate_watermarked_text(prompt)
detect_result = myWatermark.detect_watermark(watermarked_text)
unwatermarked_text = myWatermark.generate_unwatermarked_text(prompt)
detect_result = myWatermark.detect_watermark(unwatermarked_text)
Visualizing mechanisms
Assuming you already have a pair of watermarked_text
and unwatermarked_text
, and you wish to visualize the differences and specifically highlight the watermark within the watermarked text using a watermarking algorithm, you can utilize the visualization tools available in the visualize/
directory.
KGW Family
import torch
from markllm.visualize.font_settings import FontSettings
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from markllm.visualize.visualizer import DiscreteVisualizer
from markllm.visualize.legend_settings import DiscreteLegendSettings
from markllm.visualize.page_layout_settings import PageLayoutSettings
from markllm.visualize.color_scheme import ColorSchemeForDiscreteVisualization
# Load watermark algorithm
device = "cuda" if torch.cuda.is_available() else "cpu"
transformers_config = TransformersConfig(
model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
vocab_size=50272,
device=device,
max_new_tokens=200,
min_length=230,
do_sample=True,
no_repeat_ngram_size=4)
myWatermark = AutoWatermark.load('KGW',transformers_config=transformers_config)
# Get data for visualization
watermarked_data = myWatermark.get_data_for_visualization(watermarked_text)
unwatermarked_data = myWatermark.get_data_for_visualization(unwatermarked_text)
# Init visualizer
visualizer = DiscreteVisualizer(color_scheme=ColorSchemeForDiscreteVisualization(),
font_settings=FontSettings(),
page_layout_settings=PageLayoutSettings(),
legend_settings=DiscreteLegendSettings())
# Visualize
watermarked_img = visualizer.visualize(data=watermarked_data,
show_text=True,
visualize_weight=True,
display_legend=True)
unwatermarked_img = visualizer.visualize(data=unwatermarked_data,
show_text=True,
visualize_weight=True,
display_legend=True)
# Save
watermarked_img.save("KGW_watermarked.png")
unwatermarked_img.save("KGW_unwatermarked.png")
Christ Family
import torch
from markllm.visualize.font_settings import FontSettings
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from markllm.visualize.visualizer import ContinuousVisualizer
from markllm.visualize.legend_settings import ContinuousLegendSettings
from markllm.visualize.page_layout_settings import PageLayoutSettings
from markllm.visualize.color_scheme import ColorSchemeForContinuousVisualization
# Load watermark algorithm
device = "cuda" if torch.cuda.is_available() else "cpu"
transformers_config = TransformersConfig(
model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
vocab_size=50272,
device=device,
max_new_tokens=200,
min_length=230,
do_sample=True,
no_repeat_ngram_size=4)
myWatermark = AutoWatermark.load('EXP',transformers_config=transformers_config)
# Get data for visualization
watermarked_data = myWatermark.get_data_for_visualization(watermarked_text)
unwatermarked_data = myWatermark.get_data_for_visualization(unwatermarked_text)
# Init visualizer
visualizer = ContinuousVisualizer(color_scheme=ColorSchemeForContinuousVisualization(),
font_settings=FontSettings(),
page_layout_settings=PageLayoutSettings(),
legend_settings=ContinuousLegendSettings())
# Visualize
watermarked_img = visualizer.visualize(data=watermarked_data,
show_text=True,
visualize_weight=True,
display_legend=True)
unwatermarked_img = visualizer.visualize(data=unwatermarked_data,
show_text=True,
visualize_weight=True,
display_legend=True)
# Save
watermarked_img.save("EXP_watermarked.png")
unwatermarked_img.save("EXP_unwatermarked.png")
For more examples on how to use the visualization tools, please refer to the test/test_visualize.py
script in the project directory.
Applying evaluation pipelines
Using Watermark Detection Pipelines
import torch
from markllm.evaluation.dataset import C4Dataset
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from markllm.evaluation.tools.text_editor import TruncatePromptTextEditor, WordDeletion
from markllm.evaluation.tools.success_rate_calculator import DynamicThresholdSuccessRateCalculator
from markllm.evaluation.pipelines.detection import WatermarkedTextDetectionPipeline, UnWatermarkedTextDetectionPipeline, DetectionPipelineReturnType
# Load dataset
my_dataset = C4Dataset('dataset/c4/processed_c4.json') # change path
# Device
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Transformers config
transformers_config = TransformersConfig(
model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
vocab_size=50272,
device=device,
max_new_tokens=200,
do_sample=True,
min_length=230,
no_repeat_ngram_size=4)
# Load watermark algorithm
my_watermark = AutoWatermark.load('KGW', transformers_config=transformers_config)
# Init pipelines
pipeline1 = WatermarkedTextDetectionPipeline(
dataset=my_dataset,
text_editor_list=[TruncatePromptTextEditor(), WordDeletion(ratio=0.3)],
show_progress=True,
return_type=DetectionPipelineReturnType.SCORES)
pipeline2 = UnWatermarkedTextDetectionPipeline(dataset=my_dataset,
text_editor_list=[],
show_progress=True,
return_type=DetectionPipelineReturnType.SCORES)
# Evaluate
calculator = DynamicThresholdSuccessRateCalculator(labels=['TPR', 'F1'], rule='best')
print(calculator.calculate(pipeline1.evaluate(my_watermark), pipeline2.evaluate(my_watermark)))
Using Text Quality Analysis Pipeline
import torch
from markllm.evaluation.dataset import C4Dataset
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from markllm.evaluation.tools.text_editor import TruncatePromptTextEditor
from markllm.evaluation.tools.text_quality_analyzer import PPLCalculator
from markllm.evaluation.pipelines.quality_analysis import DirectTextQualityAnalysisPipeline, QualityPipelineReturnType
# Load dataset
my_dataset = C4Dataset('dataset/c4/processed_c4.json') # change path
# Device
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Transformer config
transformers_config = TransformersConfig(
model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device), tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
vocab_size=50272,
device=device,
max_new_tokens=200,
min_length=230,
do_sample=True,
no_repeat_ngram_size=4)
# Load watermark algorithm
my_watermark = AutoWatermark.load('KGW',transformers_config=transformers_config)
# Init pipeline
quality_pipeline = DirectTextQualityAnalysisPipeline(
dataset=my_dataset,
watermarked_text_editor_list=[TruncatePromptTextEditor()],
unwatermarked_text_editor_list=[],
analyzer=PPLCalculator(
model=AutoModelForCausalLM.from_pretrained('..model/llama-7b/', device_map='auto'), tokenizer=LlamaTokenizer.from_pretrained('..model/llama-7b/'),
device=device),
unwatermarked_text_source='natural',
show_progress=True,
return_type=QualityPipelineReturnType.MEAN_SCORES)
# Evaluate
print(quality_pipeline.evaluate(my_watermark))
Citations
@article{pan2024markllm,
title={MarkLLM: An Open-Source Toolkit for LLM Watermarking},
author={Pan, Leyi and Liu, Aiwei and He, Zhiwei and Gao, Zitian and Zhao, Xuandong and Lu, Yijian and Zhou, Binglin and Liu, Shuliang and Hu, Xuming and Wen, Lijie and others},
journal={arXiv preprint arXiv:2405.10051},
year={2024}
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.