MarkLLM: An Open-Source Toolkit for LLM Watermarking

Demo | Paper

  • Demo: A Jupyter Notebook on Google Colab publicly demonstrates the capabilities of MarkLLM.
  • Website Demo: We have also developed a website to facilitate interaction. Due to resource limitations, we cannot offer live access to everyone; instead, we provide a demonstration video.
  • Paper: "MarkLLM: An Open-Source Toolkit for LLM Watermarking" by Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King

Updates

  • 🎉 (2024.07.13) Added the ITSEdit watermarking method. Thanks to Yiming Liu for his PR!
  • 🎉 (2024.07.09) Added more hashing schemes for KGW (skip, min, additive, selfhash). Thanks to Yichen Di for his PR!
  • 🎉 (2024.07.08) Added a top-k filter for watermarking methods in the Christ family. Thanks to Kai Shi for his PR!
  • 🎉 (2024.07.03) Updated the Back-Translation Attack. Thanks to Zihan Tang for his PR!
  • 🎉 (2024.06.19) Updated the Random Walk Attack from the ICML 2024 paper on impossibility results for strong watermarking (Blog). Thanks to Hanlin Zhang for his PR!
  • 🎉 (2024.05.23) We're thrilled to announce the release of our website demo!

Introduction to MarkLLM

Overview

MarkLLM is an open-source toolkit developed to facilitate the research and application of watermarking technologies for large language models (LLMs). As the use of LLMs expands, ensuring the authenticity and origin of machine-generated text becomes critical. MarkLLM simplifies access to, understanding of, and assessment of watermarking technologies, making them accessible to both researchers and the broader community.

Key Features of MarkLLM

Repo contents

Below is the directory structure of the MarkLLM project, which encapsulates its three core functionalities within the watermark/, visualize/, and evaluation/ directories. To facilitate user understanding and demonstrate the toolkit's ease of use, we provide a variety of test cases. The test code can be found in the test/ directory.

MarkLLM/
├── config/                     # Configuration files for various watermark algorithms
│   ├── EWD.json
│   ├── EXPEdit.json
│   ├── EXP.json
│   ├── KGW.json
│   ├── ITSEdit.json
│   ├── SIR.json
│   ├── SWEET.json
│   ├── Unigram.json
│   ├── UPV.json
│   └── XSIR.json
├── dataset/                    # Datasets used in the project
│   ├── c4/
│   ├── human_eval/
│   └── wmt16_de_en/
├── evaluation/                 # Evaluation module of MarkLLM, including tools and pipelines
│   ├── dataset.py              # Script for handling dataset operations within evaluations
│   ├── examples/               # Scripts for automated evaluations using pipelines
│   │   ├── assess_detectability.py
│   │   ├── assess_quality.py
│   │   └── assess_robustness.py
│   ├── pipelines/              # Pipelines for structured evaluation processes
│   │   ├── detection.py
│   │   └── quality_analysis.py
│   └── tools/                  # Evaluation tools
│       ├── oracle.py
│       ├── success_rate_calculator.py
│       ├── text_editor.py
│       └── text_quality_analyzer.py
├── exceptions/                 # Custom exception definitions for error handling
│   └── exceptions.py
├── font/                       # Fonts needed for visualization purposes
├── MarkLLM_demo.ipynb          # Jupyter Notebook
├── test/                       # Test cases and examples for user testing
│   ├── test_method.py
│   ├── test_pipeline.py
│   └── test_visualize.py
├── utils/                      # Helper classes and functions supporting various operations
│   ├── openai_utils.py
│   ├── transformers_config.py
│   └── utils.py
├── visualize/                  # Visualization Solutions module of MarkLLM
│   ├── color_scheme.py
│   ├── data_for_visualization.py
│   ├── font_settings.py
│   ├── legend_settings.py
│   ├── page_layout_settings.py
│   └── visualizer.py
├── watermark/                  # Implementation framework for watermark algorithms
│   ├── auto_watermark.py       # AutoWatermark class
│   ├── base.py                 # Base classes and functions for watermarking
│   ├── ewd/
│   ├── exp/
│   ├── exp_edit/
│   ├── kgw/
│   ├── its_edit/
│   ├── sir/
│   ├── sweet/
│   ├── unigram/
│   ├── upv/
│   └── xsir/
├── README.md                   # Main project documentation
└── requirements.txt            # Dependencies required for the project
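
Since config/ holds one JSON file per watermark algorithm, the file names can serve as a quick inventory of what a checkout supports. The sketch below is only an illustration under that assumption: list_configured_algorithms is a hypothetical helper, not part of MarkLLM, and the demonstration runs against a temporary directory with empty placeholder files rather than the real configuration files.

```python
import json
import tempfile
from pathlib import Path


def list_configured_algorithms(config_dir):
    """Return algorithm names inferred from the *.json file names in config_dir."""
    return sorted(path.stem for path in Path(config_dir).glob("*.json"))


# Demonstration on a throwaway directory that mimics config/.
# The JSON contents are empty placeholders, not real MarkLLM settings.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("KGW", "EXP", "SIR"):
        (Path(tmp) / f"{name}.json").write_text(json.dumps({}))
    names = list_configured_algorithms(tmp)

print(names)  # ['EXP', 'KGW', 'SIR']
```

In a real checkout, pointing the helper at the config/ directory would presumably list the names accepted by AutoWatermark.load, though that mapping is an assumption here.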

User Examples

Invoking watermarking algorithms

import torch
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Transformers config
transformers_config = TransformersConfig(model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
                                         tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
                                         vocab_size=50272,
                                         device=device,
                                         max_new_tokens=200,
                                         min_length=230,
                                         do_sample=True,
                                         no_repeat_ngram_size=4)
  
# Load watermark algorithm
myWatermark = AutoWatermark.load('KGW', transformers_config=transformers_config)

# Prompt
prompt = 'Good Morning.'

# Generate and detect
watermarked_text = myWatermark.generate_watermarked_text(prompt)
detect_result = myWatermark.detect_watermark(watermarked_text)
unwatermarked_text = myWatermark.generate_unwatermarked_text(prompt)
detect_result = myWatermark.detect_watermark(unwatermarked_text)
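
To give a feel for what a detector such as the one used for KGW decides, here is a minimal, self-contained sketch of the green-list z-test from the KGW paper: with T scored tokens, g of them in the green list, and green-list fraction gamma, the statistic is z = (g - gamma*T) / sqrt(T*gamma*(1-gamma)). This is not MarkLLM's implementation; the function names, the boolean green_flags input, and the threshold of 4.0 are assumptions chosen for illustration, and the structure of detect_result is not shown in this document.

```python
import math


def green_fraction_z_score(green_flags, gamma=0.5):
    """One-proportion z-score for the number of green tokens.

    green_flags: iterable of booleans, True where a token fell in the green list.
    gamma: expected fraction of green tokens in unwatermarked text.
    """
    flags = list(green_flags)
    total = len(flags)
    if total == 0:
        raise ValueError("need at least one scored token")
    green = sum(flags)
    return (green - gamma * total) / math.sqrt(total * gamma * (1 - gamma))


def looks_watermarked(green_flags, gamma=0.5, threshold=4.0):
    """Flag text whose z-score exceeds an (illustrative) threshold."""
    return green_fraction_z_score(green_flags, gamma) > threshold


# 90 of 100 tokens green: far more than the 50 expected by chance.
z = green_fraction_z_score([True] * 90 + [False] * 10)
print(z, looks_watermarked([True] * 90 + [False] * 10))  # 8.0 True
```

Unwatermarked text should land near z = 0, which is why running detection on both outputs of the example above is informative.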

Visualizing mechanisms

Assuming you already have a pair of texts, watermarked_text and unwatermarked_text, you can use the visualization tools in the visualize/ directory to compare them and highlight the watermark in the watermarked text under a given watermarking algorithm.

KGW Family

import torch
from markllm.visualize.font_settings import FontSettings
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from markllm.visualize.visualizer import DiscreteVisualizer
from markllm.visualize.legend_settings import DiscreteLegendSettings
from markllm.visualize.page_layout_settings import PageLayoutSettings
from markllm.visualize.color_scheme import ColorSchemeForDiscreteVisualization

# Load watermark algorithm
device = "cuda" if torch.cuda.is_available() else "cpu"
transformers_config = TransformersConfig(
    model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
    tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
    vocab_size=50272,
    device=device,
    max_new_tokens=200,
    min_length=230,
    do_sample=True,
    no_repeat_ngram_size=4)
myWatermark = AutoWatermark.load('KGW', transformers_config=transformers_config)
# Get data for visualization
watermarked_data = myWatermark.get_data_for_visualization(watermarked_text)
unwatermarked_data = myWatermark.get_data_for_visualization(unwatermarked_text)

# Init visualizer
visualizer = DiscreteVisualizer(color_scheme=ColorSchemeForDiscreteVisualization(),
                                font_settings=FontSettings(), 
                                page_layout_settings=PageLayoutSettings(),
                                legend_settings=DiscreteLegendSettings())
# Visualize
watermarked_img = visualizer.visualize(data=watermarked_data, 
                                       show_text=True, 
                                       visualize_weight=True, 
                                       display_legend=True)

unwatermarked_img = visualizer.visualize(data=unwatermarked_data,
                                         show_text=True, 
                                         visualize_weight=True, 
                                         display_legend=True)
# Save
watermarked_img.save("KGW_watermarked.png")
unwatermarked_img.save("KGW_unwatermarked.png")

Christ Family

import torch
from markllm.visualize.font_settings import FontSettings
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from markllm.visualize.visualizer import ContinuousVisualizer
from markllm.visualize.legend_settings import ContinuousLegendSettings
from markllm.visualize.page_layout_settings import PageLayoutSettings
from markllm.visualize.color_scheme import ColorSchemeForContinuousVisualization

# Load watermark algorithm
device = "cuda" if torch.cuda.is_available() else "cpu"
transformers_config = TransformersConfig(
    model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
    tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
    vocab_size=50272,
    device=device,
    max_new_tokens=200,
    min_length=230,
    do_sample=True,
    no_repeat_ngram_size=4)
myWatermark = AutoWatermark.load('EXP', transformers_config=transformers_config)
# Get data for visualization
watermarked_data = myWatermark.get_data_for_visualization(watermarked_text)
unwatermarked_data = myWatermark.get_data_for_visualization(unwatermarked_text)

# Init visualizer
visualizer = ContinuousVisualizer(color_scheme=ColorSchemeForContinuousVisualization(),
                                  font_settings=FontSettings(), 
                                  page_layout_settings=PageLayoutSettings(),
                                  legend_settings=ContinuousLegendSettings())
# Visualize
watermarked_img = visualizer.visualize(data=watermarked_data, 
                                       show_text=True, 
                                       visualize_weight=True, 
                                       display_legend=True)

unwatermarked_img = visualizer.visualize(data=unwatermarked_data,
                                         show_text=True, 
                                         visualize_weight=True, 
                                         display_legend=True)
# Save
watermarked_img.save("EXP_watermarked.png")
unwatermarked_img.save("EXP_unwatermarked.png")

For more examples on how to use the visualization tools, please refer to the test/test_visualize.py script in the project directory.

Applying evaluation pipelines

Using Watermark Detection Pipelines

import torch
from markllm.evaluation.dataset import C4Dataset
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from markllm.evaluation.tools.text_editor import TruncatePromptTextEditor, WordDeletion
from markllm.evaluation.tools.success_rate_calculator import DynamicThresholdSuccessRateCalculator
from markllm.evaluation.pipelines.detection import WatermarkedTextDetectionPipeline, UnWatermarkedTextDetectionPipeline, DetectionPipelineReturnType

# Load dataset
my_dataset = C4Dataset('dataset/c4/processed_c4.json') # change path

# Device
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Transformers config
transformers_config = TransformersConfig(
    model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
    tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
    vocab_size=50272,
    device=device,
    max_new_tokens=200,
    do_sample=True,
    min_length=230,
    no_repeat_ngram_size=4)

# Load watermark algorithm
my_watermark = AutoWatermark.load('KGW', transformers_config=transformers_config)

# Init pipelines
pipeline1 = WatermarkedTextDetectionPipeline(
    dataset=my_dataset, 
    text_editor_list=[TruncatePromptTextEditor(), WordDeletion(ratio=0.3)],
    show_progress=True, 
    return_type=DetectionPipelineReturnType.SCORES) 

pipeline2 = UnWatermarkedTextDetectionPipeline(dataset=my_dataset, 
                                               text_editor_list=[],
                                               show_progress=True,
                                               return_type=DetectionPipelineReturnType.SCORES)

# Evaluate
calculator = DynamicThresholdSuccessRateCalculator(labels=['TPR', 'F1'], rule='best')
print(calculator.calculate(pipeline1.evaluate(my_watermark), pipeline2.evaluate(my_watermark)))
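
The example above passes two lists of detection scores to a calculator configured with labels=['TPR', 'F1'] and rule='best'. One plausible reading, sketched below, is that the calculator searches for the score threshold that maximizes F1 and reports the metrics at that threshold. This is an assumed interpretation, not MarkLLM's code: best_threshold_metrics is a hypothetical function, and the scores are made-up numbers.

```python
def best_threshold_metrics(watermarked_scores, unwatermarked_scores):
    """Pick the threshold maximizing F1; text with score >= threshold is called watermarked."""
    candidates = sorted(set(watermarked_scores) | set(unwatermarked_scores))
    best = None
    for threshold in candidates:
        tp = sum(s >= threshold for s in watermarked_scores)    # true positives
        fn = len(watermarked_scores) - tp                       # false negatives
        fp = sum(s >= threshold for s in unwatermarked_scores)  # false positives
        precision = tp / (tp + fp) if tp + fp else 0.0
        tpr = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
        # Strict comparison keeps the lowest threshold among ties.
        if best is None or f1 > best["F1"]:
            best = {"threshold": threshold, "TPR": tpr, "F1": f1}
    return best


# Made-up scores: the two groups are perfectly separable at a threshold of 4.0.
result = best_threshold_metrics([4.0, 5.0], [1.0, 2.0])
print(result)  # {'threshold': 4.0, 'TPR': 1.0, 'F1': 1.0}
```

Searching only over observed scores is enough here, because the confusion counts change only when the threshold crosses one of them.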

Using Text Quality Analysis Pipeline

import torch
from markllm.evaluation.dataset import C4Dataset
from markllm.watermark.auto_watermark import AutoWatermark
from markllm.utils.transformers_config import TransformersConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaTokenizer
from markllm.evaluation.tools.text_editor import TruncatePromptTextEditor
from markllm.evaluation.tools.text_quality_analyzer import PPLCalculator
from markllm.evaluation.pipelines.quality_analysis import DirectTextQualityAnalysisPipeline, QualityPipelineReturnType

# Load dataset
my_dataset = C4Dataset('dataset/c4/processed_c4.json') # change path

# Device
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Transformers config
transformers_config = TransformersConfig(
    model=AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b').to(device),
    tokenizer=AutoTokenizer.from_pretrained('facebook/opt-1.3b'),
    vocab_size=50272,
    device=device,
    max_new_tokens=200,
    min_length=230,
    do_sample=True,
    no_repeat_ngram_size=4)

# Load watermark algorithm
my_watermark = AutoWatermark.load('KGW',transformers_config=transformers_config)

# Init pipeline
quality_pipeline = DirectTextQualityAnalysisPipeline(
    dataset=my_dataset, 
    watermarked_text_editor_list=[TruncatePromptTextEditor()],
    unwatermarked_text_editor_list=[],                                               
    analyzer=PPLCalculator(
        model=AutoModelForCausalLM.from_pretrained('..model/llama-7b/', device_map='auto'),
        tokenizer=LlamaTokenizer.from_pretrained('..model/llama-7b/'),
        device=device),
    unwatermarked_text_source='natural', 
    show_progress=True, 
    return_type=QualityPipelineReturnType.MEAN_SCORES)

# Evaluate
print(quality_pipeline.evaluate(my_watermark))
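
The analyzer in this pipeline is named PPLCalculator, which suggests it scores text by perplexity under a reference language model. As a reminder of what that number means, the sketch below computes perplexity from per-token log-probabilities: the exponential of the mean negative log-likelihood. It is a generic illustration rather than MarkLLM's implementation; the perplexity function and its list-of-log-probabilities input are assumptions made to keep the example free of model downloads.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood) over natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# If every token had probability 0.25 under the model, perplexity is 1 / 0.25 = 4.
uniform_ppl = perplexity([math.log(0.25)] * 4)
print(round(uniform_ppl, 6))  # 4.0
```

Lower perplexity means the reference model finds the text more predictable, so comparing the mean scores of watermarked and unwatermarked text gives a rough measure of how much quality the watermark costs.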

Citations

@article{pan2024markllm,
  title={MarkLLM: An Open-Source Toolkit for LLM Watermarking},
  author={Pan, Leyi and Liu, Aiwei and He, Zhiwei and Gao, Zitian and Zhao, Xuandong and Lu, Yijian and Zhou, Binglin and Liu, Shuliang and Hu, Xuming and Wen, Lijie and others},
  journal={arXiv preprint arXiv:2405.10051},
  year={2024}
}
