Latent Watermark

A robust latent watermark system for tracking image distribution. This package provides tools for embedding and extracting invisible watermarks in images to track buyer information and prevent unauthorized distribution.

Installation

Install via pip:

pip install latent-watermark

Or install from source:

git clone https://github.com/yourusername/latent-watermark.git
cd latent-watermark
pip install -e .

Quick Start

Embed Watermark

Embed a watermark with buyer information:

# Basic usage
latent_watermark --embed --buyer 'john snow' example/

# With custom output directory
latent_watermark --embed --buyer 'john snow' example/ -o output_example/

# Single file
latent_watermark --embed --buyer 'john snow' image.jpg -o watermarked.jpg

Extract Watermark

Extract watermark information from watermarked files:

# From directory
latent_watermark --extract example/

# From single file
latent_watermark --extract watermarked.jpg

View Configuration

Show current watermark configuration:

latent_watermark --config

Usage Examples

Directory Processing

Process entire directories:

# Embed watermarks in all images in a directory
latent_watermark --embed --buyer 'alice@company.com' photos/

# Extract watermarks from all files in a directory
latent_watermark --extract watermarked_photos/

Complex Buyer Names

Handle special characters and Unicode:

# Unicode names
latent_watermark --embed --buyer '测试用户' images/

# Names with spaces and special characters
latent_watermark --embed --buyer "O'Brien & Co." assets/

# Email addresses
latent_watermark --embed --buyer 'user@example.com' documents/
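
When the tool is driven from a script, shell quoting of names such as O'Brien & Co. is easy to get wrong. The sketch below builds the argument list in Python and lets the standard library do the quoting. The helper name `embed_command` is hypothetical; only flags shown in this README are used, and `shlex.join` needs Python 3.8 or newer.

```python
import shlex


def embed_command(buyer, source, output=None):
    """Build the argv list for an embed run; each item is passed as one argument."""
    argv = ["latent_watermark", "--embed", "--buyer", buyer, source]
    if output is not None:
        argv += ["-o", output]
    return argv


argv = embed_command("O'Brien & Co.", "assets/")
# shlex.join produces a line that is safe to paste into a POSIX shell.
print(shlex.join(argv))
```

Passing the list directly to `subprocess.run(argv, check=True)` avoids shell quoting altogether, because no shell is involved.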

Command Line Interface

Options

  • --embed: Embed watermark into files
  • --extract: Extract watermark from files
  • --config: Show current configuration
  • --buyer: Buyer name for watermark embedding (required with --embed)
  • --author: Author name for the watermark (optional; falls back to the configured default)
  • -o, --output: Output directory/file path
  • input: Input file or directory path
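
To make the relationships between these options concrete, here is a minimal `argparse` sketch that mirrors them. It is an illustration under assumptions, not the package's actual parser: the function names `build_parser` and `parse_cli` are hypothetical, and treating the three modes as mutually exclusive is a guess.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="latent_watermark")
    mode = parser.add_mutually_exclusive_group(required=True)  # assumption: one mode per run
    mode.add_argument("--embed", action="store_true", help="Embed watermark into files")
    mode.add_argument("--extract", action="store_true", help="Extract watermark from files")
    mode.add_argument("--config", action="store_true", help="Show current configuration")
    parser.add_argument("--buyer", help="Buyer name (required with --embed)")
    # --author appears in the custom-author usage example of this README.
    parser.add_argument("--author", help="Author name (optional)")
    parser.add_argument("-o", "--output", help="Output directory/file path")
    parser.add_argument("input", nargs="?", help="Input file or directory path")
    return parser


def parse_cli(argv):
    """Parse argv and enforce the cross-option rules stated in the option list."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.embed and not args.buyer:
        parser.error("--buyer is required with --embed")
    if (args.embed or args.extract) and not args.input:
        parser.error("an input path is required with --embed and --extract")
    return args


args = parse_cli(["--embed", "--buyer", "john snow", "input.jpg", "-o", "watermarked.jpg"])
print(args.buyer, args.input, args.output)
```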

Examples

# Help
latent_watermark --help

# Embed with output specification
latent_watermark --embed --buyer 'john snow' input.jpg -o watermarked.jpg

# Batch processing
latent_watermark --embed --buyer 'batch_user' images/ -o watermarked_images/

# Extraction
latent_watermark --extract watermarked_images/

Development

Setup Development Environment

git clone https://github.com/yourusername/latent-watermark.git
cd latent-watermark
pip install -e ".[dev]"

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=latent_watermark

# Run specific test files
pytest tests/test_config_formatter.py -v

Code Formatting

# Format code
black .
isort .

# Type checking
mypy latent_watermark/

Architecture

Core Components

  • WatermarkEmbedder: Handles watermark embedding with fixed-length encoding
  • WatermarkExtractor: Extracts watermarks using optimized bit length calculation
  • Configuration: YAML-based configuration system with fixed-length settings
  • CLI: Command-line interface for embedding and extraction
  • Validation: Robust input validation and error handling

Simplified Watermark Format

The system uses a streamlined watermark format optimized for essential information:

  • Format: author:buyer:date:hash (4 fields)
  • Date: 8-digit format yyMMddHH (e.g., 25082000 = Aug 20, 2025, 00:00)
  • Hash: Last 4 hex digits of the MD5 hash, for compact verification
  • Author: Optional, falls back to config default
  • Buyer: Mandatory parameter for each watermark
  • Fixed Length: All watermarks padded to consistent length for reliable extraction
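
The format above can be sketched in a few lines of Python. Several details are assumptions, because this README does not specify them: that the MD5 covers the UTF-8 bytes of `author:buyer:date`, that padding is trailing spaces, and the `length=48` used in the example. The function name `build_watermark` is hypothetical.

```python
import hashlib
from datetime import datetime


def build_watermark(author, buyer, when, length):
    """Compose author:buyer:date:hash and pad it to `length` characters.

    Assumptions (not confirmed by the package): the MD5 input is the UTF-8
    encoding of "author:buyer:date", and padding is trailing spaces.
    """
    date = when.strftime("%y%m%d%H")  # yyMMddHH, e.g. 25082000
    body = f"{author}:{buyer}:{date}"
    digest = hashlib.md5(body.encode("utf-8")).hexdigest()
    payload = f"{body}:{digest[-4:]}"  # keep only the last 4 hex digits
    if len(payload) > length:
        raise ValueError(f"payload is {len(payload)} chars, limit is {length}")
    return payload.ljust(length)


mark = build_watermark("john_doe", "alice_smith", datetime(2025, 8, 20, 0), length=48)
print(repr(mark))
```

A four-digit suffix gives only 16 bits of check value, so it can catch accidental corruption of an extracted string but is not a cryptographic guarantee.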

Configuration

Configure author and length via YAML:

watermark:
  author: "your_author_name"  # Default author when not specified
  encoding:
    fixed_length: 32  # Fixed length for all watermarks
    max_total_length: 128  # Maximum total watermark length
  quality:
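    d1: 36  # Larger d1/d2 values give stronger robustness but more distortion in the output image
    d2: 20  # Larger d1/d2 values give stronger robustness but more distortion in the output image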
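
As a rough illustration of how such a file might be checked after loading, the sketch below validates the mapping that a YAML loader (for example `yaml.safe_load` from PyYAML) would return for the configuration above. The class and function names are hypothetical, and the sanity checks are assumptions rather than the package's documented rules, apart from the 1-16 character author limit stated in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatermarkSettings:
    author: str
    fixed_length: int
    max_total_length: int
    d1: int
    d2: int


def settings_from_dict(raw):
    """Build settings from the nested mapping and apply basic sanity checks."""
    wm = raw["watermark"]
    settings = WatermarkSettings(
        author=wm["author"],
        fixed_length=wm["encoding"]["fixed_length"],
        max_total_length=wm["encoding"]["max_total_length"],
        d1=wm["quality"]["d1"],
        d2=wm["quality"]["d2"],
    )
    if not 1 <= len(settings.author) <= 16:
        raise ValueError("author must be 1-16 characters")
    if settings.fixed_length > settings.max_total_length:  # assumed rule
        raise ValueError("fixed_length must not exceed max_total_length")
    if settings.d1 <= 0 or settings.d2 <= 0:  # assumed rule
        raise ValueError("d1 and d2 must be positive")
    return settings


raw = {
    "watermark": {
        "author": "your_author_name",
        "encoding": {"fixed_length": 32, "max_total_length": 128},
        "quality": {"d1": 36, "d2": 20},
    }
}
print(settings_from_dict(raw))
```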

Quality Configuration

Adjust watermark robustness vs image quality trade-offs:

watermark:
  quality:
    d1: 36  # Higher values increase robustness but may increase image distortion
    d2: 20  # Higher values increase robustness but may increase image distortion

  • d1/d2: Control watermark embedding strength
  • Higher values: More robust against attacks, but may cause visible artifacts
  • Lower values: Fewer visible artifacts, but potentially less robust
  • Defaults: d1=36 and d2=20 provide a good balance
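
This README does not say how d1 and d2 are applied internally. One common way a strength parameter produces this trade-off is quantization-based embedding, where a coefficient is snapped to a position inside a cell of width `step`. The sketch below is a generic illustration of that idea, not the package's algorithm; `embed_bit` and `extract_bit` are hypothetical names.

```python
import math


def embed_bit(value, bit, step):
    """Move `value` to the 1/4 (bit 0) or 3/4 (bit 1) point of its quantization cell."""
    return (math.floor(value / step) + 0.25 + 0.5 * bit) * step


def extract_bit(value, step):
    """Read the bit back from the position of `value` inside its cell."""
    return 1 if (value % step) > step / 2 else 0


for step in (8, 36):
    marked = embed_bit(100.0, 1, step)
    noisy = marked + 5.0  # simulated distortion, e.g. from recompression
    print(step, extract_bit(marked, step), extract_bit(noisy, step))
# prints:
# 8 1 0
# 36 1 1
```

In this scheme a bit survives any perturbation smaller than `step / 4`, while embedding can move the coefficient by up to `0.75 * step`. A larger step therefore tolerates more noise at the cost of a larger change to the image.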

Watermark Format Details

Field Structure

  • Author: 1-16 characters (configurable default)
  • Buyer: 1-16 characters (mandatory parameter)
  • Date: 8 digits exactly (yyMMddHH format)
  • Hash: 4 hex digits exactly (last 4 of MD5)

Example Watermarks

  • john_doe:alice_smith:25082000:7a3f
  • default_author:bob_jones:25082000:e4d2
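
The field limits above translate directly into a regular expression. The following sketch parses an extracted string and rejects malformed ones. It assumes that padding is whitespace, that ":" never appears inside the author or buyer field (it is the separator), and that hash digits are lowercase; `parse_watermark` is a hypothetical helper, not part of the package's API.

```python
import re

# Field lengths follow the Field Structure list; excluding ":" from the
# author and buyer fields is an assumption.
WATERMARK_RE = re.compile(
    r"(?P<author>[^:]{1,16}):(?P<buyer>[^:]{1,16}):"
    r"(?P<date>[0-9]{8}):(?P<hash>[0-9a-f]{4})"
)


def parse_watermark(text):
    """Return the four fields as a dict, or raise ValueError if malformed."""
    match = WATERMARK_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"malformed watermark: {text!r}")
    return match.groupdict()


fields = parse_watermark("john_doe:alice_smith:25082000:7a3f")
print(fields["buyer"], fields["date"])
```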

Usage Examples

# Basic usage
latent_watermark --embed --buyer "alice_smith" images/

# With custom author
latent_watermark --embed --buyer "alice_smith" --author "john_doe" images/

# Extract watermark
latent_watermark --extract watermarked/

Error Handling

The system provides comprehensive error handling:

  • Invalid buyer names: Rejected with clear error messages
  • Malformed watermarks: Detected during extraction
  • Missing files: Handled gracefully with helpful messages
  • Configuration errors: Validated on startup

Security Features

  • Unique buyer tracking: Each watermark contains buyer-specific information
  • Tamper detection: Watermark integrity validation
  • Format validation: Strict format checking prevents injection attacks
  • Unicode support: Handles international buyer names safely

License

MIT License - see LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Run the test suite
  6. Submit a pull request

Support

For issues and questions:
