
Automated signature placement for synthetic data generation - designed for creating ML training datasets


SignLib - Signature Placement for Synthetic Data Generation


SignLib is a Python library for automatically placing signatures on documents. It is designed for generating synthetic training data for AI and machine learning models. The library processes signature images, removes their backgrounds, and positions them in the whitest available space within a configurable search area of the document.

Use Cases

  • Generate synthetic signed documents for training machine learning models
  • Create diverse training datasets with varied signature positions and styles
  • Automate document processing pipelines for testing and development
  • Batch process large document collections with consistent signature placement
  • Augment existing datasets with signature variations

Features

  • Automatic background removal from signature images
  • Intelligent positioning to find optimal white space in documents
  • Color adaptation: auto-detect document text color or specify custom colors
  • Automatic scaling based on document dimensions
  • Optional rotation for natural appearance and variation
  • Support for multiple formats: PDF, TIFF, PNG, JPEG
  • Customizable position control with bottom_percent and right_percent parameters
  • High-quality image processing with contrast enhancement
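
The "intelligent positioning" feature is described as finding white space in the document. The sketch below illustrates one way such a search could work; it is not SignLib's actual implementation. It assumes the page is available as a grid of grayscale values (0 = black, 255 = white), and the function name `find_whitest_window` is hypothetical.

```python
from typing import List, Tuple


def find_whitest_window(
    gray: List[List[int]], win_w: int, win_h: int
) -> Tuple[int, int]:
    """Return (x, y) of the top-left corner of the brightest win_w x win_h window."""
    height, width = len(gray), len(gray[0])
    if win_w > width or win_h > height:
        raise ValueError("window is larger than the image")

    # Summed-area table: sat[y][x] holds the sum of all pixels above and left of (x, y).
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum

    best_sum, best_xy = -1, (0, 0)
    for y in range(height - win_h + 1):
        for x in range(width - win_w + 1):
            # Sum of the window in constant time via four table lookups.
            total = (
                sat[y + win_h][x + win_w]
                - sat[y][x + win_w]
                - sat[y + win_h][x]
                + sat[y][x]
            )
            if total > best_sum:
                best_sum, best_xy = total, (x, y)
    return best_xy


# Tiny 4x4 "page": the only fully white 2x2 block is in the top-right corner.
page = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 0, 0, 0],
    [255, 255, 255, 0],
]
print(find_whitest_window(page, 2, 2))  # (2, 0)
```

The summed-area table keeps each window sum at constant cost, which matters when the window is the size of a scaled signature and the page is a full-resolution scan.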

Installation

pip install signlib

Quick Start

Basic Usage

from signlib import create_sign

# Simplest usage - auto-detect signature color
create_sign('document.pdf', 'signature.png')

# Specify output path
create_sign('document.pdf', 'signature.png', output_path='signed_document.pdf')

Custom Color

from signlib import create_sign

# Blue signature
create_sign('document.pdf', 'signature.png', signature_color=(0, 0, 255))

# Black signature
create_sign('document.pdf', 'signature.png', signature_color=(0, 0, 0))

# Dark gray signature
create_sign('document.pdf', 'signature.png', signature_color=(50, 50, 50))
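
When signature_color is left unset, the Notes state that the library picks the most common dark color in the document. The sketch below shows one plausible way to do that; it is an illustration, not SignLib's code. The helper name `most_common_dark_color` and the brightness cutoff of 100 are assumptions.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

RGB = Tuple[int, int, int]


def most_common_dark_color(
    pixels: Iterable[RGB], max_brightness: int = 100
) -> Optional[RGB]:
    """Return the most frequent color whose mean channel value is <= max_brightness."""
    dark = Counter(p for p in pixels if sum(p) / 3 <= max_brightness)
    if not dark:
        return None  # No dark pixels found, e.g. a blank page
    return dark.most_common(1)[0][0]


# Mostly white page with some dark-gray text and a little dark-blue ink.
sample = [(255, 255, 255)] * 10 + [(0, 0, 100)] * 3 + [(20, 20, 20)] * 5
print(most_common_dark_color(sample))  # (20, 20, 20)
```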

Position Control (New Feature)

from signlib import create_sign

# Search in bottom 40% and right 40% of document
create_sign(
    'document.pdf',
    'signature.png',
    bottom_percent=40,  # Search from bottom 40% upward
    right_percent=40    # Search from right 40% leftward
)

# Widen the search area toward the left side of the page
create_sign(
    'document.pdf',
    'signature.png',
    bottom_percent=30,  # Bottom 30%
    right_percent=70    # Rightmost 70% of the width (reaches the left-center area)
)

Advanced Usage

from signlib import create_sign

# Full control over all parameters
create_sign(
    document_path='document.tif',
    sign_path='signature.png',
    output_path='signed.tif',
    signature_color=(0, 0, 100),  # Dark blue
    scale_factor=0.15,             # 15% of document width
    rotation_angle=5.0,            # 5 degrees clockwise
    bottom_percent=25,             # Bottom 25%
    right_percent=50               # Right 50%
)

Batch Processing for Synthetic Training Data

from signlib import create_sign
from pathlib import Path

# Create synthetic training data
doc_folder = Path('documents/')
signature_folder = Path('signatures/')
output_folder = Path('synthetic_data/')
output_folder.mkdir(exist_ok=True)

# Generate diverse signed documents
for doc_file in doc_folder.glob('*.pdf'):
    for sig_file in signature_folder.glob('*.png'):
        output_name = f"{doc_file.stem}_{sig_file.stem}_signed.pdf"
        output_path = output_folder / output_name
        
        create_sign(
            str(doc_file),
            str(sig_file),
            output_path=str(output_path),
            scale_factor=0.12,         # Vary these for diversity
            rotation_angle=0.0,
            bottom_percent=25,
            right_percent=50
        )
        print(f"Generated: {output_name}")

Class-Based Usage (Advanced)

from signlib import SignatureProcessor

processor = SignatureProcessor()

# Same parameters as the create_sign() function, called on a processor instance
result = processor.create_sign(
    document_path='document.pdf',
    sign_path='signature.png',
    signature_color=None,      # Auto-detect
    scale_factor=0.12,
    rotation_angle=0.0,
    enhance_contrast=True,
    bottom_percent=25,
    right_percent=50
)

print(f"Signed document: {result}")

API Reference

create_sign() Function

Parameter       | Type  | Default  | Description
----------------|-------|----------|--------------------------------------------------
document_path   | str   | Required | Path to document file
sign_path       | str   | Required | Path to signature file
output_path     | str   | None     | Output file path (auto-generated if None)
signature_color | tuple | None     | RGB color (r, g, b). None for auto-detect
scale_factor    | float | 0.12     | Signature size ratio (0.12 = 12% of document width)
rotation_angle  | float | 0.0      | Rotation angle in degrees
bottom_percent  | float | 25       | Search area from bottom (25 = bottom 25%)
right_percent   | float | 50       | Search area from right (50 = right 50%)
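
The scale_factor description above (0.12 = 12% of document width) can be turned into a small sizing calculation. This is a sketch under the assumption that the signature's aspect ratio is preserved when it is resized; the helper name `scaled_signature_size` is hypothetical and not part of SignLib.

```python
from typing import Tuple


def scaled_signature_size(
    doc_width: int, sig_size: Tuple[int, int], scale_factor: float = 0.12
) -> Tuple[int, int]:
    """Return the (width, height) of a signature scaled to a fraction of the document width."""
    sig_w, sig_h = sig_size
    if sig_w <= 0 or sig_h <= 0:
        raise ValueError("signature dimensions must be positive")
    target_w = max(1, round(doc_width * scale_factor))
    # Assumption: height follows the width so the signature is not distorted.
    target_h = max(1, round(sig_h * target_w / sig_w))
    return target_w, target_h


# A 400x100 px signature on a 2000 px wide page at the default scale factor.
print(scaled_signature_size(2000, (400, 100)))  # (240, 60)
```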

Position Control

  • bottom_percent: Height of the search area, measured upward from the bottom edge, as a percentage of the document height
    • 25 = Search in bottom 25% of document (default)
    • 40 = Search in bottom 40% (more flexible)
    • 50 = Search in bottom 50% (entire lower half)
  • right_percent: Width of the search area, measured leftward from the right edge, as a percentage of the document width
    • 50 = Search in right 50% of document (default, typical signature position)
    • 40 = Search in rightmost 40% (narrower, keeps the signature further right)
    • 70 = Search in right 70% (includes left-center area)
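
To make the two percentages concrete, the sketch below converts them into a pixel rectangle. It assumes a top-left origin and a (left, top, right, bottom) box convention; the function name `search_region` is hypothetical, and SignLib may compute its search area differently.

```python
from typing import Tuple


def search_region(
    doc_width: int,
    doc_height: int,
    bottom_percent: float = 25,
    right_percent: float = 50,
) -> Tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) pixel box covered by the two percentages."""
    if not (0 < bottom_percent <= 100 and 0 < right_percent <= 100):
        raise ValueError("percentages must be in the range (0, 100]")
    left = round(doc_width * (100 - right_percent) / 100)
    top = round(doc_height * (100 - bottom_percent) / 100)
    return left, top, doc_width, doc_height


# Defaults on a 1000x2000 px page: right half, bottom quarter.
print(search_region(1000, 2000))  # (500, 1500, 1000, 2000)

# right_percent=70 pushes the left edge of the box past the page center.
print(search_region(1000, 2000, bottom_percent=30, right_percent=70))  # (300, 1400, 1000, 2000)
```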

Designed for Synthetic Data Generation

SignLib is designed for generating synthetic training data:

  • Consistent Quality: Generate thousands of signed documents with consistent quality
  • Variation Control: Easily control position, size, rotation, and color for data diversity
  • Batch Processing: Process large datasets efficiently
  • Reproducible: The same parameters produce the same results, which keeps experiments repeatable
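
Reproducibility also depends on how the varied parameters are chosen. A minimal sketch, assuming parameters are drawn at random: seeding a dedicated random.Random instance makes the same parameter sets come out on every run. The helper name `sample_parameters` is hypothetical, and the value ranges are illustrative.

```python
import random
from typing import Dict, List


def sample_parameters(n: int, seed: int = 0) -> List[Dict[str, float]]:
    """Draw n parameter sets; the same seed always yields the same list."""
    rng = random.Random(seed)  # Local generator, independent of the global random state
    return [
        {
            "scale_factor": rng.uniform(0.10, 0.15),
            "rotation_angle": rng.uniform(-10, 10),
            "bottom_percent": rng.randint(20, 35),
            "right_percent": rng.randint(40, 60),
        }
        for _ in range(n)
    ]


params = sample_parameters(3, seed=42)
# Each dict can be unpacked into a call such as create_sign(doc, sig, **p).
print(params == sample_parameters(3, seed=42))  # True
```

Storing the seed alongside the generated dataset is enough to regenerate the exact same parameter list later.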

Notes

  • Signature files should be in PNG format (for transparent background support)
  • Supported document formats: PDF, TIFF, PNG, JPEG
  • When signature_color=None, the library auto-detects the most common dark color from the document
  • Signatures are typically placed in the bottom-right area within the whitest available space
  • Background is automatically removed and contrast is enhanced
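
The background removal mentioned above can be pictured as a brightness threshold: light pixels become transparent and darker ink pixels stay opaque. The sketch below is an assumption about the general technique, not SignLib's implementation; the function name `remove_light_background` and the threshold of 200 are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def remove_light_background(pixels: List[RGB], threshold: int = 200) -> List[RGBA]:
    """Add an alpha channel: pixels at or above the brightness threshold become transparent."""
    result = []
    for r, g, b in pixels:
        brightness = (r + g + b) // 3
        alpha = 0 if brightness >= threshold else 255
        result.append((r, g, b, alpha))
    return result


# One white paper pixel and one dark ink pixel.
print(remove_light_background([(255, 255, 255), (10, 10, 10)]))
# [(255, 255, 255, 0), (10, 10, 10, 255)]
```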

Requirements

  • Python 3.6+
  • Pillow >= 9.0.0
  • NumPy >= 1.20.0

License

MIT License - Free to use in commercial and open-source projects.

Author

Cagri Gungor (@cagrigungor)

Specialized in synthetic data generation for machine learning applications.

Contributing

Contributions are welcome. Please feel free to submit a Pull Request.

Issues

Found a bug or have a feature request? Please open an issue on GitHub.

Example: Generate Training Dataset

import random
from signlib import create_sign
from pathlib import Path

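# Generate diverse training dataset
documents = list(Path('documents').glob('*.pdf'))
signatures = list(Path('signatures').glob('*.png'))

# Make sure the output folder exists before writing samples
Path('training_data').mkdir(parents=True, exist_ok=True)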

for i in range(1000):  # Generate 1000 synthetic samples
    doc = random.choice(documents)
    sig = random.choice(signatures)
    
    # Vary parameters for diversity
    create_sign(
        str(doc),
        str(sig),
        output_path=f'training_data/sample_{i:04d}.pdf',
        scale_factor=random.uniform(0.10, 0.15),
        rotation_angle=random.uniform(-10, 10),
        bottom_percent=random.randint(20, 35),
        right_percent=random.randint(40, 60)
    )

