snap-html

A robust, modern and high-performance Python library for generating images from HTML content. Built on top of Playwright for reliability and speed.

📋 Overview

snap-html is a Python library that allows you to generate images from:

  • HTML strings
  • HTML files
  • URLs

Key Features

  • Modern & Complete: Async-ready, sync support, fully typed
  • 🚀 High Performance: Built-in batch generator for processing multiple pages
  • 🔧 Easy Setup: Built on Playwright with simplified browser installation
  • 📐 Precise Output: Support for both pixel and physical dimensions (cm, inches)
  • 🖼️ Flexible Display: Multiple object-fit options (contain, cover, fill)
  • 🖨️ Print-Ready: Combined PrintMediaResolution for precise document sizing
  • 💪 Developer Friendly: Sensible defaults with optional fine-tuning
  • ⏱️ Precise Timing: Custom RENDER_COMPLETE signal for perfect screenshot timing

🧑‍💻 Development

Modern Tooling

This project uses modern Python development tools:

  • Mise: Runtime management and task orchestration
  • uv: Fast Python package management
  • Ruff: Modern code formatting and linting

Setup Development Environment

  1. Clone and setup:

    git clone https://github.com/codustry/snap-html.git
    cd snap-html
    
  2. With Mise (recommended):

    # Install Mise if you don't have it
    curl https://mise.run | sh
    
    # Setup project
    mise install
    mise run install
    
  3. With uv and venv:

    # Create and activate virtual environment
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    
    # Install uv if you don't have it
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
    # Install dependencies
    uv sync
    # With dev dependencies
    uv sync --dev
    

Development Workflow

# Common development tasks with mise
mise run test        # Run tests
mise run format      # Format code
mise run check       # Run style checks
mise run check-safety # Run safety checks
mise run lint        # Run all checks
mise run build       # Build package
mise run publish     # Publish to PyPI

# Virtual environment management
mise run venv        # Create venv
mise run sync        # Install dependencies
mise run sync-dev    # Install dev dependencies

Project Structure

snap-html/
├── .mise.toml            # Mise configuration and tasks
├── .mise/tasks/          # Custom task scripts
├── pyproject.toml        # Python project configuration
├── snap_html/            # Source code
└── tests/                # Test suite

📥 Installation

With uv (recommended):

uv add snap-html

# Install playwright browsers
python -m playwright install

With pip:

pip install -U snap-html
python -m playwright install

System dependencies (Linux only):

# Option 1: Install all required dependencies
sudo playwright install-deps

# Option 2: Manual installation
sudo apt-get install libwoff1 libevent-2.1-7t64 libgstreamer-plugins-base1.0-0 \
  libgstreamer-gl1.0-0 libgstreamer-plugins-bad1.0-0 libenchant-2-2 libsecret-1-0 \
  libhyphen0 libmanette-0.2-0

📸 Usage

Basic Usage

from snap_html import generate_image_sync
from pathlib import Path

# Capture from URL
screenshot = generate_image_sync(
    "https://www.example.com",
    resolution={"width": 1920, "height": 1080},
    output_file=Path("screenshot.png")
)

# Capture with physical dimensions (in centimeters)
screenshot = generate_image_sync(
    "https://www.example.com",
    resolution={
        "cm_width": 21,      # A4 width
        "cm_height": 29.7,   # A4 height
        "dpi": 300          # Print quality DPI
    },
    output_file=Path("high_quality.png")
)

Resolution Options

The library supports multiple ways to specify output resolution:

  1. Pixel Dimensions:
resolution = {
    "width": 1920,    # Width in pixels
    "height": 1080    # Height in pixels
}
  2. Physical Dimensions:
resolution = {
    "cm_width": 21,    # Width in centimeters
    "cm_height": 29.7, # Height in centimeters
    "dpi": 300        # Dots per inch (optional, defaults to 300)
}
  3. Combined Print Media Resolution:
resolution = {
    # Screen dimensions (viewport)
    "width": 1920,     # Width in pixels
    "height": 1080,    # Height in pixels
    
    # Physical dimensions (print)
    "cm_width": 21,    # Width in centimeters
    "cm_height": 29.7, # Height in centimeters
    "dpi": 300,        # Dots per inch (optional, defaults to 300)
    
    # How content should fit within viewport (optional)
    "object_fit": "contain"  # Options: "contain", "cover", "fill", "none"
}
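
To get a feel for what the physical dimensions mean in pixels, the usual conversion is centimeters divided by 2.54, multiplied by the DPI. The sketch below is a standalone illustration of that arithmetic, not snap-html's own code; the helper name `cm_to_pixels` and the rounding to the nearest whole pixel are assumptions.

```python
from typing import Tuple

CM_PER_INCH = 2.54


def cm_to_pixels(cm_width: float, cm_height: float, dpi: int = 300) -> Tuple[int, int]:
    """Convert a physical size in centimeters to whole pixels at the given DPI.

    Hypothetical helper: rounding to the nearest pixel is an assumption,
    the library may round differently.
    """
    width_px = round(cm_width / CM_PER_INCH * dpi)
    height_px = round(cm_height / CM_PER_INCH * dpi)
    return width_px, height_px


# A4 paper (21 cm x 29.7 cm) at 300 DPI
a4_px = cm_to_pixels(21, 29.7, dpi=300)
print(a4_px)  # (2480, 3508)
```

An A4 page at 300 DPI is therefore a fairly large image, which is worth keeping in mind when batch-generating print-quality output.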

Object Fit Options

When using combined dimensions, you can control how content fits:

  • "contain" (default): Scale to fit while maintaining aspect ratio
  • "cover": Scale to fill while maintaining aspect ratio (may crop)
  • "fill": Stretch to fill (may distort proportions)
  • "none": No scaling applied

# In resolution dictionary
screenshot = generate_image_sync(
    "https://www.example.com",
    resolution={
        "width": 1920, "height": 1080,
        "cm_width": 21, "cm_height": 29.7,
        "dpi": 300,
        "object_fit": "cover"
    }
)

# Or as a separate parameter
screenshot = generate_image_sync(
    "https://www.example.com",
    resolution={
        "width": 1920, "height": 1080,
        "cm_width": 21, "cm_height": 29.7,
        "dpi": 300
    },
    object_fit="contain"
)
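
The four options read like the CSS `object-fit` values of the same names. As a rough sketch of what each one implies, the function below computes horizontal and vertical scale factors for fitting content of one size into a box of another. It assumes snap-html follows the CSS semantics; `fit_scale` is a hypothetical helper and is not part of the library.

```python
from typing import Tuple


def fit_scale(
    content: Tuple[float, float],
    box: Tuple[float, float],
    object_fit: str = "contain",
) -> Tuple[float, float]:
    """Return (scale_x, scale_y) for fitting `content` into `box`.

    Hypothetical helper that mirrors CSS object-fit semantics.
    """
    content_w, content_h = content
    box_w, box_h = box
    scale_x = box_w / content_w
    scale_y = box_h / content_h

    if object_fit == "contain":
        # Largest uniform scale at which the whole content still fits.
        uniform = min(scale_x, scale_y)
        return uniform, uniform
    if object_fit == "cover":
        # Smallest uniform scale that fills the box; overflow is cropped.
        uniform = max(scale_x, scale_y)
        return uniform, uniform
    if object_fit == "fill":
        # Independent scaling per axis; the aspect ratio may change.
        return scale_x, scale_y
    if object_fit == "none":
        return 1.0, 1.0
    raise ValueError(f"unknown object_fit: {object_fit!r}")


# 200x100 content placed into a 100x100 box
print(fit_scale((200, 100), (100, 100), "contain"))  # (0.5, 0.5)
print(fit_scale((200, 100), (100, 100), "cover"))    # (1.0, 1.0)
print(fit_scale((200, 100), (100, 100), "fill"))     # (0.5, 1.0)
```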

Batch Processing

For better performance when capturing multiple screenshots:

from snap_html import generate_image_batch_sync

screenshots = generate_image_batch_sync(
    targets=["https://example1.com", "https://example2.com"],
    resolution={"width": 1920, "height": 1080},
    output_files=["screenshot1.png", "screenshot2.png"]
)
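
In the call above, `targets` and `output_files` appear to be paired by position, so with longer URL lists it can help to derive the file names rather than type them by hand. The helper below is a minimal sketch of that idea; `output_paths_for` and its naming scheme are hypothetical and not part of snap-html.

```python
from pathlib import Path
from typing import List
from urllib.parse import urlparse


def output_paths_for(targets: List[str], out_dir: Path) -> List[str]:
    """Build one PNG file name per target URL, in the same order as `targets`.

    Hypothetical helper: an index prefix keeps names unique even when two
    URLs share a host.
    """
    paths = []
    for index, target in enumerate(targets):
        # Fall back to a generic name for targets without a host part.
        host = urlparse(target).netloc.replace(":", "_") or "page"
        paths.append(str(out_dir / f"{index:03d}_{host}.png"))
    return paths


targets = ["https://example1.com", "https://example2.com"]
output_files = output_paths_for(targets, Path("shots"))

# The lists can then be passed to the documented batch call:
# from snap_html import generate_image_batch_sync
# generate_image_batch_sync(
#     targets=targets,
#     resolution={"width": 1920, "height": 1080},
#     output_files=output_files,
# )
```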

Advanced Examples

from snap_html import generate_image_sync

# With all optional parameters using defaults
screenshot = generate_image_sync("https://www.example.com")

# Print media resolution with object-fit
screenshot = generate_image_sync(
    "https://www.example.com",
    resolution={
        "width": 1920, "height": 1080,
        "cm_width": 21, "cm_height": 29.7,
        "dpi": 300,
        "object_fit": "contain"
    },
    scale_factor=1.5,  # Increase zoom level for sharper text
    render_timeout=10.0  # Wait up to 10 seconds for RENDER_COMPLETE signal
)

# Using object_fit as a separate parameter
screenshot = generate_image_sync(
    "https://www.example.com",
    resolution={"width": 1920, "height": 1080},
    object_fit="cover",  # Content will cover the entire viewport
    scale_factor=1.5
)

Render Complete Signal

For pages with dynamic content or JavaScript animations, you can control exactly when the screenshot is taken using the RENDER_COMPLETE signal:

from snap_html import generate_image_sync

# Specify a longer timeout for waiting for the RENDER_COMPLETE signal
screenshot = generate_image_sync(
    "https://www.example.com",
    render_timeout=15.0  # Wait up to 15 seconds for RENDER_COMPLETE signal
)

In your HTML/JavaScript, add a console log message to signal when rendering is complete:

<script>
  // After your page is fully rendered and ready for screenshot
  window.addEventListener('load', function() {
    // Do any final rendering tasks
    setTimeout(function() {
      console.log('RENDER_COMPLETE');
    }, 500); // Add a small delay if needed
  });
</script>

This is particularly useful for:

  • Pages with dynamic content loading
  • JavaScript animations or transitions
  • Asynchronous data fetching
  • Custom rendering logic

If the RENDER_COMPLETE signal is not received within the specified timeout, snap-html will fall back to using the "networkidle" state to determine when to take the screenshot.
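
The wait-then-fall-back behaviour described above can be modelled with the standard library alone. The sketch below is not snap-html's implementation: it stands in an `asyncio.Queue` for the browser's console output, and the names `wait_for_render_signal` and `demo` are hypothetical. It only illustrates the control flow of waiting for the signal up to `render_timeout` seconds and otherwise taking the fallback path.

```python
import asyncio

RENDER_SIGNAL = "RENDER_COMPLETE"


async def wait_for_render_signal(console_messages: asyncio.Queue, render_timeout: float) -> str:
    """Return "signal" if RENDER_COMPLETE arrives in time, otherwise "fallback"."""

    async def until_signal() -> None:
        # Consume console messages, ignoring anything that is not the signal.
        while await console_messages.get() != RENDER_SIGNAL:
            pass

    try:
        await asyncio.wait_for(until_signal(), timeout=render_timeout)
        return "signal"
    except asyncio.TimeoutError:
        # This is where a real implementation would wait for "networkidle".
        return "fallback"


async def demo() -> list:
    # A page that logs unrelated output first, then the signal.
    ready_page = asyncio.Queue()
    ready_page.put_nowait("debug: fonts loaded")
    ready_page.put_nowait(RENDER_SIGNAL)

    # A page that never logs the signal.
    silent_page = asyncio.Queue()

    return [
        await wait_for_render_signal(ready_page, render_timeout=1.0),
        await wait_for_render_signal(silent_page, render_timeout=0.05),
    ]


results = asyncio.run(demo())
print(results)  # ['signal', 'fallback']
```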

🖥️ CLI Usage

Basic commands:

# Basic usage with pixel dimensions
snap-html https://example.com -o screenshot.png --width 1920 --height 1080

# Using physical dimensions (e.g., A4 paper size)
snap-html input.html --cm-width 21.0 --cm-height 29.7 --dpi 300 -o output.png

# Capture with custom scale factor
snap-html https://example.com -o screenshot.png --width 1024 --height 768 --scale 2.0

# Using both pixel and physical dimensions with object-fit
snap-html https://example.com -o screenshot.png --width 1920 --height 1080 --cm-width 21.0 --cm-height 29.7 --object-fit cover

# Wait for RENDER_COMPLETE signal with custom timeout
snap-html https://example.com -o screenshot.png --timeout 15.0

Available Options:

  -o, --output PATH         Output image file path
  -w, --width INTEGER       Output width in pixels
  -h, --height INTEGER      Output height in pixels
  --cm-width FLOAT          Output width in centimeters
  --cm-height FLOAT         Output height in centimeters
  --dpi INTEGER             DPI for cm-based resolution [default: 300]
  --scale FLOAT             Browser scale factor (zoom level) [default: 1.5]
  --timeout FLOAT           Time to wait for RENDER_COMPLETE signal (in seconds)
  --object-fit TEXT         How content fits viewport: contain, cover, fill, none
  --help                    Show this message and exit.
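
When the CLI is driven from a Python script, building the argument list from the options above avoids shell-quoting mistakes. The sketch below uses only the flags listed in this section; `build_snap_html_command` is a hypothetical helper, and the `subprocess` call is left commented out because it needs snap-html and the Playwright browsers installed.

```python
import shlex
from typing import List, Optional


def build_snap_html_command(
    target: str,
    output: str,
    width: Optional[int] = None,
    height: Optional[int] = None,
    cm_width: Optional[float] = None,
    cm_height: Optional[float] = None,
    dpi: Optional[int] = None,
    scale: Optional[float] = None,
    timeout: Optional[float] = None,
    object_fit: Optional[str] = None,
) -> List[str]:
    """Assemble a snap-html argument list, skipping options left as None."""
    command = ["snap-html", target, "-o", output]
    optional_flags = [
        ("--width", width),
        ("--height", height),
        ("--cm-width", cm_width),
        ("--cm-height", cm_height),
        ("--dpi", dpi),
        ("--scale", scale),
        ("--timeout", timeout),
        ("--object-fit", object_fit),
    ]
    for flag, value in optional_flags:
        if value is not None:
            command += [flag, str(value)]
    return command


command = build_snap_html_command(
    "https://example.com", "screenshot.png",
    width=1920, height=1080, object_fit="cover",
)
print(shlex.join(command))

# To actually run it:
# import subprocess
# subprocess.run(command, check=True)
```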

📦 Releases & Versioning

We follow the Semantic Versioning specification. You can see all releases on the GitHub Releases page.

🛡️ License

This project is licensed under the terms of the MIT license. See LICENSE for more details.

🔄 Alternatives

  1. html2image

📃 Citation

@misc{snap-html,
  author = {codustry},
  title = {A robust, modern and high performance Python library for generating images from HTML},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/codustry/snap-html}}
}
