Generic screenshot capture utilities (models, services, CLI helpers).

infra-screenshot

infra-screenshot provides reusable models, services, and CLI helpers for capturing website screenshots. It exposes the core abstractions (options, plans, results, and the service/backend interfaces) that capture backends build on.

Features

🎭 Multiple backends: Support for both Playwright and Selenium
📸 Flexible capture: Single screenshots or batch processing
🔧 Configurable viewports: Desktop, mobile, and custom viewport sizes
💾 Storage abstractions: Local filesystem or cloud storage backends
🚀 Async/await support: Modern async architecture for better performance
🛠️ CLI tools: Ready-to-use command-line interface
🔄 Retry logic: Built-in retry with exponential backoff for reliability
🎨 Visual cleanup: Auto-hide overlays, disable animations for cleaner screenshots

Installation

Using uv (Recommended)

# Install with Playwright backend (includes bundled Chromium)
uv pip install "infra-screenshot[playwright]"
uv run playwright install chromium

# OR install with Selenium backend (requires system Chrome)
uv pip install "infra-screenshot[selenium]"

Using pip

# Install with Playwright backend
pip install "infra-screenshot[playwright]"
playwright install chromium

# OR install with Selenium backend
pip install "infra-screenshot[selenium]"

Quick Verification

# Check installation
screenshot local -h

# Test capture
screenshot local --urls https://example.com --output-dir ./test-screenshots

Usage

CLI: Local Screenshot Capture

The CLI provides a local subcommand that captures screenshots on the machine running the command.

Basic Examples

# Capture a single URL
screenshot local --urls https://example.com --output-dir ./screenshots

# Capture multiple URLs (repeat the --urls flag for each URL)
screenshot local \
  --urls https://example.com \
  --urls https://github.com \
  --output-dir ./screenshots

# For many URLs, use a JSONL input file (recommended)
screenshot local --input urls.jsonl --output-dir ./screenshots

# Capture with custom settings
screenshot local \
  --urls http://localhost:3000 \
  --output-dir ./screenshots \
  --viewports desktop mobile \
  --depth 0 \
  --scroll false \
  --allow-autoplay true

Input File Format (JSONL)

For batch processing, create a file with one JSON object per line:

{"url": "https://example.com", "job_id": "example"}
{"url": "https://github.com", "job_id": "github"}
{"url": "https://docs.python.org", "job_id": "python-docs"}

Then run:

screenshot local --input urls.jsonl --output-dir ./screenshots
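
If the URL list comes from another program, the input file can be generated rather than written by hand. The following is a minimal sketch using only the standard library. The helper name write_jobs_jsonl and the job-ID scheme (derived from the hostname) are illustrative assumptions; only the url and job_id keys shown above are written.

```python
import json
from pathlib import Path
from urllib.parse import urlparse


def write_jobs_jsonl(urls: list[str], path: Path) -> int:
    """Write one {"url", "job_id"} object per line; return the number of lines."""
    lines = []
    for url in urls:
        # Hypothetical job-ID scheme: hostname with dots replaced by dashes.
        host = urlparse(url).hostname or "job"
        lines.append(json.dumps({"url": url, "job_id": host.replace(".", "-")}))
    # JSONL means exactly one JSON document per line, each newline-terminated.
    path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)
```

Calling write_jobs_jsonl(["https://example.com"], Path("urls.jsonl")) would produce a file that can be passed to --input as shown above.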

Common Options

Option	Description	Default
`--viewports`	Viewport presets (desktop, mobile, tablet)	`desktop`
`--depth`	Link depth to follow (0 = single page only)	`1`
`--scroll`	Enable scrolling before capture	`true`
`--full-page`	Capture entire page height (not just viewport)	`true`
`--timeout-s`	Page load timeout in seconds	`60`
`--post-nav-wait-s`	Wait after navigation (settling time)	`6`
`--pre-capture-wait-s`	Wait before screenshot	`2.5`
`--hide-overlays`	Auto-hide popups/cookie banners	`true`
`--disable-animations`	Disable CSS animations for cleaner shots	`true`
`--allow-autoplay`	Allow media autoplay	`true`
`--mute-media`	Mute audio/video	`true`
`--block-media`	Block video/audio requests	`false`
`--site-concurrency`	Number of sites to capture in parallel	`1`
`--max-pages`	Max pages per site (when following links)	`5`

See all options:

screenshot local -h

Real-World Examples

Capture homepage only (no scrolling, viewport-only):

screenshot local \
  --urls http://localhost:3000 \
  --output-dir ./tmp \
  --depth 0 \
  --scroll false \
  --full-page false

Full-page screenshot with scrolling:

screenshot local \
  --urls http://localhost:3000 \
  --output-dir ./tmp \
  --depth 0 \
  --scroll true \
  --full-page true

Capture multiple viewports:

screenshot local \
  --urls https://example.com \
  --output-dir ./screenshots \
  --viewports desktop mobile tablet
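
When these invocations are driven from a script or CI job, it can help to assemble the argument list in Python. This is a minimal sketch, assuming the screenshot executable is on PATH. build_local_command and run_local are hypothetical helpers, and they only emit flags documented above.

```python
import subprocess
from collections.abc import Sequence


def build_local_command(
    urls: Sequence[str],
    output_dir: str,
    *,
    viewports: Sequence[str] = ("desktop",),
    depth: int = 1,
    scroll: bool = True,
    full_page: bool = True,
) -> list[str]:
    """Assemble the argv for `screenshot local` using only documented flags."""
    cmd = ["screenshot", "local"]
    for url in urls:
        # The CLI expects --urls to be repeated once per URL.
        cmd += ["--urls", url]
    cmd += ["--output-dir", output_dir]
    cmd += ["--viewports", *viewports]
    cmd += ["--depth", str(depth)]
    # Boolean options take explicit true/false values, as in the examples above.
    cmd += ["--scroll", str(scroll).lower()]
    cmd += ["--full-page", str(full_page).lower()]
    return cmd


def run_local(cmd: list[str]) -> int:
    """Run the command and return its exit code (requires the CLI to be installed)."""
    return subprocess.run(cmd, check=False).returncode
```

Passing the list form to subprocess.run avoids shell quoting issues with URLs that contain characters such as & or ?.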

Python API: Programmatic Usage

For integration into your own tooling, call the async runner directly with a configured ScreenshotOptions payload:

from pathlib import Path
import asyncio

from screenshot import ScreenshotOptions, capture_screenshots_async
from screenshot.models import CaptureOptions

async def capture_example() -> None:
    options = ScreenshotOptions(
        capture=CaptureOptions(
            enabled=True,
            viewports=("desktop",),
            depth=0,
            scroll=False,
        )
    )

    result = await capture_screenshots_async(
        "demo-job",
        "https://example.com",
        store_dir=Path("screenshots"),
        partition_date=None,
        options=options,
    )

    if result.succeeded:
        print(f"Captured {result.captured} screenshot(s)")
    else:
        for error in result.errors:
            print(f"Capture failed: {error.message}")

asyncio.run(capture_example())

Architecture Refresh

The models layer is split into focused files to keep each responsibility self-contained:

_models_options.py holds configuration dataclasses (CaptureOptions, BrowserCompatOptions, RunnerOptions) and the from_dict/to_dict helpers used throughout the CLI.
_models_plans.py defines the plan containers (CapturePlan, BrowserPlan, RunnerPlan, SanitizedPlans) that capture runners consume.
_models_results.py provides job/result/resource dataclasses (ScreenshotJob, ScreenshotCaptureResult, ScreenshotBatchResult, etc.), the ErrorCategory enum, and serialization helpers.

ScreenshotService/ScreenshotBackend are generic over a bounded ScreenshotCaptureResult, which lets custom backends return richer subclasses while preserving strong typing. CLI helpers now emit typed ScreenshotJobSpec structures, keeping the input schema in sync with any future service APIs.
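
To illustrate what "generic over a bounded result type" means in practice, here is a small self-contained sketch. All class names and fields below (BaseResult, TimedResult, Backend, Service) are stand-ins invented for this example; they are not the package's actual definitions.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


@dataclass
class BaseResult:
    """Stand-in for ScreenshotCaptureResult (fields are hypothetical)."""

    job_id: str
    captured: int = 0


@dataclass
class TimedResult(BaseResult):
    """A richer subclass that a custom backend might return."""

    elapsed_s: float = 0.0


# The bound restricts the type parameter to BaseResult or its subclasses.
ResultT = TypeVar("ResultT", bound=BaseResult)


class Backend(Generic[ResultT]):
    """Stand-in for a backend that produces one result type."""

    def capture(self, job_id: str) -> ResultT:
        raise NotImplementedError


class TimedBackend(Backend[TimedResult]):
    def capture(self, job_id: str) -> TimedResult:
        # A real backend would drive a browser here.
        return TimedResult(job_id=job_id, captured=1, elapsed_s=0.25)


class Service(Generic[ResultT]):
    """Stand-in for a service that preserves the backend's result type."""

    def __init__(self, backend: Backend[ResultT]) -> None:
        self._backend = backend

    def run(self, job_id: str) -> ResultT:
        return self._backend.capture(job_id)
```

With this shape, a type checker infers that Service(TimedBackend()).run("demo") returns a TimedResult, so the extra elapsed_s field is available without a cast.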

Need to orchestrate multiple jobs, custom storage, or cancellation tokens? See the documentation for configuration reference and migration guides.

Mutability Notes

Options, plans, and resource dataclasses are frozen and slot-backed, which guarantees immutability for the core configuration layers. Classes carrying fields that are meant to be updated (e.g., ScreenshotJob.metadata, ScreenshotResourceResult.entries, ScreenshotBatchResult.results) remain regular dataclasses with collection defaults. When extending the models, mark new mutation points clearly so downstream consumers understand which parts of the payload can change during capture or aggregation.

Browser Setup

Playwright: Bundled Chromium vs System Chrome

By default, Playwright uses its own bundled Chromium (installed via playwright install chromium). This provides:

✅ Reproducibility: Known browser version across environments
✅ No system dependencies: Works in containers/CI without system Chrome
✅ Headless-first design: Optimized for automation

When to use system Chrome (--playwright-executable-path):

🎯 Testing against real Chrome (not Chromium)
🎯 Using Chrome extensions or enterprise policies
🎯 Matching end-user browser versions exactly
🎯 Debugging with Chrome DevTools locally

Trade-offs:

Aspect	Bundled Chromium	System Chrome
Setup	`playwright install chromium`	Install Chrome + ensure compatibility
Version control	Pinned to Playwright release	Depends on system updates
Size	~300MB download	Already on system
Reproducibility	✅ High (version-locked)	⚠️ Lower (varies by system)
Extensions	❌ Not supported	✅ Supported
DevTools	Limited	Full local debugging

Usage example with system Chrome:

screenshot local \
  --urls https://example.com \
  --output-dir ./screenshots \
  --playwright-executable-path /usr/bin/google-chrome-stable

Need a deeper comparison? Check the repository's .dev_docs/playwright_vs_selenium_linux.md for codec/DRM support, driver management, and when to switch to system Chrome.

Finding Chrome path:

# Linux/WSL
which google-chrome-stable    # Usually /usr/bin/google-chrome-stable
which chromium-browser         # Usually /usr/bin/chromium-browser

# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome

# Windows (WSL path)
/mnt/c/Program\ Files/Google/Chrome/Application/chrome.exe

If the path is invalid, the tool logs a warning and falls back to bundled Chromium automatically.
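
The fallback behavior described above can be approximated in your own tooling with a small pre-flight check. This is a sketch under stated assumptions: resolve_executable is a hypothetical helper, and the package's actual validation logic may differ.

```python
import logging
import os
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_executable(path: Optional[str]) -> Optional[str]:
    """Return the path if it points at an executable file, else None.

    None is meant to signal "use the bundled Chromium" to the caller.
    """
    if not path:
        return None
    if os.path.isfile(path) and os.access(path, os.X_OK):
        return path
    logger.warning("Browser executable %r is not usable; using bundled Chromium", path)
    return None
```

Checking the path before invoking the CLI makes a misconfigured CI image fail visibly instead of silently capturing with a different browser.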

Installing System Chrome/Chromium

For Playwright (Optional - only if using system Chrome)

# Google Chrome (stable) - Linux/WSL
wget -O /tmp/chrome.deb https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo apt install -y /tmp/chrome.deb

# OR Chromium (from distro packages)
sudo apt-get update && sudo apt-get install -y chromium-browser fonts-liberation

For Selenium (Required)

Selenium always requires a system browser + matching chromedriver:

# Install Chrome (as above)
# Then install chromedriver
pip install webdriver-manager  # Auto-downloads matching chromedriver

Tools like webdriver-manager automatically download the chromedriver matching your installed Chrome version.

Configuration

Environment Variables

Runtime behavior can be customized via environment variables:

Variable	Description	Default
`SCREENSHOT_SCROLL_STEP_DELAY_MS`	Delay between scroll steps (ms)	`350`
`SCREENSHOT_MAX_SCROLL_STEPS`	Maximum scroll iterations	`40`
`PLAYWRIGHT_CAPTURE_MAX_ATTEMPTS`	Retry attempts for failed captures	`3`
`SCREENSHOT_RETRY_BACKOFF_S`	Initial retry delay (seconds)	`0.5`
`SCREENSHOT_RETRY_MAX_BACKOFF_S`	Maximum retry delay (seconds)	`5.0`

Example:

export SCREENSHOT_SCROLL_STEP_DELAY_MS=200
export PLAYWRIGHT_CAPTURE_MAX_ATTEMPTS=5
screenshot local --urls https://example.com --output-dir ./screenshots
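
To get a feel for how the retry settings interact, the sketch below computes a delay schedule from the documented variables and defaults. The doubling factor is an assumption (the text above only says "exponential backoff"), and backoff_schedule and schedule_from_env are illustrative helpers, not part of the package.

```python
import os


def backoff_schedule(
    attempts: int, initial_s: float, max_s: float, factor: float = 2.0
) -> list[float]:
    """Delays slept between attempts: one fewer than the number of attempts."""
    delays = []
    delay = initial_s
    for _ in range(max(attempts - 1, 0)):
        delays.append(min(delay, max_s))  # never exceed the configured cap
        delay *= factor  # assumed growth factor; the real one may differ
    return delays


def schedule_from_env() -> list[float]:
    """Read the documented variables, falling back to the documented defaults."""
    attempts = int(os.environ.get("PLAYWRIGHT_CAPTURE_MAX_ATTEMPTS", "3"))
    initial_s = float(os.environ.get("SCREENSHOT_RETRY_BACKOFF_S", "0.5"))
    max_s = float(os.environ.get("SCREENSHOT_RETRY_MAX_BACKOFF_S", "5.0"))
    return backoff_schedule(attempts, initial_s, max_s)
```

Under these assumptions, the default settings yield delays of 0.5 s and 1.0 s across three attempts, and the 5.0 s cap only comes into play from the fifth retry onward.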

Logging

infra-screenshot uses Python's standard logging module. Enable diagnostics in your application or CLI runs with:

import logging

logging.basicConfig(level=logging.INFO)
logging.getLogger("screenshot.playwright_runner").setLevel(logging.DEBUG)

Logger namespaces:

Logger	Purpose
`screenshot.playwright_runner`	Playwright capture + upload lifecycle
`screenshot.selenium_runner`	Selenium fallback pipeline
`screenshot.cli`	CLI orchestration and batch processing

Log records include structured extra={...} fields such as job_id, url, and viewport; configure your formatter (JSON or text) to emit those keys for easier filtering. URLs are sanitized before logging to prevent leaking SAS tokens or credentials.
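
One way to surface the extra fields is a small JSON formatter built on the standard logging module. This is a sketch, not the package's own formatter: JsonExtraFormatter and configure_json_logging are hypothetical names, and the key list covers only the fields named above.

```python
import json
import logging

# Keys named above as structured extras; others could be added.
EXTRA_KEYS = ("job_id", "url", "viewport")


class JsonExtraFormatter(logging.Formatter):
    """Render each record as one JSON object, including known extra fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for key in EXTRA_KEYS:
            # Values passed via extra={...} become attributes on the record.
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


def configure_json_logging(level: int = logging.INFO) -> logging.Handler:
    """Attach a stream handler with the JSON formatter to the screenshot logger."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonExtraFormatter())
    logger = logging.getLogger("screenshot")
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler
```

Because the handler is attached to the parent screenshot logger, records from screenshot.playwright_runner, screenshot.selenium_runner, and screenshot.cli all propagate to it.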

OpenTelemetry correlation

When using OpenTelemetry, attach trace/span IDs to screenshot logs so traces and logs stay aligned:

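import asyncio
import logging
from pathlib import Path

from opentelemetry import trace

from screenshot import ScreenshotOptions, capture_screenshots_async
from screenshot.models import CaptureOptions

tracer = trace.get_tracer(__name__)
logger = logging.getLogger("screenshot.playwright_runner")


async def main() -> None:
    url = "https://example.com/products"
    job_id = "otel-demo"

    with tracer.start_as_current_span("screenshot-job") as span:
        ctx = span.get_span_context()
        logger.info(
            "Starting screenshot job",
            extra={
                "job_id": job_id,
                # IDs are integers; hex-encode them the way trace backends display them.
                "trace_id": format(ctx.trace_id, "032x"),
                "span_id": format(ctx.span_id, "016x"),
            },
        )
        # Same options as in the Python API example above.
        options = ScreenshotOptions(
            capture=CaptureOptions(enabled=True, viewports=("desktop",), depth=0, scroll=False)
        )
        await capture_screenshots_async(
            job_id,
            url,
            store_dir=Path("/tmp/screens"),
            partition_date=None,
            options=options,
        )


# `await` is only valid inside a coroutine, so run the job through asyncio.
asyncio.run(main())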

Contributing

We welcome contributions! To get started with development:

Read the contributing guide: CONTRIBUTING.md
Set up your development environment (covered in CONTRIBUTING.md)
Run tests and linters before submitting PRs

For bug reports and feature requests, please open an issue.

License

This project is dual-licensed:

AGPL-3.0 (Open Source)

Free for open-source and non-commercial use under the GNU Affero General Public License v3.0.

Key requirement: If you run this software as a service (SaaS, API, web app), you must make your complete source code available under AGPL-3.0.

Commercial License

A commercial license is available for use without AGPL obligations (proprietary products, SaaS without open-sourcing, etc.).

See LICENSE for full details.

Need help? Check out:

Documentation - Configuration reference and migration guides
Chromium Compatibility Levels - Understanding browser options

Release Readiness Highlights

Running `uv run pytest tests -v --cov=screenshot --cov-report=term-missing` currently reports 84% coverage; the Playwright and Selenium runners have dedicated unit suites that exercise retry logic and helper utilities without hitting real browsers.
GitHub Actions enforces lint + test gates for every PR: mypy, ruff, pytest -m "not e2e", and pre-commit. The publish workflow also runs uv run twine check dist/* before uploading artifacts.
Full end-to-end tests live in tests/test_screenshot_service_e2e.py; run them locally with RUN_E2E=1 after installing Chrome/Selenium extras.
Test isolation is ensured via an autouse fixture in conftest.py that resets browser manager state between tests, preventing environment pollution from tests that modify browser paths or environment variables.

Project details

Maintainers

ms-pj

Release history

Version	Release date
0.3.1	Mar 16, 2026
0.3.0	Mar 16, 2026
0.2.0	Mar 15, 2026
0.1.6	Jan 16, 2026
0.1.5	Jan 8, 2026
0.1.4	Dec 3, 2025
0.1.3 (this version)	Nov 24, 2025
0.1.2	Nov 19, 2025
0.1.1	Nov 18, 2025
0.1.0	Nov 18, 2025

Download files

Source Distribution

infra_screenshot-0.1.3.tar.gz (82.1 kB)

Uploaded Nov 24, 2025 Source

Built Distribution

infra_screenshot-0.1.3-py3-none-any.whl (72.3 kB)

Uploaded Nov 24, 2025 Python 3

File details

Details for the file infra_screenshot-0.1.3.tar.gz.

File metadata

Download URL: infra_screenshot-0.1.3.tar.gz
Upload date: Nov 24, 2025
Size: 82.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for infra_screenshot-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`3f6073637bec832420753586530fa39dead1a2b046aaff8e36432350de1d6140`
MD5	`ed76154728a4a1aeb3ac69d1a864246a`
BLAKE2b-256	`c6925f63acd6f757e673e2704e8cc73e22b05feaf41a0829ff4e6cb8e8630b8f`
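
A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; sha256_of and matches are illustrative helper names, and the file path is whatever location you downloaded the archive to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare against a published digest, ignoring case and surrounding whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Pass the SHA256 value from the table above as expected_hex; a False result means the file should not be installed.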

Provenance

The following attestation bundles were made for infra_screenshot-0.1.3.tar.gz:

Publisher: publish.yml on pj-ms/infra-screenshot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: infra_screenshot-0.1.3.tar.gz
- Subject digest: 3f6073637bec832420753586530fa39dead1a2b046aaff8e36432350de1d6140
- Sigstore transparency entry: 719603275
- Sigstore integration time: Nov 24, 2025
Source repository:
- Permalink: pj-ms/infra-screenshot@8a7b51c56c6ee1d0cf59db5efd2ecb8381e2af66
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/pj-ms
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8a7b51c56c6ee1d0cf59db5efd2ecb8381e2af66
- Trigger Event: push

File details

Details for the file infra_screenshot-0.1.3-py3-none-any.whl.

File metadata

Download URL: infra_screenshot-0.1.3-py3-none-any.whl
Upload date: Nov 24, 2025
Size: 72.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for infra_screenshot-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e3918cde02c04667563d42d633d342658e611cb330d030c9bab7c629c10f8853`
MD5	`a49dcfc0505a928440ecd968c3c147c9`
BLAKE2b-256	`050a444d537c2e45e2bc25c45a87308573c1f8a11ded9280f8515bcc4b7d5092`

Provenance

The following attestation bundles were made for infra_screenshot-0.1.3-py3-none-any.whl:

Publisher: publish.yml on pj-ms/infra-screenshot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: infra_screenshot-0.1.3-py3-none-any.whl
- Subject digest: e3918cde02c04667563d42d633d342658e611cb330d030c9bab7c629c10f8853
- Sigstore transparency entry: 719603286
- Sigstore integration time: Nov 24, 2025
Source repository:
- Permalink: pj-ms/infra-screenshot@8a7b51c56c6ee1d0cf59db5efd2ecb8381e2af66
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/pj-ms
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8a7b51c56c6ee1d0cf59db5efd2ecb8381e2af66
- Trigger Event: push
