A simple library to capture websites using playwright

Project description

Playwright Capture

A simple replacement for Splash, built on Playwright.

Install

pip install playwrightcapture

Note for Ubuntu 26.04 pre-1.61.0

Ubuntu 26.04 is not supported before version 1.61.0, so playwright install fails on it. A quick and dirty fix is to override the host platform:

PLAYWRIGHT_HOST_PLATFORM_OVERRIDE=ubuntu24.04-x64 playwright install
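
If the install step is driven from a Python script rather than a shell, the same override could be applied through the environment of a subprocess. This is only a sketch: the helper names env_with_platform_override and install_browsers are hypothetical, and the only thing taken from the note above is the PLAYWRIGHT_HOST_PLATFORM_OVERRIDE variable and its value.

```python
import os
import subprocess


def env_with_platform_override(platform: str = 'ubuntu24.04-x64') -> dict:
    """Return a copy of the current environment with the override set.

    The current process environment itself is left untouched.
    """
    env = dict(os.environ)
    env['PLAYWRIGHT_HOST_PLATFORM_OVERRIDE'] = platform
    return env


def install_browsers() -> None:
    """Run `playwright install` with the override (hypothetical helper).

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    subprocess.run(['playwright', 'install'],
                   env=env_with_platform_override(), check=True)
```

Calling install_browsers() should behave like the one-liner above, provided the playwright command is on the PATH.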

Usage

A very basic example:

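import asyncio
from playwrightcapture import Capture

async def main(url):
    # The context manager starts the browser and closes it on exit
    async with Capture() as capture:
        await capture.initialize_context()
        return await capture.capture_page(url, max_depth_capture_time=90)

# Placeholder URL, replace it with the page you want to capture
entries = asyncio.run(main('https://www.example.com'))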

entries is a dictionary that contains (if all goes well) the HAR, the screenshot, all the cookies of the session, the URL as it is in the browser at the end of the capture, and the full HTML page as rendered.
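
As an illustration of what can be done with that dictionary, here is a minimal sketch that writes whichever parts are present to a directory. The key names ('har', 'cookies', 'png', 'html', 'last_redirected_url') are assumptions and should be checked against the installed version; save_capture is a hypothetical helper, not part of the library.

```python
import json
from pathlib import Path
from typing import Any


def save_capture(entries: dict[str, Any], out_dir: Path) -> list[str]:
    """Write the capture parts found in `entries` to `out_dir`.

    Key names are assumptions, see the text above. Missing or empty
    keys are skipped, since a capture may not return everything.
    Returns the names of the files that were written.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[str] = []

    # HAR and cookies are assumed to be JSON-serialisable structures
    for key in ('har', 'cookies'):
        if entries.get(key) is not None:
            (out_dir / f'{key}.json').write_text(json.dumps(entries[key], indent=2))
            written.append(f'{key}.json')

    # Screenshot, assumed to be raw PNG bytes
    if entries.get('png'):
        (out_dir / 'screenshot.png').write_bytes(entries['png'])
        written.append('screenshot.png')

    # Rendered HTML, assumed to be a string
    if entries.get('html'):
        (out_dir / 'page.html').write_text(entries['html'], encoding='utf-8')
        written.append('page.html')

    # URL shown in the browser at the end of the capture
    if entries.get('last_redirected_url'):
        (out_dir / 'final_url.txt').write_text(entries['last_redirected_url'])
        written.append('final_url.txt')

    return written
```

With the example above, save_capture(entries, Path('capture_output')) would leave one file per available part in capture_output.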

reCAPTCHA bypass

No black magic: it is just a reimplementation of a well-known technique, as implemented there, and there.

This module will try to bypass reCAPTCHA-protected websites if you install it this way:

pip install playwrightcapture[recaptcha]

This will install requests, pydub and SpeechRecognition. In order to work, pydub requires ffmpeg or libav; see the install guide for more details. SpeechRecognition uses the Google Speech Recognition API to turn the audio file into text (I hope you appreciate the irony).
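
Since the audio step depends on an external binary, it can help to check for one before attempting a capture. The sketch below looks for an ffmpeg or avconv (libav) executable on the PATH. find_audio_backend is a hypothetical helper, and the `which` parameter is only there so the lookup can be swapped out in tests.

```python
import shutil
from typing import Callable, Optional


def find_audio_backend(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first audio backend found, or None.

    ffmpeg is tried first, then avconv (the libav executable).
    """
    for name in ('ffmpeg', 'avconv'):
        path = which(name)
        if path:
            return path
    return None
```

If find_audio_backend() returns None, the audio conversion done by pydub will most likely fail, so installing ffmpeg or libav first is the safer route.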

Download files

Source Distribution

playwrightcapture-1.39.11.tar.gz (31.8 kB)

Uploaded Source

Built Distribution

playwrightcapture-1.39.11-py3-none-any.whl (33.3 kB)

Uploaded Python 3

File details

Details for the file playwrightcapture-1.39.11.tar.gz.

File metadata

  • Download URL: playwrightcapture-1.39.11.tar.gz
  • Size: 31.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for playwrightcapture-1.39.11.tar.gz
Algorithm     Hash digest
SHA256        b72cb08944b8fc908ba683eaee6cb4f1b64c7e064409b529c0c6b14f44c15b9c
MD5           3cca2849596fb04a40f2c261b631b7c2
BLAKE2b-256   9336ecedde22a7a05f935516ca38f99a166f16a582f7b980c8e0fac9704fbc53

Provenance

The following attestation bundles were made for playwrightcapture-1.39.11.tar.gz:

Publisher: release.yml on Lookyloo/PlaywrightCapture

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file playwrightcapture-1.39.11-py3-none-any.whl.

File hashes

Hashes for playwrightcapture-1.39.11-py3-none-any.whl
Algorithm     Hash digest
SHA256        e76fbf7cbf83a791f6ab84879a6a5da85dcccebe1c7c9732174cc3c71508c4d0
MD5           5c23f4b08a914ac5b3ded3da2f974f3f
BLAKE2b-256   9911cde0cca65428e1bc494a204dacf069a18e3f91141adc44e4c0914f81aee4

Provenance

The following attestation bundles were made for playwrightcapture-1.39.11-py3-none-any.whl:

Publisher: release.yml on Lookyloo/PlaywrightCapture
