Headless chrome/chromium automation library (unofficial port of puppeteer)

pyppeteer

Note: this is a continuation of the pyppeteer project

An unofficial continuation of the dev branch of pyppeteer, which is an unofficial Python port of puppeteer, the JavaScript (headless) Chrome/Chromium browser automation library :)

Maintained by changedetection.io

This port is maintained by the good people at changedetection.io, the number one solution for web page change detection and notification.

The original repository seems to be unmaintained.

Installation

pyppeteer requires Python >= 3.8

Install with pip from PyPI:

pip install pyppeteer-ng

Or install the latest version from the GitHub repo:

pip install -U git+https://github.com/dgtlmoon/pyppeteer-ng@dev

Usage

Note: When you run pyppeteer for the first time, it downloads the latest version of Chromium (~150MB) if none is found on your system. If you would rather avoid this, make sure a suitable Chrome binary is installed. One way to do this is to run the pyppeteer-install command before using this library.
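
Another way to avoid the download is to point launch() at a browser that is already installed, using the executablePath option. The following is a minimal sketch, assuming a Chrome or Chromium binary is on your PATH under one of the names listed; find_browser, launch_options, and the candidate names are illustrative helpers, not part of pyppeteer:

```python
import shutil

# Hypothetical list of binary names to look for; adjust for your system.
CANDIDATES = ("chromium", "chromium-browser", "google-chrome", "chrome")


def find_browser(candidates=CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def launch_options(path):
    """Build launch() options; set executablePath only when a binary was found."""
    options = {"headless": True}
    if path:
        options["executablePath"] = path
    return options


async def main():
    # Imported here so the helpers above can be used without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(launch_options(find_browser()))
    await browser.close()
```

Run it with asyncio.run(main()). When no candidate is found, the options contain no executablePath and pyppeteer falls back to its default behavior.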

Full documentation can be found here. Puppeteer's documentation and its troubleshooting guide are also great resources for pyppeteer users.

Examples

Open a web page and take a screenshot:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.run(main())

Evaluate JavaScript on a page:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.run(main())
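
In the examples above, an exception raised by goto() or evaluate() would skip browser.close() and leave the browser process running. The sketch below shows one way to guarantee cleanup with try/finally. with_page is a hypothetical helper, not part of pyppeteer; it takes the launch function as an argument so it makes no assumption about how the browser is started:

```python
import asyncio


async def with_page(launch, url, action):
    """Open url in a new page, await action(page), and always close the browser."""
    browser = await launch()
    try:
        page = await browser.newPage()
        await page.goto(url)
        return await action(page)
    finally:
        # Runs whether action() returned normally or raised.
        await browser.close()


# Example usage (requires pyppeteer and a browser):
#   from pyppeteer import launch
#   title = asyncio.run(with_page(launch, 'https://example.com', lambda page: page.title()))
```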

Differences between puppeteer and pyppeteer

pyppeteer strives to replicate the puppeteer API as closely as possible; however, fundamental differences between JavaScript and Python make this difficult to do precisely. More information on the specifics can be found in the documentation.

Keyword arguments for options

puppeteer uses an object to pass options to functions and methods. pyppeteer functions and methods accept both a dictionary (the Python equivalent of a JavaScript object) and keyword arguments for options.

Dictionary style options (similar to puppeteer):

browser = await launch({'headless': True})

Keyword argument style options (more pythonic, isn't it?):

browser = await launch(headless=True)
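
To illustrate the calling convention, here is a minimal sketch of a function that accepts both styles. It assumes the dictionary and the keyword arguments are merged, with keyword arguments winning on conflict; merge_options is a hypothetical name and this is not pyppeteer's actual implementation:

```python
from typing import Any, Dict, Optional


def merge_options(options: Optional[Dict[str, Any]] = None, **kwargs: Any) -> Dict[str, Any]:
    """Combine dictionary-style and keyword-style options into one dict."""
    merged = dict(options or {})  # copy so the caller's dict is not mutated
    merged.update(kwargs)         # assumption: keyword arguments take precedence
    return merged
```

With this sketch, merge_options({'headless': True}) and merge_options(headless=True) produce the same dictionary.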

Element selector method names

In Python, $ is not a valid character in an identifier. The pyppeteer equivalents of puppeteer's $, $$, and $x methods are listed below, along with shorthand methods for your convenience:

puppeteer    pyppeteer                 pyppeteer shorthand
Page.$()     Page.querySelector()      Page.J()
Page.$$()    Page.querySelectorAll()   Page.JJ()
Page.$x()    Page.xpath()              Page.Jx()
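
As a rough illustration of the methods in the table, the sketch below reads the first heading and all link URLs from a page that has already been loaded. heading_and_links is a hypothetical helper; it assumes querySelector() returns None when nothing matches:

```python
import asyncio


async def heading_and_links(page):
    """Return the first <h1> text (or None) and the href of every <a> on the page."""
    heading = await page.querySelector('h1')    # puppeteer: page.$('h1')
    anchors = await page.querySelectorAll('a')  # puppeteer: page.$$('a')

    title = None
    if heading is not None:
        title = await page.evaluate('(el) => el.textContent', heading)

    # Each element handle is passed to the JavaScript function as its argument.
    hrefs = [await page.evaluate('(el) => el.href', a) for a in anchors]
    return title, hrefs
```

The shorthand forms page.J('h1') and page.JJ('a') can be used in place of the long names.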

Arguments of Page.evaluate() and Page.querySelectorEval()

puppeteer's evaluate() takes a JavaScript function or a string containing a JavaScript expression. pyppeteer takes a string containing either a JavaScript expression or a function. pyppeteer tries to detect automatically whether the string is a function or an expression, but the detection sometimes fails. If an expression is mistakenly treated as a function and an error is raised, set force_expr to True to force pyppeteer to treat the string as an expression.

Examples:

Get a page's textContent:

content = await page.evaluate('document.body.textContent', force_expr=True)

Get an element's textContent:

element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)
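
The two steps of the last example can also be combined with Page.querySelectorEval(), which runs a function against the first element matching a selector. A minimal sketch, assuming the page is already loaded and contains an h1 element; page_summary is a hypothetical helper name:

```python
import asyncio


async def page_summary(page):
    """Return the <h1> text and the length of the body text of a loaded page."""
    # A bare expression: force_expr=True skips the function/expression detection.
    text = await page.evaluate('document.body.textContent', force_expr=True)

    # A function string: the matched element is passed in as the argument.
    title = await page.querySelectorEval('h1', '(element) => element.textContent')

    return {'title': title, 'length': len(text)}
```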

Roadmap

See projects

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
