
Headless chrome/chromium automation library (unofficial port of puppeteer)

Project description

Wrk Fork

This repo has been forked to solve an issue with PantsBuild and a colliding license file. The license file has been renamed to LICENSE_PYPPETEER to avoid the collision.

Attention: This repo is unmaintained and has seen nothing beyond minor changes for a long time. Please consider playwright-python as an alternative.

If you would like to overhaul this code to bring it up to date, please contact me.

pyppeteer


Note: this is a continuation of the pyppeteer project. Before undertaking any development, it is highly recommended that you look at #16, which tracks the ongoing effort to update this library, to avoid duplicating work.

Unofficial Python port of puppeteer, the JavaScript (headless) Chrome/Chromium browser automation library.

Installation

pyppeteer requires Python >= 3.6

Install with pip from PyPI:

pip install wrk-pyppeteer

Or install the latest version from this GitHub repo:

pip install -U git+https://github.com/pyppeteer/pyppeteer@dev

Usage

Note: When you run pyppeteer for the first time, it downloads the latest version of Chromium (~150MB) if it is not found on your system. If you prefer to avoid this behavior, ensure that a suitable Chrome binary is installed. One way to do this is to run the pyppeteer-install command before using this library.
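As an alternative to the automatic download, you can point pyppeteer at a browser that is already installed by passing the executablePath launch option. The following is a minimal sketch, not the library's own lookup logic: the helper names find_chrome and build_launch_options, and the list of candidate executable names, are assumptions made for illustration.

```python
import shutil

# Candidate executable names are assumptions; adjust them for your system.
CANDIDATES = ("chromium", "chromium-browser", "google-chrome", "chrome")


def find_chrome(candidates=CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_launch_options(executable_path=None, headless=True):
    """Build the options dict passed to pyppeteer's launch()."""
    options = {"headless": headless}
    if executable_path:
        # executablePath asks pyppeteer to run this binary
        # instead of its own downloaded Chromium.
        options["executablePath"] = executable_path
    return options


async def main():
    # Imported here so the helpers above work without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(build_launch_options(find_chrome()))
    page = await browser.newPage()
    await page.goto("https://example.com")
    print(await page.title())
    await browser.close()
```

Run main() the same way as the examples in this README, with asyncio.get_event_loop().run_until_complete(main()). If no candidate is found, the options contain no executablePath and pyppeteer falls back to its default behavior.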

Full documentation can be found here. Puppeteer's documentation and its troubleshooting guide are also great resources for pyppeteer users.

Examples

Open a web page and take a screenshot:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Evaluate JavaScript on a page:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Differences between puppeteer and pyppeteer

pyppeteer strives to replicate the puppeteer API as closely as possible; however, fundamental differences between JavaScript and Python make this difficult to do precisely. More information on specifics can be found in the documentation.

Keyword arguments for options

puppeteer uses an object for passing options to functions/methods. pyppeteer methods/functions accept both a dictionary (the Python equivalent of a JavaScript object) and keyword arguments for options.

Dictionary style options (similar to puppeteer):

browser = await launch({'headless': True})

Keyword argument style options (more pythonic, isn't it?):

browser = await launch(headless=True)
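To see why the two styles are interchangeable, here is a minimal sketch of how a function can accept both forms and fold them into one options dictionary. The name merge_options is hypothetical, and the rule that keyword arguments win when both forms set the same key is an assumption of this sketch, not a documented guarantee.

```python
def merge_options(options=None, **kwargs):
    """Combine a dict of options with keyword arguments into one dict."""
    merged = dict(options or {})
    # Assumption of this sketch: keyword arguments win on conflict.
    merged.update(kwargs)
    return merged


dict_style = merge_options({"headless": True})
kwarg_style = merge_options(headless=True)

# Both call styles produce the same options.
print(dict_style == kwarg_style)  # prints True
```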

Element selector method names

In Python, $ is not a valid identifier. The equivalent methods to Puppeteer's $, $$, and $x methods are listed below, along with some shorthand methods for your convenience:

puppeteer     pyppeteer                  pyppeteer shorthand
Page.$()      Page.querySelector()       Page.J()
Page.$$()     Page.querySelectorAll()    Page.JJ()
Page.$x()     Page.xpath()               Page.Jx()
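The methods listed above can be used together as in the following sketch. The function name summarize and the selectors are illustrative assumptions; page is expected to be a pyppeteer Page, for example one returned by browser.newPage().

```python
async def summarize(page):
    """Count elements using pyppeteer's long-form selector methods."""
    heading = await page.querySelector("h1")    # shorthand: page.J("h1")
    links = await page.querySelectorAll("a")    # shorthand: page.JJ("a")
    paragraphs = await page.xpath("//p")        # shorthand: page.Jx("//p")
    return {
        # querySelector() yields None when nothing matches.
        "has_heading": heading is not None,
        "links": len(links),
        "paragraphs": len(paragraphs),
    }
```

Call it from inside a coroutine after page.goto(), for example: stats = await summarize(page).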

Arguments of Page.evaluate() and Page.querySelectorEval()

puppeteer's version of evaluate() takes a JavaScript function or a string representation of a JavaScript expression. pyppeteer takes a string representation of a JavaScript expression or function. pyppeteer tries to detect automatically whether the string is a function or an expression, but the detection sometimes fails. If an expression is erroneously treated as a function and an error is raised, set force_expr to True to force pyppeteer to treat the string as an expression.

Examples:

Get a page's textContent:

content = await page.evaluate('document.body.textContent', force_expr=True)

Get an element's textContent:

element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)
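To illustrate why automatic detection can go wrong, here is a deliberately simple heuristic of the kind such detection might use. This is a hypothetical sketch, not pyppeteer's actual logic, which may differ; the function name looks_like_function is an assumption.

```python
def looks_like_function(source):
    """Hypothetical heuristic: guess whether a JS string is a function."""
    text = source.strip()
    if text.startswith(("function", "async ")):
        return True
    # Arrow functions contain "=>", but so do some plain expressions.
    return "=>" in text


samples = (
    "() => document.title",       # a function
    "document.title",             # an expression
    "[1, 2, 3].map(x => x * 2)",  # an expression that contains an arrow
)
for src in samples:
    print(repr(src), looks_like_function(src))
```

Under this heuristic the third sample is classified as a function even though it is an expression. That is the kind of case where passing force_expr=True to evaluate() helps.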

Roadmap

See projects

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Download files


Source Distribution

wrk_pyppeteer-1.0.2.1.tar.gz (71.6 kB)

Uploaded Source

Built Distribution

wrk_pyppeteer-1.0.2.1-py3-none-any.whl (79.3 kB)

Uploaded Python 3

File details

Details for the file wrk_pyppeteer-1.0.2.1.tar.gz.

File metadata

  • Download URL: wrk_pyppeteer-1.0.2.1.tar.gz
  • Size: 71.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.9.7 Darwin/23.1.0

File hashes

Hashes for wrk_pyppeteer-1.0.2.1.tar.gz
Algorithm Hash digest
SHA256 525c678881b8a2a89b780a4b4e73ea46783510074ec75017c321384c0230e7f4
MD5 a3d94cfaa94f09d9e05a766b929a6028
BLAKE2b-256 d47876cbb3152d367101518f8d5effe8a643ab89eab263b755b2c347475ef155

See more details on using hashes here.

File details

Details for the file wrk_pyppeteer-1.0.2.1-py3-none-any.whl.

File metadata

  • Download URL: wrk_pyppeteer-1.0.2.1-py3-none-any.whl
  • Size: 79.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.9.7 Darwin/23.1.0

File hashes

Hashes for wrk_pyppeteer-1.0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 7cd9aa4ddc13300f56ec60daec4e2146cccf218575f5de63c01bfb5111890e53
MD5 91730cb3414777bb6f8de5b6a1f66f69
BLAKE2b-256 2da69cdc2de29cdfc25640aa51f9a6f2526450f7d4029910fe429aaa0ede2217

