A library to prevent detections caused by Runtime.enable.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

PyppeteerProtect

PyppeteerProtect is an implementation of rebrowser-patches for pyppeteer, with the notable difference that you don't need to modify your pyppeteer installation for it to work. You simply call PyppeteerProtect on a target page and the patches are applied automatically.

At the moment, PyppeteerProtect doesn't provide protection for running in headless mode beyond setting the user agent to remove HeadlessChrome. For that, look for an additional library that you can run on top of PyppeteerProtect, such as pyppeteer_stealth (though that one specifically only makes you more detectable to the major anti-bot solutions).
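To illustrate what "removing HeadlessChrome" from the user agent means, here is a minimal sketch. The helper name `strip_headless_marker` and the sample user agent string are made up for illustration; this is not PyppeteerProtect's actual code.

```python
def strip_headless_marker(user_agent: str) -> str:
    """Return the user agent with the 'HeadlessChrome' product token renamed to 'Chrome'."""
    return user_agent.replace("HeadlessChrome", "Chrome")


# Illustrative user agent only; a real one depends on the installed Chrome build.
headless_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/131.0.0.0 Safari/537.36"
)
cleaned_ua = strip_headless_marker(headless_ua)
```

With pyppeteer, a string like this could be applied through `page.setUserAgent(cleaned_ua)`. This only hides one of many headless signals, which is why the paragraph above recommends additional protection.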

Install

$ pip install PyppeteerProtect

Usage

Import the library:

from PyppeteerProtect import PyppeteerProtect, SetSecureArguments

Set default arguments for the Chrome executable that help you stay protected (adds --disable-blink-features=AutomationControlled and removes --enable-automation):

SetSecureArguments()  # should be called before pyppeteer.launch
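The effect on the argument list can be pictured roughly as follows. This is a sketch of the transformation described above, not the library's implementation; `secure_args` and the sample default list are hypothetical.

```python
from typing import Iterable, List

AUTOMATION_FLAG = "--enable-automation"
STEALTH_FLAG = "--disable-blink-features=AutomationControlled"


def secure_args(default_args: Iterable[str]) -> List[str]:
    """Drop --enable-automation and make sure AutomationControlled is disabled."""
    args = [arg for arg in default_args if arg != AUTOMATION_FLAG]
    if STEALTH_FLAG not in args:
        args.append(STEALTH_FLAG)
    return args


# Hypothetical default argument list, for illustration only.
example_defaults = ["--disable-sync", "--enable-automation", "--no-first-run"]
result = secure_args(example_defaults)
```

Calling the helper twice yields the same list, so it is safe to apply more than once.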

Protect individual pages:

pageProtect = await PyppeteerProtect(page)

Switch between using the main and an isolated execution context:

await pageProtect.useMainWorld()
await pageProtect.useIsolatedWorld()

You can freely switch between the two contexts during an active session. For example, you might want to do something like this:

await pageProtect.useIsolatedWorld()
# document.querySelector might have been hooked in the main world to block queries for #embedded-token
# (.value is read so that a serializable string, not a DOM node, is returned)
token = await page.evaluate("() => document.querySelector('input[type=\"hidden\"]#embedded-token').value")
await pageProtect.useMainWorld()
data = await page.evaluate("(token) => window.get_some_data(token)", token)
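If you switch worlds often, the pattern above can be wrapped in a small context manager. This is only a sketch: `isolated_world` and `FakePageProtect` are hypothetical names, and the only library methods assumed are `useIsolatedWorld` and `useMainWorld` as shown above. The fake object stands in for a real PyppeteerProtect instance so the snippet runs without a browser.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def isolated_world(page_protect):
    """Run the enclosed block in the isolated world, then switch back to the main world."""
    await page_protect.useIsolatedWorld()
    try:
        yield page_protect
    finally:
        await page_protect.useMainWorld()


class FakePageProtect:
    """Stand-in that only records which world is active."""

    def __init__(self):
        self.history = []

    async def useIsolatedWorld(self):
        self.history.append("isolated")

    async def useMainWorld(self):
        self.history.append("main")


async def demo():
    fake = FakePageProtect()
    async with isolated_world(fake):
        pass  # page.evaluate(...) calls that must avoid hooked globals would go here
    return fake.history


history = asyncio.run(demo())
```

Note that this sketch always returns to the main world on exit; if your session normally stays in the isolated world, you would invert the two calls.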

By default, PyppeteerProtect uses the execution context id of an isolated world. This is ideal for maximum security, as you don't have to worry about calling hooked global functions or accidentally leaking your presence through global variables. However, it makes the code of the target page inaccessible.

If you plan on using the main world execution context and nothing else, you can configure the PyppeteerProtect constructor to use it on creation like so:

pageProtect = await PyppeteerProtect(page, True)

If you have a special use case and run into issues with automatically obtaining an execution context id, you can tell PyppeteerProtect to wait until one is obtained. If you stick to basic Page.evaluate calls, you shouldn't need to worry about this, as it gets called automatically.

await pageProtect.waitForExecutionContext()

Example

import asyncio

from pyppeteer import launch
from PyppeteerProtect import PyppeteerProtect, SetSecureArguments

SetSecureArguments()  # set --disable-blink-features=AutomationControlled and remove --enable-automation

loop = asyncio.new_event_loop()

async def main():
    browser = await launch(
        executablePath="C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
        headless=False,  # currently no protection for running headless
        defaultViewport={"width": 1920, "height": 953},
        loop=loop,
    )

    page = (await browser.pages())[0]
    pageProtect = await PyppeteerProtect(page)

    await page.goto("https://www.datadome.co")
    print(await page.evaluate("()=>'Test Output'"))

    await asyncio.sleep(5)  # asyncio.sleep takes seconds, not milliseconds
    await browser.close()

loop.run_until_complete(main())

How does it work?

PyppeteerProtect works by calling Runtime.disable and hooking CDPSession.send to drop any Runtime.enable requests sent by the pyppeteer library. Runtime.enable is used to retrieve an execution context id, which is required for functions such as Page.evaluate and Page.querySelectorAll to work, but in doing so, it enables the scripts running on the target page to observe behavior that would indicate the browser is being controlled by automation software, like pyppeteer/puppeteer.
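The idea of hooking a session's send method to drop Runtime.enable can be sketched as below. This is an illustration of the technique under stated assumptions, not PyppeteerProtect's source: `FakeSession` and `drop_runtime_enable` are hypothetical, and the fake session merely records what it is asked to send so the snippet runs without a browser.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for a CDP session; it only records the methods sent."""

    def __init__(self):
        self.sent = []

    async def send(self, method, params=None):
        self.sent.append(method)
        return {}


def drop_runtime_enable(session):
    """Wrap session.send so that Runtime.enable never reaches the browser."""
    original_send = session.send

    async def patched_send(method, params=None):
        if method == "Runtime.enable":
            return {}  # pretend the call succeeded without forwarding it
        return await original_send(method, params)

    session.send = patched_send
    return session


async def demo():
    session = drop_runtime_enable(FakeSession())
    await session.send("Runtime.enable")
    await session.send("Page.enable")
    return session.sent


sent = asyncio.run(demo())
```

Only Page.enable reaches the fake session; the Runtime.enable request is swallowed by the wrapper.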

PyppeteerProtect retrieves an execution context either by calling out to a binding (created with Runtime.addBinding and Runtime.bindingCalled, and called using Page.addScriptToEvaluateOnNewDocument and Runtime.evaluate in an isolated context), or by creating an isolated world (using Page.createIsolatedWorld).
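The isolated-world route can be sketched in terms of raw CDP messages. Page.createIsolatedWorld returns an executionContextId, which can then be passed as contextId to Runtime.evaluate. The function names and the `FakeClient` with its canned replies are hypothetical, added so the snippet runs offline; a real session would return ids chosen by the browser.

```python
import asyncio


async def isolated_context_id(client, frame_id, world_name="isolated"):
    """Ask the browser for a new isolated world and return its execution context id."""
    result = await client.send(
        "Page.createIsolatedWorld",
        {"frameId": frame_id, "worldName": world_name},
    )
    return result["executionContextId"]


async def evaluate_in_context(client, context_id, expression):
    """Evaluate an expression in a specific execution context via Runtime.evaluate."""
    result = await client.send(
        "Runtime.evaluate",
        {"expression": expression, "contextId": context_id, "returnByValue": True},
    )
    return result["result"].get("value")


class FakeClient:
    """Hypothetical client returning canned CDP-shaped replies."""

    def __init__(self):
        self.calls = []

    async def send(self, method, params=None):
        self.calls.append((method, params))
        if method == "Page.createIsolatedWorld":
            return {"executionContextId": 7}
        return {"result": {"type": "string", "value": "Test Output"}}


async def demo():
    client = FakeClient()
    context_id = await isolated_context_id(client, frame_id="FRAME")
    value = await evaluate_in_context(client, context_id, "'Test Output'")
    return context_id, value, client.calls


context_id, value, calls = asyncio.run(demo())
```

Because the context id comes from the reply to Page.createIsolatedWorld, no Runtime.enable call is needed to learn it.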

These patches are applied automatically on each navigation by listening to the request and response events of the page, and by hooking ExecutionContext.evaluateHandle.


Release history

1.0.1 (Nov 26, 2024)
1.0.0 (Nov 25, 2024)

Download files

Source Distribution

pyppeteerprotect-1.0.1.tar.gz (5.7 kB)

Uploaded Nov 26, 2024 Source

Built Distribution

PyppeteerProtect-1.0.1-py3-none-any.whl (6.2 kB)

Uploaded Nov 26, 2024 Python 3

File details

Details for the file pyppeteerprotect-1.0.1.tar.gz.

File metadata

Download URL: pyppeteerprotect-1.0.1.tar.gz
Upload date: Nov 26, 2024
Size: 5.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.13

File hashes

Hashes for pyppeteerprotect-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`8530eacf71396a42ced6a63bafb9f90fff93fb9b43c4b297f2e58feb086228e3`
MD5	`ecddfb2a6448d6013d87d61813468f30`
BLAKE2b-256	`04cd553ed02ce4c4902a67f806b98e9f53927ed3bade4a1355d4adc14e402e39`


File details

Details for the file PyppeteerProtect-1.0.1-py3-none-any.whl.

File metadata

Download URL: PyppeteerProtect-1.0.1-py3-none-any.whl
Upload date: Nov 26, 2024
Size: 6.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.13

File hashes

Hashes for PyppeteerProtect-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fb9788eab1cedd1d4a0a004510554704302c7daed23e7f57d2a2d39161658402`
MD5	`7bc1fb2c5a1fc9c208942743583eb4c3`
BLAKE2b-256	`a150032204d404f7e2a398c2344f01b9bae4cba89440fab435a8db613c80781f`

