jsEasy (jseasy): a lightweight Python scraping runtime that executes JavaScript against a small DOM without launching a browser.

jsEasy

jsEasy (jseasy) is a lightweight Python runtime for scraping and testing HTML pages that need JavaScript execution, but not a full browser.

It gives Python code a browser-like DOM, executes page scripts with QuickJS, supports common network APIs such as fetch() and XMLHttpRequest, and returns the final DOM for extraction.

requests + BeautifulSoup
< jsEasy
< Playwright / Selenium / real browsers

jsEasy is designed for the middle ground: pages where JavaScript mutates the DOM, loads JSON, runs timers, or executes small modules, but where layout, pixels, GPU APIs, and browser fingerprint parity are unnecessary.

Highlights

  • No browser process: no Chromium download, no WebDriver, no browser startup cost.
  • Python-first API: Page.open(), Page.from_html(), select(), select_all(), eval(), html().
  • Browser-like runtime: DOM, events, timers, storage, history, location, CSSOM, fetch, XHR, module scripts.
  • Scraper-friendly behavior: failed third-party scripts are collected in diagnostics instead of aborting the whole page.
  • Typed package: ships py.typed.
  • PyPI-ready: wheel/sdist build, docs, examples, tests, and release checklist.

Installation

pip install jseasy

Python 3.10+ is supported.

Quick Start

from jseasy import Page

page = Page.from_html("""
<main id="app"></main>
<script>
  const title = document.createElement("h1");
  title.textContent = "Loaded without Chrome";
  document.querySelector("#app").appendChild(title);
</script>
""")

print(page.select("#app h1").text)

Output:

Loaded without Chrome

Loading A Real Page

from jseasy import Page

with Page.open("https://example.com") as page:
    print(page.select("h1").text)
    print(len(page.html()))

Page.open() fetches the URL, parses HTML, loads stylesheets, executes scripts, drains pending work, and gives you the current DOM.
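To make "drains pending work" concrete, here is a minimal sketch of what draining can mean for a script runtime: run queued microtasks and timers until nothing is left, advancing a virtual clock instead of sleeping. This is an illustration only, not jsEasy's internals; `MiniLoop` and its methods are hypothetical names.

```python
import heapq
from collections import deque
from typing import Callable

Task = Callable[[], None]


class MiniLoop:
    """Toy event loop: a microtask queue plus timers on a virtual clock."""

    def __init__(self) -> None:
        self.now = 0.0  # virtual time in milliseconds
        self._microtasks: deque[Task] = deque()
        self._timers: list[tuple[float, int, Task]] = []
        self._seq = 0  # tie-breaker so equal deadlines fire in insertion order

    def queue_microtask(self, fn: Task) -> None:
        self._microtasks.append(fn)

    def set_timeout(self, fn: Task, delay_ms: float) -> None:
        heapq.heappush(self._timers, (self.now + delay_ms, self._seq, fn))
        self._seq += 1

    def drain(self, max_tasks: int = 10_000) -> int:
        """Run tasks until both queues are empty or max_tasks is reached."""
        ran = 0
        while (self._microtasks or self._timers) and ran < max_tasks:
            if self._microtasks:
                # Microtasks always run before the next timer.
                self._microtasks.popleft()()
            else:
                due, _, fn = heapq.heappop(self._timers)
                self.now = max(self.now, due)  # jump the clock, never sleep
                fn()
            ran += 1
        return ran


loop = MiniLoop()
order: list[str] = []
loop.set_timeout(lambda: order.append("timer"), 50)
loop.queue_microtask(lambda: order.append("microtask"))
ran = loop.drain()
print(order)  # ['microtask', 'timer']
```

The `max_tasks` cap is a design choice in this sketch: a page that keeps rescheduling itself would otherwise never finish draining.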

Fetch And XHR

import httpx
from jseasy import Page

def handler(request: httpx.Request) -> httpx.Response:
    return httpx.Response(200, json={"name": "Ada"})

client = httpx.Client(transport=httpx.MockTransport(handler))

page = Page.from_html(
    """
    <div id="name"></div>
    <script>
      fetch("/api")
        .then((response) => response.json())
        .then((data) => {
          document.querySelector("#name").textContent = data.name;
        });
    </script>
    """,
    url="https://example.test",
    client=client,
)

print(page.select("#name").text)

Module Scripts

import httpx
from jseasy import Page

def handler(request: httpx.Request) -> httpx.Response:
    if request.url.path == "/app.js":
        return httpx.Response(
            200,
            text="""
            import { label } from "./labels.js";
            document.querySelector("#app").textContent = label;
            """,
        )
    return httpx.Response(200, text='export const label = "module loaded";')

client = httpx.Client(transport=httpx.MockTransport(handler))

page = Page.from_html(
    '<div id="app"></div><script type="module" src="/app.js"></script>',
    url="https://example.test",
    client=client,
)

print(page.select("#app").text)
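The mocked examples pass url="https://example.test" so that relative references such as /api and ./labels.js have a base to resolve against. Assuming jsEasy follows standard URL resolution (an assumption, not something this README states), the standard library's `urljoin` shows which URLs the mock transport would be asked for:

```python
from urllib.parse import urljoin

PAGE_URL = "https://example.test"

# fetch("/api") is resolved against the page URL.
api_url = urljoin(PAGE_URL, "/api")

# <script type="module" src="/app.js"> is resolved against the page URL too.
app_url = urljoin(PAGE_URL, "/app.js")

# import "./labels.js" inside app.js is resolved against the module's own URL.
labels_url = urljoin(app_url, "./labels.js")

print(api_url)     # https://example.test/api
print(app_url)     # https://example.test/app.js
print(labels_url)  # https://example.test/labels.js
```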

Diagnostics

jsEasy is intentionally tolerant by default. A tracking script, analytics widget, or unsupported browser feature should not necessarily prevent scraping the rest of the DOM.

from jseasy import Page

with Page.open("https://example.com") as page:
    print(page.logs)            # console.log/warn/error output
    print(page.script_errors)   # script exceptions collected during load
    print(page.resource_errors) # stylesheet/resource failures

Set raise_script_errors=True during development when you want the first script failure to raise immediately.

page = Page.from_html(html, raise_script_errors=True)
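The difference between the tolerant default and the strict mode can be sketched in plain Python. The `ScriptRunner` class below is hypothetical and only borrows the names `raise_script_errors` and `script_errors` for readability; it is not jsEasy's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScriptRunner:
    """Runs callables standing in for page scripts."""

    raise_script_errors: bool = False
    script_errors: list[Exception] = field(default_factory=list)

    def run_all(self, scripts: list[Callable[[], None]]) -> int:
        """Return how many scripts completed without raising."""
        completed = 0
        for script in scripts:
            try:
                script()
                completed += 1
            except Exception as exc:
                if self.raise_script_errors:
                    raise  # strict mode: surface the first failure
                self.script_errors.append(exc)  # tolerant mode: record it
        return completed


rendered: list[str] = []


def analytics() -> None:
    raise RuntimeError("tracker unavailable")


def render() -> None:
    rendered.append("<h1>ok</h1>")


tolerant = ScriptRunner()
completed = tolerant.run_all([analytics, render])
print(completed, len(tolerant.script_errors), rendered)

strict = ScriptRunner(raise_script_errors=True)
try:
    strict.run_all([analytics, render])
    strict_raised = False
except RuntimeError:
    strict_raised = True
```

In tolerant mode the failing analytics stand-in does not stop the later render step; in strict mode the same failure propagates immediately.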

Browser API Coverage

jsEasy implements a pragmatic subset of browser APIs:

Area       Supported
---------  ---------
DOM        Document, Element, Node, Text, DocumentFragment
Selection  querySelector, querySelectorAll, matches, closest
Mutation   appendChild, removeChild, insertBefore, innerHTML, textContent, basic MutationObserver
Events     Event, CustomEvent, MouseEvent, KeyboardEvent, addEventListener, dispatchEvent
Runtime    setTimeout, setInterval, requestAnimationFrame, Promise draining, performance.now
Network    fetch, XMLHttpRequest, Request, Response, Headers, navigator.sendBeacon
State      localStorage, sessionStorage, document.cookie, history, location
CSSOM      document.styleSheets, CSSStyleSheet, CSSStyleRule, CSSStyleDeclaration, getComputedStyle
Modules    classic scripts, type="module", simple static local imports
Utilities  atob, btoa, console

See docs/api.md for details.

When To Use jsEasy

Use jsEasy when:

  • you need DOM extraction after simple or moderate JavaScript execution;
  • content is loaded via fetch() or XHR;
  • scripts manipulate the DOM but do not require layout;
  • you want fast startup in CLI jobs, tests, CI, workers, or small containers;
  • Playwright/Selenium feels too heavy for the target page.

Use a real browser when:

  • the site depends on layout metrics, canvas, WebGL, media, Shadow DOM, or complex framework hydration;
  • the target is fingerprint-sensitive;
  • you need browser DevTools, screenshots, or user interaction fidelity;
  • the page presents CAPTCHAs, payment walls, login walls, or other access controls.
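As a rough triage aid, the checklist above can be turned into a string scan over a page's HTML. This is a deliberately crude heuristic sketch: the marker list and the function name `browser_only_signals` are assumptions made for illustration, and the scan only sees inline markup, not externally loaded scripts.

```python
# Substrings that hint at features a layout-free runtime is unlikely to cover.
BROWSER_ONLY_MARKERS = {
    "layout metrics": ("getBoundingClientRect", "offsetWidth", "offsetHeight"),
    "canvas/WebGL": ("<canvas", "getContext("),
    "media": ("<video", "<audio"),
    "Shadow DOM": ("attachShadow", "shadowroot"),
}


def browser_only_signals(html: str) -> list[str]:
    """Return the areas whose markers appear in the HTML (case-insensitive)."""
    lowered = html.lower()
    return [
        area
        for area, markers in BROWSER_ONLY_MARKERS.items()
        if any(marker.lower() in lowered for marker in markers)
    ]


simple = '<div id="app"></div><script>fetch("/api")</script>'
heavy = '<canvas id="c"></canvas><script>el.attachShadow({mode: "open"})</script>'

print(browser_only_signals(simple))  # []
print(browser_only_signals(heavy))   # ['canvas/WebGL', 'Shadow DOM']
```

An empty result does not prove a page will work without a browser; it only means none of these particular markers were found.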

Security And Ethics

jsEasy is intended for legitimate scraping, testing, research, and data extraction where you are allowed to access the content. Respect site terms, robots policies, rate limits, copyright, privacy, and access controls.

Do not use jsEasy to bypass CAPTCHAs, authentication, payment walls, or other security mechanisms. For security challenge pages, use a human-in-the-loop checkpoint.
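One way to respect robots policies and rate limits before loading a page is to consult robots.txt with Python's standard `urllib.robotparser`. The sketch below parses an inline sample policy so it runs offline; the user agent string and URLs are placeholders, and nothing here is part of jsEasy itself.

```python
from urllib.robotparser import RobotFileParser

# Sample policy for illustration; a real job would fetch the site's robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "my-scraper"  # placeholder user agent


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a queryable policy object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


policy = build_policy(ROBOTS_TXT)
allowed = policy.can_fetch(USER_AGENT, "https://example.test/articles/1")
blocked = policy.can_fetch(USER_AGENT, "https://example.test/private/report")
delay = policy.crawl_delay(USER_AGENT)  # seconds to wait between requests

print(allowed, blocked, delay)  # True False 2
```

A scraping job could skip any URL for which `can_fetch` returns False and sleep for `delay` seconds between page loads.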

Examples

See the examples/ directory.

Documentation

Development

python -m venv .venv
. .venv/bin/activate
pip install -e ".[dev]"
pytest
python -m build
twine check dist/*

Status

Alpha. jsEasy is useful today for small to medium JavaScript-enhanced scraping workflows, but the API may change before 1.0.

License

MIT
