
Browser automation for scholarly paper access in the SciTeX ecosystem


scitex-browser


Playwright wrappers for scholarly paper access — popup handling, PDF capture, failure-replay screenshots, stealth browsing.

Full Documentation · uv pip install scitex-browser[all]



Problem and Solution

  1. Problem: Playwright is great but verbose. Every scraping script reinvents popup dismissal, retry logic, and the Chrome PDF viewer workaround.
     Solution: focused wrappers around Playwright: click_with_fallbacks_async([sel1, sel2]), save_as_pdf_async, close_popups_async, inject_visual_effects.
  2. Problem: Tests fail silently with no artifact. pytest-playwright doesn't capture screen + DOM on failure by default.
     Solution: TestMonitor + create_failure_capture_fixture capture a screenshot, the page HTML, and the console log on every failure.
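The fallback idea behind click_with_fallbacks_async can be illustrated with a minimal sketch. This is not the library's implementation: the function name click_first_match, its signature, and the timeout_ms parameter are hypothetical. It only assumes a page-like object with an async click(selector, timeout=...) method, which a Playwright async Page provides.

```python
import asyncio
from typing import Any, Sequence


async def click_first_match(
    page: Any, selectors: Sequence[str], timeout_ms: int = 2000
) -> str:
    """Try each selector in order; return the first one that was clicked.

    Hypothetical helper for illustration only. `page` is anything exposing
    an async `click(selector, timeout=...)`, e.g. a Playwright async Page.
    """
    errors = []
    for selector in selectors:
        try:
            await page.click(selector, timeout=timeout_ms)
            return selector
        except Exception as exc:  # Playwright raises its own error types here
            errors.append(f"{selector}: {exc}")
    # Every selector failed: surface all collected errors at once.
    raise RuntimeError("no selector could be clicked: " + "; ".join(errors))
```

With a real page this would be called as await click_first_match(page, ["#accept", ".cookie-ok"]). Collecting the per-selector errors makes the final failure message show why each candidate was rejected.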

Features

  • Debugging: Visual cursor feedback, popup logging, failure capture, test monitoring
  • PDF: Chrome PDF viewer detection, save-as-PDF automation
  • Interaction: Click/fill with fallbacks, popup handling
  • Stealth: Human-like behavior simulation, stealth browser management
  • Remote: ZenRows API integration, CAPTCHA handling
  • Collaboration: Shared browser sessions, credential management
  • Auth: Google authentication helpers

Installation

pip install scitex-browser

Architecture

scitex-browser/
├── src/scitex_browser/
│   ├── __init__.py              # save_as_pdf, click_with_fallbacks_async, ...
│   ├── debugging/               # TestMonitor, capture_debug_artifacts_async
│   │   ├── _capture.py          # screenshot + HTML + console artifacts
│   │   └── _monitor.py          # pytest-playwright failure hook
│   ├── stealth/                 # StealthManager + playwright-stealth glue
│   ├── remote/                  # ZenRows API + CAPTCHA handling
│   ├── auth/                    # Google + shared-session helpers
│   └── pdf/                     # Chrome-PDF-viewer detection, save_as_pdf
└── tests/                       # pytest-playwright suite

Optional extras

pip install scitex-browser[stealth]   # playwright-stealth
pip install scitex-browser[remote]    # ZenRows integration
pip install scitex-browser[scitex]    # Full SciTeX integration
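Code that depends on an optional extra can check for it at runtime before importing. The sketch below is an assumption about how a caller might guard such imports, not part of scitex-browser: has_module and STEALTH_AVAILABLE are hypothetical names, and playwright_stealth is the import name of the playwright-stealth distribution.

```python
from importlib.util import find_spec


def has_module(name: str) -> bool:
    """Return True if a top-level module can be imported, without importing it."""
    return find_spec(name) is not None


# Hypothetical flag: True only when the stealth extra's dependency is installed.
STEALTH_AVAILABLE = has_module("playwright_stealth")

if not STEALTH_AVAILABLE:
    print("stealth extra not installed; try: pip install scitex-browser[stealth]")
```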

Quick start

from scitex_browser import save_as_pdf, browser_logger
from scitex_browser.stealth import StealthManager

Interfaces

Python API
from scitex_browser import (
    save_as_pdf, save_as_pdf_async,
    click_with_fallbacks_async, close_popups_async,
    inject_visual_effects, browser_logger,
)
from scitex_browser.stealth import StealthManager
from scitex_browser.debugging import (
    TestMonitor, create_failure_capture_fixture,
    capture_debug_artifacts_async,        # screenshot + HTML in one call
)

click_with_fallbacks_async and fill_with_fallbacks_async capture a screenshot and the page HTML before and after every call by default (capture_debug=True). Pass capture_debug=False only in tight loops, where per-call captures add too much overhead. See _skills/scitex-browser/11_debugging-visuals.md for the full pattern.
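The before/after capture pattern can be sketched as follows. This is a hedged illustration, not the library's code: capture_artifacts, click_with_capture, and the file naming are hypothetical, and only the capture_debug flag name comes from the text above. The page object is assumed to expose the Playwright async Page methods screenshot(path=...), content(), and click(selector).

```python
import asyncio
from pathlib import Path
from typing import Any, List


async def capture_artifacts(page: Any, out_dir: Path, label: str) -> List[Path]:
    """Save a screenshot and the page HTML under out_dir (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    png = out_dir / f"{label}.png"
    html = out_dir / f"{label}.html"
    await page.screenshot(path=str(png))
    html.write_text(await page.content(), encoding="utf-8")
    return [png, html]


async def click_with_capture(
    page: Any, selector: str, out_dir: Path, capture_debug: bool = True
) -> None:
    """Click a selector, snapshotting the page before and after when enabled."""
    if capture_debug:
        await capture_artifacts(page, out_dir, "before")
    await page.click(selector)
    if capture_debug:
        await capture_artifacts(page, out_dir, "after")
```

Each capture costs a screenshot plus a full HTML dump, which is why turning it off makes sense inside loops that click many elements.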

Demo

sequenceDiagram
    participant T as pytest test
    participant H as click_with_fallbacks_async
    participant P as Playwright Page
    participant C as capture_debug_artifacts_async
    T->>H: click(["#accept", ".cookie-ok"])
    H->>P: try selector 1
    P-->>H: not found
    H->>P: try selector 2 -> click
    H->>C: snapshot before/after
    C-->>T: screenshot.png + page.html + console.log

Part of SciTeX

scitex-browser is part of SciTeX. Install via the umbrella with pip install scitex[browser] to use as scitex.browser (Python) or scitex browser ... (CLI).

Four Freedoms for Research

  1. The freedom to run your research anywhere — your machine, your terms.
  2. The freedom to study how every step works — from raw data to final manuscript.
  3. The freedom to redistribute your workflows, not just your papers.
  4. The freedom to modify any module and share improvements with the community.

AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.

License

AGPL-3.0. See LICENSE for details.

