Browser automation for scholarly paper access in the SciTeX ecosystem
scitex-browser
Playwright wrappers for scholarly paper access — popup handling, PDF capture, failure-replay screenshots, stealth browsing.
Full Documentation · uv pip install scitex-browser[all]
Problem and Solution
| # | Problem | Solution |
|---|---|---|
| 1 | Playwright is great but verbose -- every scraping script reinvents popup dismissal, retry logic, Chrome-PDF-viewer workaround | Helpers: click_with_fallbacks_async([sel1, sel2]), save_as_pdf_async, close_popups_async, inject_visual_effects — focused wrappers around Playwright |
| 2 | Tests fail silently with no artifact -- pytest-playwright doesn't auto-capture screen + DOM on failure | TestMonitor + create_failure_capture_fixture -- captures screenshot + page HTML + console log on every failure |
Features
- Debugging: Visual cursor feedback, popup logging, failure capture, test monitoring
- PDF: Chrome PDF viewer detection, save-as-PDF automation
- Interaction: Click/fill with fallbacks, popup handling
- Stealth: Human-like behavior simulation, stealth browser management
- Remote: ZenRows API integration, CAPTCHA handling
- Collaboration: Shared browser sessions, credential management
- Auth: Google authentication helpers
Installation
pip install scitex-browser
Architecture
scitex-browser/
├── src/scitex_browser/
│ ├── __init__.py # save_as_pdf, click_with_fallbacks_async, ...
│ ├── debugging/ # TestMonitor, capture_debug_artifacts_async
│ │ ├── _capture.py # screenshot + HTML + console artifacts
│ │ └── _monitor.py # pytest-playwright failure hook
│ ├── stealth/ # StealthManager + playwright-stealth glue
│ ├── remote/ # ZenRows API + CAPTCHA handling
│ ├── auth/ # Google + shared-session helpers
│ └── pdf/ # Chrome-PDF-viewer detection, save_as_pdf
└── tests/ # pytest-playwright suite
Optional extras
pip install scitex-browser[stealth] # playwright-stealth
pip install scitex-browser[remote] # ZenRows integration
pip install scitex-browser[scitex] # Full SciTeX integration
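Because the extras are optional, a script may want to check which ones are present before it relies on them. The sketch below is one way to do that with the standard library only. The helper names (`has_module`, `missing_extras`) are hypothetical and not part of scitex-browser, and the mapping assumes that the `stealth` extra provides the `playwright_stealth` import name; adjust the mapping to whatever your environment actually installs.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the importable module it provides.
# Only the stealth entry is listed; the import names behind the other
# extras are not stated in this README.
EXTRA_MODULES = {"stealth": "playwright_stealth"}


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(name) is not None


def missing_extras(extra_modules: dict) -> list:
    """Return the extras whose backing module is not installed."""
    return [extra for extra, mod in extra_modules.items() if not has_module(mod)]


if __name__ == "__main__":
    for extra in missing_extras(EXTRA_MODULES):
        print(f"optional extra not installed: scitex-browser[{extra}]")
```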
Quick start
from scitex_browser import save_as_pdf, browser_logger
from scitex_browser.stealth import StealthManager
1 Interface
Python API
from scitex_browser import (
save_as_pdf, save_as_pdf_async,
click_with_fallbacks_async, close_popups_async,
inject_visual_effects, browser_logger,
)
from scitex_browser.stealth import StealthManager
from scitex_browser.debugging import (
TestMonitor, create_failure_capture_fixture,
capture_debug_artifacts_async, # screenshot + HTML in one call
)
click_with_fallbacks_async and fill_with_fallbacks_async capture a
screenshot and the page HTML before and after every call by default
(capture_debug=True). Pass capture_debug=False only in tight
loops, where the per-call capture overhead adds up. See
_skills/scitex-browser/11_debugging-visuals.md for the full pattern.
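To make the before/after capture idea concrete, here is a minimal sketch of the pattern, not the library's implementation. The functions `capture_artifacts` and `click_with_capture` and the `FakePage` class are hypothetical names for illustration; the only Playwright calls assumed are `page.screenshot(path=...)`, `page.content()`, and `page.click(selector)`. `FakePage` stands in for a real page so the example runs without a browser.

```python
import asyncio
import tempfile
from pathlib import Path


async def capture_artifacts(page, out_dir: Path, label: str) -> dict:
    """Save a screenshot and the page HTML under out_dir, tagged with label."""
    out_dir.mkdir(parents=True, exist_ok=True)
    png = out_dir / f"{label}.png"
    html = out_dir / f"{label}.html"
    await page.screenshot(path=str(png))  # writes the image to disk
    html.write_text(await page.content(), encoding="utf-8")  # full page HTML
    return {"screenshot": png, "html": html}


async def click_with_capture(page, selector: str, out_dir: Path,
                             capture_debug: bool = True) -> None:
    """Click selector, capturing artifacts before and after unless disabled."""
    if capture_debug:
        await capture_artifacts(page, out_dir, "before")
    await page.click(selector)
    if capture_debug:
        await capture_artifacts(page, out_dir, "after")


class FakePage:
    """Hypothetical stand-in for a Playwright Page (no browser needed)."""

    def __init__(self):
        self.clicked = []

    async def screenshot(self, path):
        Path(path).write_bytes(b"fake-png")

    async def content(self):
        return "<html><body>demo</body></html>"

    async def click(self, selector):
        self.clicked.append(selector)


def run_demo(capture_debug: bool = True) -> list:
    """Run the wrapper against FakePage and return the sorted artifact names."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "artifacts"
        asyncio.run(click_with_capture(FakePage(), "#accept", out, capture_debug))
        return sorted(p.name for p in out.glob("*")) if out.exists() else []


if __name__ == "__main__":
    print(run_demo(True))
    print(run_demo(False))
```

With capture enabled, each click costs two screenshots and two HTML dumps, which is why disabling it makes sense only inside tight loops.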
Demo
sequenceDiagram
participant T as pytest test
participant H as click_with_fallbacks_async
participant P as Playwright Page
participant C as capture_debug_artifacts_async
T->>H: click(["#accept", ".cookie-ok"])
H->>P: try selector 1
P-->>H: not found
H->>P: try selector 2 -> click
H->>C: snapshot before/after
C-->>T: screenshot.png + page.html + console.log
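The selector-fallback part of the diagram can be sketched as follows. This is an illustration of the general technique under stated assumptions, not the body of click_with_fallbacks_async: `click_first_match` and `FakePage` are hypothetical names, and the only Playwright behavior assumed is that `page.click(selector, timeout=...)` raises when the selector cannot be clicked within the timeout (given in milliseconds).

```python
import asyncio


async def click_first_match(page, selectors, timeout_ms: float = 2000) -> str:
    """Try each selector in order; return the first one that clicks successfully."""
    errors = []
    for selector in selectors:
        try:
            await page.click(selector, timeout=timeout_ms)
            return selector
        except Exception as exc:  # broad on purpose: any failure moves to the next selector
            errors.append((selector, repr(exc)))
    raise LookupError(f"no selector could be clicked: {errors}")


class FakePage:
    """Hypothetical stand-in for a Playwright Page that only knows some selectors."""

    def __init__(self, present):
        self.present = set(present)
        self.clicked = []

    async def click(self, selector, timeout=None):
        if selector not in self.present:
            raise TimeoutError(f"{selector} not found")
        self.clicked.append(selector)


def demo() -> str:
    """Mirror the diagram: selector 1 is missing, selector 2 is clicked."""
    page = FakePage(present=[".cookie-ok"])
    return asyncio.run(click_first_match(page, ["#accept", ".cookie-ok"]))


if __name__ == "__main__":
    print(demo())
```

Ordering the list from most specific to most generic selector keeps the common case fast, since every miss costs up to one full timeout.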
Part of SciTeX
scitex-browser is part of SciTeX. Install via
the umbrella with pip install scitex[browser] to use as
scitex.browser (Python) or scitex browser ... (CLI).
Four Freedoms for Research
- The freedom to run your research anywhere — your machine, your terms.
- The freedom to study how every step works — from raw data to final manuscript.
- The freedom to redistribute your workflows, not just your papers.
- The freedom to modify any module and share improvements with the community.
AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.
License
AGPL-3.0. See LICENSE for details.