# pw-simple-scraper

A simple, lightweight, easy-to-use web scraper built with Python and Playwright.
## Overview

`pw-simple-scraper` scrapes desired elements from a web page.

- Provide a URL + CSS selector, and it will return the matching elements as a list of strings.
- The result is wrapped in a `ScrapeResult` object. You can access the extracted values via `.result` (`List[str]`).
## Installation

```bash
# 1. Install Playwright
pip install playwright

# 2-1. Install Chromium (macOS / Windows)
python -m playwright install chromium

# 2-2. Install Chromium (Linux)
python -m playwright install --with-deps chromium

# 3. Install pw-simple-scraper
pip install pw-simple-scraper
```

Since this scraper is built on top of Playwright, both the Playwright library and the Chromium browser are required.
## Usage

```python
from pw_simple_scraper import scrape_context, scrape_attr

# Extract text
res = scrape_context("https://example.com", "h3")
print(res.result)  # ['h3-type-content1', 'h3-type-content2', ...]
print(res.count)   # n (number of scraped elements)

# Extract links by attribute (href, ...)
links = scrape_attr("https://example.com", "a", "href")
print(links.result)  # ['https://www.iana.org/domains/example', ...]

# Apply timeout option (default: 30 seconds)
scrape_context("https://example.com", "something", timeout=10)       # 10 seconds
links = scrape_attr("https://example.com", "a", "href", timeout=20)  # 20 seconds
```
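Slow pages may need more than one attempt before the selector resolves. Below is a minimal, hypothetical retry sketch: `retry_with_timeouts` is not part of the library, it simply wraps any callable that accepts a `timeout` keyword and retries with progressively larger values (the `flaky` function stands in for a real scrape call):

```python
def retry_with_timeouts(fn, timeouts=(10, 20, 30)):
    """Call fn(timeout=t) for each timeout, returning the first success."""
    last_error = None
    for t in timeouts:
        try:
            return fn(timeout=t)
        except RuntimeError as err:  # e.g. "All strategies failed"
            last_error = err
    raise last_error

# Stand-in for scrape_context: fails until given a generous timeout.
def flaky(timeout):
    if timeout < 20:
        raise RuntimeError("All strategies failed")
    return ["h3-type-content1"]

print(retry_with_timeouts(flaky))  # ['h3-type-content1']
```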
## Result: the `ScrapeResult` object

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class ScrapeResult:
    url: str
    selector: str
    result: List[str]      # Extracted values
    count: int             # Number of values
    fetched_at: datetime   # Execution timestamp (UTC)
```
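For illustration, a `ScrapeResult` can be constructed and inspected by hand. This is a self-contained sketch that assumes the dataclass shape shown above; in real use the scrape functions build it for you:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

@dataclass
class ScrapeResult:
    url: str
    selector: str
    result: List[str]
    count: int
    fetched_at: datetime

# Build a result by hand to show field access.
res = ScrapeResult(
    url="https://example.com",
    selector="h3",
    result=["h3-type-content1", "h3-type-content2"],
    count=2,
    fetched_at=datetime.now(timezone.utc),
)
print(res.count)      # 2
print(res.result[0])  # h3-type-content1
```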
## FAQ

- **Installed, but the browser fails to launch**
  - You must install the browser with `python -m playwright install chromium` (be mindful of the `--with-deps` option on Linux).
- **`RuntimeError: All strategies failed`**
  - This may happen if the selector doesn't exist or the page loads slowly. Double-check your selector and try increasing the `timeout`.
- **Scraping inside an iframe**
  - Planned for future support.
- **XPath support**
  - Planned for future support.
- **robots.txt support**
  - Will be added as a configurable option in the future.
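Until built-in robots.txt handling lands, permissions can be checked before scraping with Python's standard library. A sketch, using an inline rules string as a made-up example (in practice you would fetch `https://<host>/robots.txt`):

```python
from urllib.robotparser import RobotFileParser

# Example rules; a real check would read the site's own robots.txt.
rules = """\
User-agent: *
Disallow: /private/
"""
rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://example.com/public/page"))   # True
print(rp.can_fetch("*", "https://example.com/private/page"))  # False
```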
## Project details
### File details: `pw_simple_scraper-0.1.2.tar.gz`

File metadata:

- Download URL: pw_simple_scraper-0.1.2.tar.gz
- Upload date:
- Size: 16.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | af1f5c925ffc399205a8e23de85b23744680c5d517d484052db1ed91ca674023 |
| MD5 | 85041c43589156f67f4ce46634a812fc |
| BLAKE2b-256 | 770c8cb06162b5a59dd3b70a51fb1a2d9e920d1c2e098410d3bfddd8dc36659e |
### Provenance

The following attestation bundles were made for `pw_simple_scraper-0.1.2.tar.gz`:

Publisher: `release.yml` on `elecbrandy/pw-simple-scraper`

- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: pw_simple_scraper-0.1.2.tar.gz
  - Subject digest: af1f5c925ffc399205a8e23de85b23744680c5d517d484052db1ed91ca674023
- Sigstore transparency entry: 523015054
- Sigstore integration time:
- Permalink: elecbrandy/pw-simple-scraper@f07c6084baa635c6ca6b226850d199d404bf1b05
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/elecbrandy
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f07c6084baa635c6ca6b226850d199d404bf1b05
- Trigger Event: push
### File details: `pw_simple_scraper-0.1.2-py3-none-any.whl`

File metadata:

- Download URL: pw_simple_scraper-0.1.2-py3-none-any.whl
- Upload date:
- Size: 10.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | d56ff6e66044eadfee4b08fafa437ea773db4365a08093c0ba17b868997d3706 |
| MD5 | 1ce2409e8cbc40552ee20daf02f201e0 |
| BLAKE2b-256 | c2975e3231261a22153a96d6a7d03a291976296fce24c87fd3d8817d978332a6 |
### Provenance

The following attestation bundles were made for `pw_simple_scraper-0.1.2-py3-none-any.whl`:

Publisher: `release.yml` on `elecbrandy/pw-simple-scraper`

- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: pw_simple_scraper-0.1.2-py3-none-any.whl
  - Subject digest: d56ff6e66044eadfee4b08fafa437ea773db4365a08093c0ba17b868997d3706
- Sigstore transparency entry: 523015072
- Sigstore integration time:
- Permalink: elecbrandy/pw-simple-scraper@f07c6084baa635c6ca6b226850d199d404bf1b05
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/elecbrandy
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f07c6084baa635c6ca6b226850d199d404bf1b05
- Trigger Event: push