Scrape any WordPress/WooCommerce store - products, categories & metadata from the public REST API. SDK + CLI + REST API.
wpscrape
No API key or authentication required. Uses the public wc/store/v1 endpoint designed for frontend access.
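To make the "public endpoint" point concrete, here is a minimal sketch of the kind of URL involved. It assumes the standard WooCommerce Store API route `/wp-json/wc/store/v1/...` served over HTTPS and its `page`/`per_page` query parameters. The helper name `store_api_url` is hypothetical, and this is not wpscrape's internal code.

```python
from urllib.parse import urlencode


def store_api_url(domain: str, resource: str = "products", **params: object) -> str:
    """Build a WooCommerce Store API URL for a public resource (hypothetical helper)."""
    # Assumption: the site serves its REST API under the default /wp-json prefix over HTTPS.
    base = f"https://{domain}/wp-json/wc/store/v1/{resource}"
    # Drop unset parameters so the query string only carries what was asked for.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base}?{query}" if query else base


print(store_api_url("store.com", page=1, per_page=10))
```

No credentials appear anywhere in the request, which is what allows a scraper to read the same data the storefront itself loads.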
Features
- SDK - Python client with full type hints and dataclass models
- CLI - Rich terminal output with tables, search, filtering, and export
- REST API - FastAPI server with auto-generated OpenAPI docs
- Export - JSON and CSV export out of the box
- Pagination - Auto-pagination or manual page-by-page control
- Proxy support - Route requests through any HTTP proxy
- Retry logic - Exponential backoff with jitter on transient failures
Install
pip install wpscrape
With CLI support (rich tables):
pip install "wpscrape[cli]"
With REST API server:
pip install "wpscrape[api]"
Everything:
pip install "wpscrape[all]"
Quick Start
Python SDK
from wpscrape import WordPress, Exporter
wp = WordPress("boskistores.com")
# Site metadata
site = wp.site_info()
print(site.name, site.has_woocommerce)
# All products (auto-paginates)
products = wp.products()
for p in products:
    print(p.name, p.price, p.currency)
# Search products
results = wp.search("smartwatch")
# Filter by category
earbuds = wp.category_products("earbuds")
# Single product by slug
product = wp.product("12v-router-power-bank")
# Paginated access
page = wp.products_page(page=1, per_page=10)
print(page.total, page.has_next)
# All categories
categories = wp.categories()
# Export to files
exporter = Exporter()
exporter.products_to_json(products)
exporter.products_to_csv(products)
exporter.categories_to_json(categories)
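The auto-pagination that `wp.products()` performs can be pictured as a loop over page requests. The sketch below is an illustration only, not wpscrape's implementation: the `Page` dataclass, its `items` field, `iter_all`, and `fake_fetch` are hypothetical stand-ins, and only `total` and `has_next` mirror names shown above.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class Page:
    """Stand-in for a paginated response (hypothetical shape)."""
    items: List[str]
    total: int
    has_next: bool


def iter_all(fetch_page: Callable[[int], Page]) -> Iterator[str]:
    """Yield every item by requesting pages until has_next is False."""
    page_number = 1
    while True:
        page = fetch_page(page_number)
        yield from page.items
        if not page.has_next:
            break
        page_number += 1


def fake_fetch(page_number: int, per_page: int = 2) -> Page:
    """In-memory replacement for a network call, so the sketch runs offline."""
    data = ["a", "b", "c", "d", "e"]
    start = (page_number - 1) * per_page
    chunk = data[start:start + per_page]
    return Page(items=chunk, total=len(data), has_next=start + per_page < len(data))


print(list(iter_all(fake_fetch)))
```

With the real client, `fetch_page` would presumably wrap a call such as `wp.products_page(page=n, per_page=10)`. Manual paging is mainly useful when you want to stop early or throttle requests yourself.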
CLI
# Scrape all products
wpscrape products boskistores.com
# Search products
wpscrape products boskistores.com --search smartwatch
# Filter by category
wpscrape products boskistores.com --category earbuds
# Paginated output
wpscrape products boskistores.com --page 1 --limit 10
# JSON output
wpscrape products boskistores.com --json
# Save to file
wpscrape products boskistores.com --save products.csv
# Categories
wpscrape categories boskistores.com
# Site info
wpscrape site boskistores.com
# Use a proxy
wpscrape --proxy http://user:pass@host:8080 products boskistores.com
REST API
# Start the API server
wpscrape serve
# With custom host/port
wpscrape serve --host 0.0.0.0 --port 3000
Endpoints:
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/products?domain=... | List/search products |
| GET | /api/v1/products/{slug}?domain=... | Single product |
| GET | /api/v1/categories?domain=... | List categories |
| GET | /api/v1/site?domain=... | Site metadata |
| GET | /health | Health check |
| GET | /docs | OpenAPI docs |
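A client for these endpoints passes the target store as the `domain` query parameter. The sketch below builds the request URLs with the standard library only. The base URL `http://localhost:3000` assumes the server was started with `--port 3000`, and `api_url` and `product_url` are hypothetical helpers, not part of wpscrape.

```python
from urllib.parse import quote, urlencode

# Assumption: server started with `wpscrape serve --port 3000` on this machine.
BASE_URL = "http://localhost:3000"


def api_url(path: str, domain: str, base_url: str = BASE_URL) -> str:
    """Append the required `domain` query parameter to an endpoint path."""
    return f"{base_url}{path}?{urlencode({'domain': domain})}"


def product_url(slug: str, domain: str) -> str:
    """URL for the single-product endpoint, with the slug percent-encoded."""
    return api_url(f"/api/v1/products/{quote(slug, safe='')}", domain)


print(api_url("/api/v1/products", "boskistores.com"))
print(product_url("12v-router-power-bank", "boskistores.com"))
```

The resulting URLs can be fetched with any HTTP client. The response schema is not described here, so the generated `/docs` page is the place to check field names.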
Configuration
Proxy
wp = WordPress("store.com", proxy="http://user:pass@host:8080")
Or via environment variable:
export WPSCRAPE_PROXY=http://user:pass@host:8080
wpscrape products store.com
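One plausible way to combine the two options is to let an explicit `proxy` argument win over the environment variable. This precedence is an assumption made for illustration, not documented wpscrape behavior, and `resolve_proxy` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_proxy(explicit: Optional[str] = None,
                  env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the explicit proxy if given, else WPSCRAPE_PROXY, else None."""
    if explicit:
        return explicit
    # Treat an empty variable the same as an unset one.
    return env.get("WPSCRAPE_PROXY") or None


example_env = {"WPSCRAPE_PROXY": "http://user:pass@host:8080"}
print(resolve_proxy(env=example_env))
print(resolve_proxy("http://other:3128", env=example_env))
print(resolve_proxy(env={}))
```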
Timeout & Retries
wp = WordPress("store.com", timeout=60.0, max_retries=5)
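The retry behavior listed under Features, exponential backoff with jitter, can be sketched as follows. This is a generic "full jitter" variant and not necessarily the formula wpscrape uses. `backoff_delays`, `base`, `cap`, and their default values are hypothetical.

```python
import random
from typing import List, Optional


def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one randomized delay (in seconds) per retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        # The ceiling doubles each attempt (base, 2*base, 4*base, ...) up to cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick uniformly between 0 and the ceiling to spread out retries.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


for attempt, delay in enumerate(backoff_delays(max_retries=5), start=1):
    print(f"retry {attempt}: wait {delay:.2f}s")
```

The randomization keeps many clients that failed at the same moment from retrying in lockstep, which is why jitter is usually paired with exponential growth.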
Models
All data is returned as typed dataclasses:
- SiteInfo - site name, description, URL, namespaces, WooCommerce detection
- Product - full product data with prices, images, categories, attributes, variations
- Category - category with product count, parent, image
- PaginatedResponse - page metadata with has_next/has_previous
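Because the models are dataclasses, the standard library's `dataclasses.asdict` should be able to turn them into plain dictionaries for custom serialization. The sketch uses a stand-in `ExampleCategory` class with hypothetical field names, since the real field names are not listed here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ExampleCategory:
    """Stand-in model; field names are assumptions, not wpscrape's definitions."""
    name: str
    slug: str
    count: int
    parent: Optional[int] = None


category = ExampleCategory(name="Earbuds", slug="earbuds", count=12)
# asdict walks the dataclass recursively and returns built-in types.
record = asdict(category)
print(json.dumps(record, sort_keys=True))
```

The same approach works for nested dataclasses, so a product holding a list of category objects would come out as nested dictionaries and lists.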
License