
Scrape any WordPress/WooCommerce store - products, categories & metadata from the public REST API. SDK + CLI + REST API.


wpscrape



No API key or authentication is required. wpscrape reads the public WooCommerce Store API (wc/store/v1), which is designed for frontend access.
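To make the endpoint concrete, here is a minimal sketch of how a URL under that namespace can be built by hand. It assumes the store serves its REST API under WordPress's default /wp-json/ prefix; the helper name store_api_url is hypothetical and is not part of wpscrape.

```python
from urllib.parse import urlencode


def store_api_url(domain: str, resource: str = "products", **params: object) -> str:
    """Build a URL under the public wc/store/v1 namespace.

    Assumption: the site uses the default /wp-json/ REST prefix.
    """
    base = f"https://{domain}/wp-json/wc/store/v1/{resource}"
    # Sort the parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{base}?{query}" if query else base


url = store_api_url("boskistores.com", page=1, per_page=10)
print(url)
```

Opening such a URL in a browser is a quick way to check whether a given store exposes the endpoint at all.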

Features

  • SDK - Python client with full type hints and dataclass models
  • CLI - Rich terminal output with tables, search, filtering, and export
  • REST API - FastAPI server with auto-generated OpenAPI docs
  • Export - JSON and CSV export out of the box
  • Pagination - Auto-pagination or manual page-by-page control
  • Proxy support - Route requests through any HTTP proxy
  • Retry logic - Exponential backoff with jitter on transient failures
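For readers unfamiliar with the retry feature, the following is a generic sketch of exponential backoff with jitter. It is not wpscrape's actual implementation: the function names, the "full jitter" variant, the default base and cap values, and the choice of which exceptions count as transient are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def with_retries(
    call: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on ConnectionError/TimeoutError up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except (ConnectionError, TimeoutError):
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(backoff_delay(attempt))
    # Only reached if max_retries is negative, so the loop never ran.
    raise RuntimeError("max_retries must be >= 0")
```

The sleep function is injectable so the helper can be exercised in tests without actually waiting.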

Install

pip install wpscrape

With CLI support (rich tables):

pip install "wpscrape[cli]"

With REST API server:

pip install "wpscrape[api]"

Everything:

pip install "wpscrape[all]"

Quick Start

Python SDK

from wpscrape import WordPress, Exporter

wp = WordPress("boskistores.com")

# Site metadata
site = wp.site_info()
print(site.name, site.has_woocommerce)

# All products (auto-paginates)
products = wp.products()
for p in products:
    print(p.name, p.price, p.currency)

# Search products
results = wp.search("smartwatch")

# Filter by category
earbuds = wp.category_products("earbuds")

# Single product by slug
product = wp.product("12v-router-power-bank")

# Paginated access
page = wp.products_page(page=1, per_page=10)
print(page.total, page.has_next)

# All categories
categories = wp.categories()

# Export to files
exporter = Exporter()
exporter.products_to_json(products)
exporter.products_to_csv(products)
exporter.categories_to_json(categories)
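Manual page-by-page control usually comes down to a loop over has_next. The sketch below shows that loop against a stand-in page object with canned data, so it runs without a live store. The FakePage class and its items field are assumptions; the field the real page object uses for its contents is not documented here.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class FakePage:
    """Stand-in for the library's page object; `items` is an assumed field name."""
    items: List[str] = field(default_factory=list)
    has_next: bool = False


def iter_all(fetch_page: Callable[[int], FakePage], start: int = 1) -> Iterator[str]:
    """Yield items page by page until a page reports has_next as False."""
    page_no = start
    while True:
        page = fetch_page(page_no)
        yield from page.items
        if not page.has_next:
            break
        page_no += 1


# Demo with canned data instead of a live store.
_pages = {1: FakePage(["a", "b"], True), 2: FakePage(["c"], False)}
collected = list(iter_all(lambda n: _pages[n]))
print(collected)
```

With a real client, fetch_page would wrap a call such as wp.products_page(page=n, per_page=50), adapted to whatever attribute the returned page actually exposes.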

CLI

# Scrape all products
wpscrape products boskistores.com

# Search products
wpscrape products boskistores.com --search smartwatch

# Filter by category
wpscrape products boskistores.com --category earbuds

# Paginated output
wpscrape products boskistores.com --page 1 --limit 10

# JSON output
wpscrape products boskistores.com --json

# Save to file
wpscrape products boskistores.com --save products.csv

# Categories
wpscrape categories boskistores.com

# Site info
wpscrape site boskistores.com

# Use a proxy
wpscrape --proxy http://user:pass@host:8080 products boskistores.com

REST API

# Start the API server
wpscrape serve

# With custom host/port
wpscrape serve --host 0.0.0.0 --port 3000

Endpoints:

Method  Path                                  Description
GET     /api/v1/products?domain=...           List/search products
GET     /api/v1/products/{slug}?domain=...    Single product
GET     /api/v1/categories?domain=...         List categories
GET     /api/v1/site?domain=...               Site metadata
GET     /health                               Health check
GET     /docs                                 OpenAPI docs
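A small client-side sketch for these endpoints, using only the standard library. It assumes the server was started with --port 3000 as in the example above and that responses are JSON; the helper names are hypothetical, and the response shape is not specified here, so fetch_json returns whatever the server sends.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:3000"  # assumption: server started with --port 3000


def products_url(domain: str) -> str:
    """URL for the product listing of one store."""
    return f"{BASE_URL}/api/v1/products?{urlencode({'domain': domain})}"


def product_url(domain: str, slug: str) -> str:
    """URL for a single product, with the slug percent-encoded for the path."""
    return f"{BASE_URL}/api/v1/products/{quote(slug, safe='')}?{urlencode({'domain': domain})}"


def fetch_json(url: str, timeout: float = 30.0):
    """GET a URL and decode the body as JSON (requires a running server)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


print(products_url("boskistores.com"))
print(product_url("boskistores.com", "12v-router-power-bank"))
```

The /docs page served by FastAPI is the place to check the exact query parameters and response schemas.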

Configuration

Proxy

wp = WordPress("store.com", proxy="http://user:pass@host:8080")

Or via environment variable:

export WPSCRAPE_PROXY=http://user:pass@host:8080
wpscrape products store.com
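When both an explicit proxy and the environment variable are present, some precedence rule has to apply. The sketch below assumes the explicit value wins, which is a common convention but is not confirmed for wpscrape; resolve_proxy is a hypothetical helper, not a library function.

```python
import os
from typing import Mapping, Optional


def resolve_proxy(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the explicit proxy if given, else WPSCRAPE_PROXY, else None.

    Assumption: an explicit argument takes precedence over the environment.
    """
    if explicit:
        return explicit
    # Treat an empty variable the same as an unset one.
    return env.get("WPSCRAPE_PROXY") or None


print(resolve_proxy("http://user:pass@host:8080", env={}))
```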

Timeout & Retries

wp = WordPress("store.com", timeout=60.0, max_retries=5)

Models

All data is returned as typed dataclasses:

  • SiteInfo - site name, description, URL, namespaces, WooCommerce detection
  • Product - full product data with prices, images, categories, attributes, variations
  • Category - category with product count, parent, image
  • PaginatedResponse - page metadata with has_next / has_previous
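To illustrate what page metadata with has_next / has_previous can look like, here is a minimal dataclass sketch. It is not the library's PaginatedResponse: the class name PageSketch and the fields items, page, per_page, and total are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class PageSketch(Generic[T]):
    """Illustrative page container; field names are assumed, not the library's."""
    items: List[T] = field(default_factory=list)
    page: int = 1
    per_page: int = 10
    total: int = 0

    @property
    def total_pages(self) -> int:
        # Ceiling division; an empty result set still counts as one page.
        return max(1, -(-self.total // self.per_page))

    @property
    def has_next(self) -> bool:
        return self.page < self.total_pages

    @property
    def has_previous(self) -> bool:
        return self.page > 1


middle = PageSketch(items=["c", "d"], page=2, per_page=2, total=5)
print(middle.total_pages, middle.has_next, middle.has_previous)
```

Deriving has_next and has_previous from the counts keeps the flags consistent with total, rather than storing them separately.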

License

MIT
