wpscrape

Scrape any WordPress/WooCommerce store: products, categories, and metadata from the public REST API. Ships as a Python SDK, a CLI, and a REST API server.

No API key or authentication is required. wpscrape reads from the public WooCommerce Store API (wc/store/v1), which is designed for frontend access.
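
For orientation, here is roughly what a Store API URL looks like when built by hand. This is a minimal sketch using only the standard library: `store_api_url` is a hypothetical helper, not part of wpscrape, and it assumes the site is served over HTTPS with the REST API reachable under `/wp-json/`. wpscrape may build its requests differently.

```python
from urllib.parse import urlencode


def store_api_url(domain: str, resource: str = "products", **params) -> str:
    """Build a WooCommerce Store API URL (hypothetical helper, not wpscrape's API)."""
    # Assumption: HTTPS and the default /wp-json/ REST prefix.
    base = f"https://{domain}/wp-json/wc/store/v1/{resource}"
    # Drop unset parameters so the query string stays minimal.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base}?{query}" if query else base


if __name__ == "__main__":
    print(store_api_url("boskistores.com", per_page=10, page=1))
```

Opening such a URL in a browser is a quick way to check whether a store exposes the endpoint before pointing wpscrape at it.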

Features

  • SDK - Python client with full type hints and dataclass models
  • CLI - Rich terminal output with tables, search, filtering, and export
  • REST API - FastAPI server with auto-generated OpenAPI docs
  • Export - JSON and CSV export out of the box
  • Pagination - Auto-pagination or manual page-by-page control
  • Proxy support - Route requests through any HTTP proxy
  • Retry logic - Exponential backoff with jitter on transient failures

Install

pip install wpscrape

With CLI support (rich tables):

pip install "wpscrape[cli]"

With REST API server:

pip install "wpscrape[api]"

Everything:

pip install "wpscrape[all]"

Quick Start

Python SDK

from wpscrape import WordPress, Exporter

wp = WordPress("boskistores.com")

# Site metadata
site = wp.site_info()
print(site.name, site.has_woocommerce)

# All products (auto-paginates)
products = wp.products()
for p in products:
    print(p.name, p.price, p.currency)

# Search products
results = wp.search("smartwatch")

# Filter by category
earbuds = wp.category_products("earbuds")

# Single product by slug
product = wp.product("12v-router-power-bank")

# Paginated access
page = wp.products_page(page=1, per_page=10)
print(page.total, page.has_next)

# All categories
categories = wp.categories()

# Export to files
exporter = Exporter()
exporter.products_to_json(products)
exporter.products_to_csv(products)
exporter.categories_to_json(categories)
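
Manual pagination with `products_page` boils down to "fetch a page, consume it, stop when `has_next` is false". The sketch below shows that loop in isolation. `Page`, its `items` field, and `fake_fetch` are stand-ins invented for illustration (only `has_next` appears in the example above), so adapt the names to whatever the real response object exposes.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class Page:
    """Stand-in for a paginated response; field names are assumptions."""
    items: List[str]
    page: int
    has_next: bool


def iter_items(fetch_page: Callable[[int], Page]) -> Iterator[str]:
    """Yield items from page 1 onward until a page reports has_next == False."""
    number = 1
    while True:
        current = fetch_page(number)
        yield from current.items
        if not current.has_next:
            break
        number += 1


def fake_fetch(number: int) -> Page:
    """In-memory fetcher standing in for a network call."""
    data = {1: ["a", "b"], 2: ["c", "d"], 3: ["e"]}
    return Page(items=data[number], page=number, has_next=number < 3)


if __name__ == "__main__":
    print(list(iter_items(fake_fetch)))
```

Because `iter_items` is a generator, a caller can stop early (for example after the first matching product) without fetching the remaining pages.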

CLI

# Scrape all products
wpscrape products boskistores.com

# Search products
wpscrape products boskistores.com --search smartwatch

# Filter by category
wpscrape products boskistores.com --category earbuds

# Paginated output
wpscrape products boskistores.com --page 1 --limit 10

# JSON output
wpscrape products boskistores.com --json

# Save to file
wpscrape products boskistores.com --save products.csv

# Categories
wpscrape categories boskistores.com

# Site info
wpscrape site boskistores.com

# Use a proxy
wpscrape --proxy http://user:pass@host:8080 products boskistores.com

REST API

# Start the API server
wpscrape serve

# With custom host/port
wpscrape serve --host 0.0.0.0 --port 3000

Endpoints:

Method  Path                                  Description
GET     /api/v1/products?domain=...           List/search products
GET     /api/v1/products/{slug}?domain=...    Single product
GET     /api/v1/categories?domain=...         List categories
GET     /api/v1/site?domain=...               Site metadata
GET     /health                               Health check
GET     /docs                                 OpenAPI docs
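
A running server can be called from any HTTP client. The following is a minimal standard-library sketch, assuming the server was started with `--port 3000` on the local machine. `endpoint`, `product_url`, and `get_json` are hypothetical helpers, and the shape of the JSON responses is not documented here, so the example only prints whatever comes back.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE = "http://localhost:3000"  # assumption: server started with --port 3000


def endpoint(path: str, domain: str, **params) -> str:
    """Build a URL for one of the endpoints listed above."""
    query = urlencode({"domain": domain, **params})
    return f"{BASE}{path}?{query}"


def product_url(slug: str, domain: str) -> str:
    """URL for a single product; the slug is percent-encoded for safety."""
    return endpoint(f"/api/v1/products/{quote(slug, safe='')}", domain)


def get_json(url: str, timeout: float = 30.0):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires `wpscrape serve --port 3000` to be running locally.
    print(get_json(endpoint("/api/v1/site", "boskistores.com")))
```

The `/docs` page served by FastAPI is the place to check for the exact query parameters each endpoint accepts.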

Configuration

Proxy

wp = WordPress("store.com", proxy="http://user:pass@host:8080")

Or via environment variable:

export WPSCRAPE_PROXY=http://user:pass@host:8080
wpscrape products store.com
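
With two ways to set a proxy, it helps to be explicit about which one wins. The sketch below assumes an explicit argument takes precedence over `WPSCRAPE_PROXY`; that precedence is an assumption for illustration, not confirmed behaviour of wpscrape, and `resolve_proxy` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_proxy(explicit: Optional[str] = None,
                  env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick a proxy URL: explicit argument first, then WPSCRAPE_PROXY, else None."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    # An empty variable is treated the same as an unset one.
    return env.get("WPSCRAPE_PROXY") or None


if __name__ == "__main__":
    print(resolve_proxy())
```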

Timeout & Retries

wp = WordPress("store.com", timeout=60.0, max_retries=5)
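
To give a feel for what `max_retries` controls, here is a generic sketch of exponential backoff with jitter. It is not wpscrape's implementation: the base delay, the cap, the "full jitter" variant, the set of retryable exceptions, and the reading of `max_retries` as "retries after the first attempt" are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)]."""
    rng = rng or random
    return rng.uniform(0.0, min(cap, base * (2 ** attempt)))


def with_retries(func: Callable[[], T], max_retries: int = 5,
                 retry_on: tuple = (TimeoutError, ConnectionError),
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying up to max_retries times on transient errors."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(backoff_delay(attempt))
    # Only reached if max_retries is negative and the loop never runs.
    raise RuntimeError("max_retries must be >= 0")
```

The jitter spreads retries from many clients over time, so a store that briefly fails is not hit by a synchronized burst when it recovers.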

Models

All data is returned as typed dataclasses:

  • SiteInfo - site name, description, URL, namespaces, WooCommerce detection
  • Product - full product data with prices, images, categories, attributes, variations
  • Category - category with product count, parent, image
  • PaginatedResponse - page metadata with has_next / has_previous
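
As an illustration of how page metadata like `has_next` / `has_previous` can be derived, here is a small dataclass sketch. `PageSketch` and its fields other than `total` are hypothetical; the real `PaginatedResponse` may store or compute these values differently.

```python
from dataclasses import dataclass, field
from typing import Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class PageSketch(Generic[T]):
    """Illustrative page container; not the library's PaginatedResponse."""
    items: List[T] = field(default_factory=list)
    page: int = 1
    per_page: int = 10
    total: int = 0

    @property
    def total_pages(self) -> int:
        # Ceiling division without floats.
        return -(-self.total // self.per_page)

    @property
    def has_next(self) -> bool:
        return self.page < self.total_pages

    @property
    def has_previous(self) -> bool:
        return self.page > 1


if __name__ == "__main__":
    middle = PageSketch(items=["c", "d"], page=2, per_page=2, total=5)
    print(middle.total_pages, middle.has_next, middle.has_previous)
```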

License

MIT

