PinGrabber

PinGrabber is a lightweight Python library that scrapes high‑quality images from any public Pinterest board using the official RSS feed provided by Pinterest. It extracts the original, full‑resolution images and downloads them to your local machine with minimal effort.

Important – Use this tool only with public boards and for personal or educational purposes. Always respect Pinterest’s Terms of Service and the copyright of image authors.


Features

  • Simple shortcut function – download an entire board or a single pin with one line of code.
  • Automatic conversion of thumbnail URLs to original high‑resolution images.
  • Customizable output directory.
  • Both high‑level wrapper and low‑level class for fine‑grained control.
  • Keyword search that returns raw image URLs without downloading.
  • Built with requests, BeautifulSoup, and lxml for fast and reliable parsing.

Installation

You can install PinGrabber directly from PyPI:

pip install pingrabber

or from the GitHub repository:

pip install git+https://github.com/VVui-blip/image-data-scraping-resource-pack-from-Pinterest-.git

Alternatively, if you have the source code locally:

pip install -r requirements.txt
pip install .

The required dependencies (requests, beautifulsoup4, lxml) will be installed automatically.

Optional dependency for better keyword search

For a more robust and stable keyword search, we recommend installing the ddgs package (community‑maintained, formerly duckduckgo-search):

pip install ddgs

Or install it together with PinGrabber:

pip install .[search]

ddgs provides a more reliable way to query search engines (DuckDuckGo, Bing, Google, etc.) and supports proxy configuration right in the code. Without ddgs, the search() function will fall back to direct requests calls to search engines, which are more prone to blocking.
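
An optional dependency of this kind is usually detected at import time. The following is a minimal sketch of that pattern, not PinGrabber's actual source; `HAS_DDGS` and `pick_search_backend` are hypothetical names introduced only for illustration.

```python
import importlib.util

# True when the optional ddgs package can be found in this environment.
HAS_DDGS = importlib.util.find_spec("ddgs") is not None


def pick_search_backend(has_ddgs: bool = HAS_DDGS) -> str:
    """Return the name of the backend a search would use (hypothetical helper)."""
    return "ddgs" if has_ddgs else "requests-fallback"


if __name__ == "__main__":
    print(pick_search_backend())
```

Running the snippet is a quick way to check which path your environment would take before calling search().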


Quick Start

pingrabber.download(url) automatically detects whether the given URL is a board or a single pin and handles it accordingly – you don't need to differentiate manually.
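
The README does not say how the detection is done. One plausible approach is to look at the URL path, as in the sketch below. The function name `classify_pinterest_url` and both patterns are assumptions made for illustration, not the library's code.

```python
import re
from urllib.parse import urlparse

# Assumed URL shapes: /pin/<digits>/ for a pin, /<user>/<board>/ for a board.
_PIN_PATH = re.compile(r"^/pin/\d+/?$")
_BOARD_PATH = re.compile(r"^/[^/]+/[^/]+/?$")


def classify_pinterest_url(url: str) -> str:
    """Return "pin", "board", or "unknown" for the given URL."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host != "pinterest.com" and not host.endswith(".pinterest.com"):
        return "unknown"
    # Check the pin pattern first, because /pin/<id>/ also has two segments.
    if _PIN_PATH.match(parsed.path):
        return "pin"
    if _BOARD_PATH.match(parsed.path):
        return "board"
    return "unknown"
```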

Download all images from a board

import pingrabber

pingrabber.download("https://www.pinterest.com/username/boardname/")

Download a single pin

import pingrabber

pingrabber.download("https://www.pinterest.com/pin/119134352618387326/")

To save images to a custom directory:

import pingrabber

pingrabber.download(
    "https://www.pinterest.com/username/boardname/",
    output_dir="my_pinterest_images"
)

Search images by keyword (returns raw links, no download)

import pingrabber

links = pingrabber.search("nature")
for url in links:
    print(url)

This function does not download any images – it only returns a list of high‑quality raw image URLs. You can preview them, filter, or download them manually with requests if needed.
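
If you do want to save the returned links yourself, a small helper is enough. This is a sketch under stated assumptions: `save_images` and its `fetch` parameter are hypothetical and not part of PinGrabber; the default fetcher uses requests.get, and the file-naming scheme is arbitrary.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional
from urllib.parse import urlparse


def _fetch_with_requests(url: str) -> bytes:
    import requests  # imported lazily so the helper can be exercised offline

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.content


def save_images(
    urls: Iterable[str],
    output_dir: str,
    fetch: Optional[Callable[[str], bytes]] = None,
) -> List[Path]:
    """Download each URL into output_dir and return the written paths."""
    fetch = fetch or _fetch_with_requests
    target = Path(output_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        # Prefix with an index so two URLs with the same basename do not collide.
        name = Path(urlparse(url).path).name or "image"
        path = target / f"{index:04d}_{name}"
        path.write_bytes(fetch(url))
        saved.append(path)
    return saved
```

For example, `save_images(pingrabber.search("nature"), "nature_images")` would write one file per returned link.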

How it works: search() tries multiple search engines (DuckDuckGo HTML, DuckDuckGo Lite, Bing) using the site:pinterest.com query. It rotates user‑agents, retries, and adds delays between attempts to reduce the chance of being blocked. If one engine fails (403/429), it automatically switches to the next. It does not scrape Pinterest’s search page directly, because that page requires JavaScript rendering which requests cannot handle.
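
The fallback strategy described above can be sketched roughly as follows. This is an illustration of the general technique, not the library's implementation: `EngineBlocked`, `search_with_fallback`, the engine callables, and the user-agent strings are all hypothetical.

```python
import itertools
import time
from typing import Callable, List, Sequence

# Illustrative user-agent strings; a real pool would be larger.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]


class EngineBlocked(Exception):
    """Raised by an engine callable when it is answered with HTTP 403 or 429."""


# An engine takes (query, user_agent) and returns a list of result URLs.
Engine = Callable[[str, str], List[str]]


def search_with_fallback(
    query: str,
    engines: Sequence[Engine],
    max_retries: int = 3,
    delay_seconds: float = 2.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Try each engine in order; move on once its retries are used up."""
    agents = itertools.cycle(USER_AGENTS)
    for engine in engines:
        for _attempt in range(max_retries):
            try:
                return engine(query, next(agents))
            except EngineBlocked:
                sleep(delay_seconds)  # back off before the next attempt
    return []  # every engine stayed blocked
```

Passing `sleep` as a parameter keeps the delay logic easy to test without actually waiting.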

Customise the number of boards to scan, images per board, retries, and delay:

links = pingrabber.search(
    "nature",
    max_boards=5,
    max_images_per_board=10,
    max_retries=3,
    delay_seconds=2.5
)

If search() always returns empty: check the logged WARNING/ERROR messages. If you see 403/429 errors for all fallback engines, your network/IP is likely being rate‑limited by the search engines (common on cloud servers, VPNs, or IPs that have sent many requests). In that case:

  • Install ddgs if you haven’t (pip install ddgs) – this is the most significant improvement.
  • If ddgs is installed but still returns nothing, try using a proxy directly with the package:

from ddgs import DDGS
with DDGS(proxy="socks5://127.0.0.1:9050", timeout=15) as ddgs:
    results = ddgs.text("site:pinterest.com nature", max_results=5)
    print(results)

If that returns results, you can initialise PinGrabber with a similarly proxied session, or simply use the found board URLs with download().

  • Increase max_retries and delay_seconds (this only affects the fallback method).
  • Try a different network/VPN.
  • Alternatively, use the most reliable approach: find a board manually through your browser and call pingrabber.download(board_url) directly – this does not depend on search engines and is always stable.
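
To pass board URLs found by a proxied ddgs query on to download(), the raw results need to be filtered first. The sketch below assumes each result is a dict with an "href" key, which is how ddgs text results are commonly shaped; `board_urls_from_results` and the reserved-path list are hypothetical and deliberately not exhaustive.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlparse

# First path segments that are not usernames (assumed, non-exhaustive list).
_RESERVED = {"pin", "ideas", "search", "explore"}


def board_urls_from_results(results: Iterable[Dict[str, str]]) -> List[str]:
    """Keep only results that look like https://<host>/<user>/<board>/."""
    boards: List[str] = []
    for item in results:
        parsed = urlparse(item.get("href", ""))
        host = parsed.netloc.lower()
        parts = [p for p in parsed.path.split("/") if p]
        on_pinterest = host == "pinterest.com" or host.endswith(".pinterest.com")
        if on_pinterest and len(parts) == 2 and parts[0] not in _RESERVED:
            # Normalise to the www host; regional subdomains are collapsed.
            url = f"https://www.pinterest.com/{parts[0]}/{parts[1]}/"
            if url not in boards:
                boards.append(url)
    return boards
```

Each returned URL can then be handed to pingrabber.download() one at a time.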


Advanced Usage

For more control, use the PinGrabber class:

from pingrabber import PinGrabber

grabber = PinGrabber(timeout=30)

# Download all original images
saved_files = grabber.download(
    "https://www.pinterest.com/username/boardname/",
    output_dir="high_res_pins"
)

print(f"Downloaded {len(saved_files)} images")

If you only need the image URLs (without downloading):

from pingrabber import PinGrabber

grabber = PinGrabber()
rss_url = grabber.build_rss_url("https://www.pinterest.com/username/boardname/")
rss_content = grabber.fetch_rss(rss_url)
image_urls = grabber.extract_image_urls(rss_content)

for url in image_urls:
    print(url)

How It Works

  1. Board/Pin URL to RSS Feed – The provided URL is converted to an RSS feed URL (appending .rss for boards, or using the pin’s RSS endpoint).
  2. Fetch RSS – The RSS content is retrieved via a requests GET request.
  3. Parse and Extract – BeautifulSoup with the lxml parser extracts all image tags inside the RSS items.
  4. Upgrade to Original – Thumbnail URLs (e.g., those with a 236x size segment) are rewritten to point at the corresponding originals path, which serves the highest available quality.
  5. Download – Each image is downloaded and saved to the specified output directory with a unique filename.
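
Steps 1, 3, and 4 above can be approximated in a few lines. This sketch is not PinGrabber's code: it swaps BeautifulSoup and lxml for the standard library so it runs without extra packages, the function names are hypothetical, and the feed layout (an escaped img tag inside each item's description) and the i.pinimg.com URL shape are assumptions.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

_IMG_SRC = re.compile(r'<img[^>]+src="([^"]+)"')
_SIZE_SEGMENT = re.compile(r"(//i\.pinimg\.com)/\d+x/")


def board_to_rss_url(board_url: str) -> str:
    """Step 1: append .rss to the board URL (trailing slash removed)."""
    return board_url.rstrip("/") + ".rss"


def to_original(image_url: str) -> str:
    """Step 4: replace a size segment such as /236x/ with /originals/."""
    return _SIZE_SEGMENT.sub(r"\1/originals/", image_url, count=1)


def image_urls_from_rss(rss_text: str) -> List[str]:
    """Step 3: collect img src values from every description element."""
    root = ET.fromstring(rss_text)
    urls: List[str] = []
    for description in root.iter("description"):
        # The parser has already unescaped the embedded HTML into plain text.
        for src in _IMG_SRC.findall(description.text or ""):
            urls.append(to_original(src))
    return urls
```

Fetching the feed (step 2) and saving the files (step 5) are left out here because they only involve a plain HTTP GET and a file write.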

Project Structure

pin_grabber/
├── pingrabber/
│   ├── __init__.py          # Package entry point
│   └── core.py              # Main logic (PinGrabber class + helper functions)
├── requirements.txt         # Python dependencies
├── README.md                # This file
├── setup.py                 # Packaging configuration
└── LICENSE                  # MIT License

Dependencies

  • Python 3.7+
  • requests – HTTP requests.
  • beautifulsoup4 – HTML/XML parsing.
  • lxml – Fast XML/HTML parser.

All dependencies are listed in requirements.txt and will be installed when using pip install . or the Git install command.


License

This project is released under the MIT License. See the LICENSE file for details.

Disclaimer: This tool is provided “as is”. You are solely responsible for ensuring that your usage complies with Pinterest’s Terms of Service and applicable copyright laws.


Built with care for developers who need a quick, clean Pinterest image scraper.
