A Python package for scraping images from Pinterest
Pinterest Scrapper
A Python package that scrapes images from Pinterest using Playwright and includes a command-line interface.
Features
- Search Pinterest for images with customizable queries
- Authenticate to access additional content (optional)
- Save search results as JSON files with image URLs, pin URLs, and descriptions
- Download images with customizable naming schemes
- Generate HTML galleries to view downloaded images
- Rate limit handling and retry mechanisms
- Command-line interface for easy usage
- Persistent browser sessions for better performance
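The package's own retry logic is not shown in this description. As a rough illustration of what "rate limit handling and retry mechanisms" usually means, here is a minimal sketch of retrying an async operation with exponential backoff and jitter. The helper name `retry_with_backoff` and its parameters are hypothetical and not part of the package.

```python
import asyncio
import random


async def retry_with_backoff(operation, max_attempts=4, base_delay=1.0, max_delay=30.0):
    """Await operation(); on failure, wait with exponential backoff and try again.

    Hypothetical helper for illustration only, not the package's implementation.
    """
    for attempt in range(max_attempts):
        try:
            return await operation()
        except Exception:
            # Out of attempts: re-raise the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Delay doubles on each failed attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter keeps repeated retries from firing in lockstep.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))


async def demo():
    """Simulate an operation that fails twice before succeeding."""
    calls = {"count": 0}

    async def flaky():
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("simulated rate limit")
        return "ok"

    result = await retry_with_backoff(flaky, base_delay=0.01)
    return result, calls["count"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In real use the `except` clause would normally be narrowed to the specific errors that signal a temporary failure, so that programming errors are not retried.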
Installation
Prerequisites
- Python 3.8 or higher
- pip (Python package manager)
Install from PyPI
pip install pinterest-scrapper
Install from source
git clone https://github.com/hanspaa2017108/pinterest-scraper.git
cd pinterest-scraper
pip install -e .
Install Playwright browsers
After installing the package, you need to install the Playwright browsers:
playwright install chromium
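Before running the scraper, it can help to confirm that the prerequisites above are in place. The sketch below uses only the standard library; `check_environment` is a hypothetical helper name. It checks the Python version and whether the `playwright` module is importable. It does not verify that the Chromium browser binaries have been downloaded.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 8)):
    """Report whether the documented prerequisites appear to be satisfied."""
    return {
        # The README lists Python 3.8 or higher as a prerequisite.
        "python_ok": sys.version_info >= min_version,
        # find_spec returns None when a top-level module cannot be found.
        "playwright_installed": importlib.util.find_spec("playwright") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```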
Usage
Command Line Interface
Interactive Mode
pinterest-scrapper interactive
This will guide you through the scraping process with interactive prompts.
Direct Scraping
pinterest-scrapper scrape "indoor plants" --max-pins 200 --download --category plants
Full Options
# --max-pins       Maximum number of pins to collect
# --max-scrolls    Maximum number of page scrolls
# --scroll-pause   Pause between scrolls, in seconds
# --output         Custom output directory
# --user-data-dir  Directory for browser user data
# --email          Pinterest account email for login
# --password       Pinterest account password for login
# --download       Download images
# --category       Filename prefix for downloaded images
# --limit          Maximum number of images to download
pinterest-scrapper scrape "search query" \
  --max-pins 300 \
  --max-scrolls 50 \
  --scroll-pause 2.0 \
  --output "my_output_dir" \
  --user-data-dir "browser_data" \
  --email "your@email.com" \
  --password "yourpassword" \
  --download \
  --category "custom_name" \
  --limit 50
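When the CLI is driven from a script, the options above can be assembled programmatically. The sketch below builds the argument list from keyword options, using only flags that appear in the option list above. The helper name `build_scrape_command` and the environment variable names `PINTEREST_EMAIL` and `PINTEREST_PASSWORD` are assumptions made for illustration; the package does not define them.

```python
import os
import shlex


def build_scrape_command(query, max_pins=None, output=None, download=False,
                         category=None, limit=None, email=None, password=None):
    """Assemble the argument list for `pinterest-scrapper scrape`.

    Hypothetical helper; options left as None are omitted from the command.
    """
    command = ["pinterest-scrapper", "scrape", query]
    valued_flags = {
        "--max-pins": max_pins,
        "--output": output,
        "--category": category,
        "--limit": limit,
        "--email": email,
        "--password": password,
    }
    for flag, value in valued_flags.items():
        if value is not None:
            command += [flag, str(value)]
    if download:
        # --download is a switch and takes no value.
        command.append("--download")
    return command


if __name__ == "__main__":
    command = build_scrape_command(
        "indoor plants",
        max_pins=200,
        download=True,
        category="plants",
        email=os.environ.get("PINTEREST_EMAIL"),        # assumed variable name
        password=os.environ.get("PINTEREST_PASSWORD"),  # assumed variable name
    )
    # Print the quoted command; subprocess.run(command, check=True) would execute it.
    print(shlex.join(command))
```

Reading credentials from the environment keeps them out of shell history and checked-in scripts, although they are still passed to the process as command-line arguments.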
Python API
import asyncio
from pinterest_scrapper import PinterestScraper


async def main():
    # Initialize the scraper
    scraper = await PinterestScraper().initialize()
    try:
        # Optional: Log in to Pinterest
        await scraper.login("your@email.com", "yourpassword")

        # Scrape search results
        results = await scraper.scrape_search(
            query="home decor",
            max_pins=100,
            output_dir="pinterest_data"
        )

        # Download images
        if results:
            await scraper.download_images(
                data=results,
                folder="pinterest_data/images",
                category_name="home_decor",
                limit=50
            )

            # Create HTML gallery
            await scraper.create_image_index(
                folder="pinterest_data/images",
                category_name="home_decor",
                total_images=50
            )
    finally:
        # Close the browser
        await scraper.close()


if __name__ == "__main__":
    asyncio.run(main())
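Search results are saved as JSON with image URLs, pin URLs, and descriptions, so they can be post-processed without re-scraping. The exact file layout is not documented here. The sketch below assumes a JSON array of objects with an `image_url` key, which is a guess at the field name; adjust it to match the files the scraper actually writes. The function name `load_unique_pins` is also hypothetical.

```python
import json
from pathlib import Path


def load_unique_pins(json_path, url_key="image_url"):
    """Load saved pins and drop entries whose image URL is missing or repeated.

    Assumes the file holds a JSON array of objects; url_key is an assumed field name.
    """
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    seen = set()
    unique = []
    for record in records:
        url = record.get(url_key)
        if url and url not in seen:
            seen.add(url)
            unique.append(record)
    return unique
```

A call such as `load_unique_pins("pinterest_data/results.json")` would return the de-duplicated records in their original order; the file name here is only a placeholder.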
Notes
- Pinterest may change its website structure over time, which could break this scraper.
- Use this package responsibly and respect Pinterest's terms of service.
- Add delays between requests (for example, a longer --scroll-pause) to reduce the risk of rate limiting or IP bans.
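One way to follow the advice on delays when scraping several queries is to pause for a random interval between them. The sketch below takes any object that provides the `scrape_search(query=..., max_pins=..., output_dir=...)` coroutine shown in the Python API example. The function name `scrape_queries` and the pause values are assumptions, not recommendations from the package.

```python
import asyncio
import random


async def scrape_queries(scraper, queries, max_pins=100, output_dir="pinterest_data",
                         min_pause=5.0, max_pause=10.0):
    """Run scrape_search for each query, sleeping a random interval between queries."""
    results = {}
    for index, query in enumerate(queries):
        if index > 0:
            # Randomized pauses avoid a fixed, easily detected request rhythm.
            await asyncio.sleep(random.uniform(min_pause, max_pause))
        results[query] = await scraper.scrape_search(
            query=query, max_pins=max_pins, output_dir=output_dir
        )
    return results
```

Inside `main()` from the API example, this could be called as `await scrape_queries(scraper, ["home decor", "indoor plants"])` after the scraper has been initialized.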
License
This project is licensed under the MIT License - see the LICENSE file for details.
Download files
File details
Details for the file pinterest_scrapper-1.tar.gz.
File metadata
- Download URL: pinterest_scrapper-1.tar.gz
- Upload date:
- Size: 14.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 699e63e8f31628f3f5d595e77ef0bce10f8fa90af2742e9e62c9ba7e396f6f3c |
| MD5 | c627b34a1ff917c8d0a07509b1325861 |
| BLAKE2b-256 | ac77e99068f0e20d8692e9bc057b026459aaed63de947a2fed38cd0015dea1e2 |
File details
Details for the file pinterest_scrapper-1-py3-none-any.whl.
File metadata
- Download URL: pinterest_scrapper-1-py3-none-any.whl
- Upload date:
- Size: 14.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 641724a246d005c217e7f420d35021b649558a6ef77b5ff04f613207599c2346 |
| MD5 | f554ebd78bd92cf43ab51fee52b98707 |
| BLAKE2b-256 | 5d08d514c49b2a26c2bfe653f629aee80801b68361ae75ea24a39b84b4352016 |