Ecommerce scrapers library
Ecommerce Retailer Scraper
The purpose of this project is to create a library of spiders/scrapers for major online ecommerce retailers such as Amazon, Sephora, Walmart, and many other brands. The library returns product information and reviews in JSON format.
You can use the extracted data for market research, product design, studying consumer buying impulses, and more. The sky is the limit.
Instructions:
- How to install:
pip install ecom-scraper
- How to use:
from retail_scraper.spiders.sephora import Sephora
# The Sephora spider takes either a product URL or a product ID:
# Sephora(url=url) or Sephora(productid=product_id)
url = 'https://www.sephora.com/product/huda-beauty-liquid-matte-ultra-comfort-transfer-proof-lipstick-P479843'
product_id = 'P479843'
sephora = Sephora(url=url)
# Or
sephora = Sephora(productid=product_id)
sephora.scrap_product_info() # Scrape the product information
info = sephora.product_info # Product info and its variants are stored in product_info
sephora.scrap_product_reviews() # Scrape the product reviews
reviews = sephora.product_reviews # All product reviews are stored in product_reviews
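Since the scraped data is meant to be used as JSON, it can be handy to write both results to a file. The following is a minimal sketch, not part of the library: `save_results` is a hypothetical helper, and it assumes that `product_info` and `product_reviews` hold plain JSON-serializable values (dicts, lists, strings, numbers). The sample keys are placeholders rather than the library's actual schema.

```python
import json
import tempfile
from pathlib import Path


def save_results(info, reviews, path):
    """Write scraped product info and reviews to a single JSON file.

    Hypothetical helper; assumes both arguments are JSON-serializable.
    """
    payload = {"product_info": info, "product_reviews": reviews}
    out = Path(path)
    out.write_text(
        json.dumps(payload, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return out


# Stand-in data so the sketch runs without network access.
# In practice, pass sephora.product_info and sephora.product_reviews.
# These keys are placeholders, not the library's real output format.
sample_info = {"product_id": "P479843", "variants": []}
sample_reviews = [{"rating": 5, "text": "Great lipstick"}]

target = Path(tempfile.gettempdir()) / "P479843.json"
saved = save_results(sample_info, sample_reviews, target)

# Read the file back to confirm the round trip.
loaded = json.loads(saved.read_text(encoding="utf-8"))
```

If the real objects contain values that `json.dumps` cannot serialize, such as dates, they would need to be converted first.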
Supported Scrapers
Add a New Spider or Feature
If you want to add a spider/scraper or a new feature to the app, please use the link below or open an issue in this GitHub repo. The most upvoted features will be added to the app.
Upcoming Scrapers/Spiders
Contribution
You are most welcome to contribute to this project and create pull requests.
Credit
- @diemonster for all his comments, feedback, and instructions.
- Everyone in the #python community on Libera IRC
Disclaimer
This library is built for educational purposes ONLY. Use at your own risk.
Hashes for retail_scraper-1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | c3042bef764e0ddc54c9dfb206a652753c0799c207d7504abfeed0fa3077ddce
MD5 | 13f90d301013789abec18288bf4365df
BLAKE2b-256 | 427854980235a9fb208fcdad16a2b88b20db54b5ec0ef7bdd359482f5db783fa