Amazon Product Scraper

This project is a web scraping tool that extracts data from Amazon. It can gather details about a specific product, collect lists of products matching a search keyword, or fetch product listings from a direct URL, and it solves CAPTCHAs automatically along the way.


🔍 Features

  • Search by keyword: Provide a search term and the number of result pages to scrape. It returns every product found on those pages.
  • Get product details: Supply a product URL and receive detailed information like:
    • Title
    • Price
    • Description
    • Features
    • Rating
    • Number of reviews
  • Extract product list by link: Given a category or listing page URL, it fetches all the product entries up to the page limit.
  • Automatic CAPTCHA Bypass: Solves Amazon CAPTCHAs automatically to allow seamless scraping.
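
The detail fields listed under "Get product details" could be modeled as a small record type. This is a sketch only: the `ProductDetail` class, its field names, and its types are assumptions made for illustration, not the structure the scraper actually returns.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class ProductDetail:
    """Hypothetical container for the detail fields (not part of the package)."""
    title: str
    price: Optional[str] = None          # kept as display text, e.g. "$19.99"
    description: Optional[str] = None
    features: List[str] = field(default_factory=list)
    rating: Optional[float] = None
    review_count: Optional[int] = None


# Made-up values, used only to show how the record would be filled in.
example = ProductDetail(
    title="Example product",
    price="$19.99",
    features=["Feature one", "Feature two"],
    rating=4.5,
    review_count=120,
)
print(asdict(example))
```

Fields that a page does not expose simply stay at their defaults (`None` or an empty list), which keeps downstream code from failing on missing keys.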

🚀 Technologies Used

  • Selenium: For browser automation and interaction with dynamic content.
  • BeautifulSoup: For parsing and extracting data from HTML content.
  • Pillow (PIL): Used to process and solve CAPTCHA images.
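
As a rough illustration of how the first two pieces fit together: Selenium renders the page and exposes its HTML (through `driver.page_source`), and BeautifulSoup extracts fields from that HTML. The sketch below skips the browser and parses a static string instead. The markup, the CSS class names, and the `parse_listing` helper are made up for the example; they do not reflect Amazon's real page structure or this package's internals.

```python
from bs4 import BeautifulSoup

# Stand-in for the HTML that Selenium's driver.page_source would return.
SAMPLE_HTML = """
<div class="product"><h2>Example Laptop</h2><span class="price">$999.99</span></div>
<div class="product"><h2>Example Mouse</h2><span class="price">$19.99</span></div>
"""


def parse_listing(html):
    """Extract title and price from each hypothetical product card."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for card in soup.select("div.product"):
        title = card.select_one("h2")
        price = card.select_one("span.price")
        products.append({
            "title": title.get_text(strip=True) if title else None,
            "price": price.get_text(strip=True) if price else None,
        })
    return products


products = parse_listing(SAMPLE_HTML)
print(products)
```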

📖 How to Use

1. Initialize the Scraper

from amazon_scraper import AmazonScraper

scraper = AmazonScraper()  # Initializes and runs the Chrome driver

2. Solve CAPTCHA

scraper.bypass_captcha()

Once the success message appears, the CAPTCHA has been solved and you can call the other methods.

3. Search Products by Keyword

results = scraper.get_product_by_search("laptop", page_limit=2)

This returns a dictionary of the products found on the first 2 pages of results for the search term "laptop".
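
Since the search returns a dictionary, one way to keep the results is to write them to a JSON file. The sketch below uses a hand-written sample in place of a real `results` value. Its keys and nested fields are assumptions, because the exact shape of the returned dictionary is not documented here, and the `save_results` and `load_results` helpers are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the dictionary returned by get_product_by_search.
sample_results = {
    "0": {"title": "Example Laptop A", "price": "$999.99"},
    "1": {"title": "Example Laptop B", "price": "$1,299.00"},
}


def save_results(results, path):
    """Write a results dictionary to disk as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(results, fh, ensure_ascii=False, indent=2)


def load_results(path):
    """Read a results dictionary back from a JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


out_path = os.path.join(tempfile.gettempdir(), "laptop_results.json")
save_results(sample_results, out_path)
restored = load_results(out_path)
print(len(restored), "products saved to", out_path)
```

This only works if every value in the dictionary is JSON-serializable (strings, numbers, lists, nested dictionaries); anything else would need converting first.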

4. Get Product List by Link

product_list = scraper.get_product_list_by_link("https://www.amazon.com/s?k=smartphones", page_limit=2)

This scrapes product listings from the given URL, up to 2 pages.
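
The URL in the example above is a plain search URL with the keyword in the `k` query parameter. If you need to build such URLs for several keywords, the standard library can handle the encoding. `build_search_url` is a hypothetical helper written for this sketch, not part of the scraper.

```python
from urllib.parse import urlencode

BASE_SEARCH_URL = "https://www.amazon.com/s"


def build_search_url(keyword):
    """Return a search URL with the keyword URL-encoded into the k parameter."""
    return f"{BASE_SEARCH_URL}?{urlencode({'k': keyword})}"


print(build_search_url("smartphones"))          # https://www.amazon.com/s?k=smartphones
print(build_search_url("wireless headphones"))  # https://www.amazon.com/s?k=wireless+headphones
```

The resulting string can then be passed as the first argument to `get_product_list_by_link`.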

5. Get Detailed Product Info

product_details = scraper.get_detail_product_by_link("https://www.amazon.com/dp/B0...example")

Returns detailed product information such as title, price, rating, features, and more.
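
Scraped values such as price and rating often arrive as display text. If that is the case here (an assumption, since the return types are not documented above), a small post-processing step can turn them into numbers. The helper names and the sample strings below are illustrative only.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first number in a price string as a Decimal, or None."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text or "")
    if not match:
        return None
    # Drop thousands separators before converting.
    return Decimal(match.group(0).replace(",", ""))


def parse_rating(text):
    """Return the first number in a rating string as a float, or None."""
    match = re.search(r"\d+(?:\.\d+)?", text or "")
    return float(match.group(0)) if match else None


def parse_review_count(text):
    """Return the digits of a review-count string as an int, or None."""
    digits = re.sub(r"\D", "", text or "")
    return int(digits) if digits else None


print(parse_price("$1,299.99"))
print(parse_rating("4.5 out of 5 stars"))
print(parse_review_count("1,234 ratings"))
```

Using `Decimal` for prices avoids the rounding surprises of binary floats when values are later summed or compared.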


🙏 Support and Contributions

If you have a feature request or find a bug, feel free to open an issue or pull request on GitHub. I’m actively maintaining this project and happy to improve it based on your feedback.

If you find this project helpful, please consider giving it a ⭐ on GitHub — it means a lot!


Happy Scraping! 🤖
