Scraping and parsing Amazon

an

To install: pip install an

Amazon Scraping Library

Overview

This Python library scrapes and parses data from Amazon product pages. It extracts information such as sales ranks, product reviews, and product titles from Amazon's regional websites.

Installation

The package is published on PyPI and can be installed with pip install an. Alternatively, copy the code directly into your existing Python project's directory.

Dependencies

  • pandas
  • numpy
  • requests
  • BeautifulSoup (installed as beautifulsoup4, imported as bs4)
  • pymongo
  • matplotlib

Ensure these dependencies are installed in your environment.
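
One way to check the environment before using the library is sketched below. It is not part of this package: the helper name missing_modules is hypothetical, and the import names are assumed from the dependency list above (BeautifulSoup is imported as bs4).

```python
import importlib.util

# Import names assumed from the dependency list; BeautifulSoup is imported as bs4.
REQUIRED_MODULES = ["pandas", "numpy", "requests", "bs4", "pymongo", "matplotlib"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```

Any module reported as missing can then be installed with pip under its distribution name, for example beautifulsoup4 for bs4.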

Usage

Extracting Sales Rank

The library can extract the sales rank of a product from Amazon. For example:

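# Assumes the Amazon class from this package has already been imported.
asin = 'YOUR_PRODUCT_ASIN'  # Replace with a real product ASIN
country = 'co.uk'  # Change to desired Amazon region
sales_rank = Amazon.get_sales_rank(asin=asin, country=country)
print(sales_rank)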
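
The country value in the example looks like a domain suffix. The sketch below shows how such a suffix and an ASIN could be combined into a product page URL. It is an illustration only, not this library's implementation: product_url is a hypothetical helper, the /dp/ path is the commonly seen Amazon product URL form, and the ASIN used is a placeholder.

```python
import re

# ASINs are assumed here to be 10 uppercase alphanumeric characters.
ASIN_PATTERN = re.compile(r"[A-Z0-9]{10}")


def product_url(asin, country="com"):
    """Build a product page URL from an ASIN and a domain suffix such as 'co.uk'."""
    asin = asin.strip().upper()
    if not ASIN_PATTERN.fullmatch(asin):
        raise ValueError(f"not a valid-looking ASIN: {asin!r}")
    suffix = country.strip().lstrip(".").lower()
    return f"https://www.amazon.{suffix}/dp/{asin}"


# Placeholder ASIN, for illustration only.
print(product_url("B000000000", country="co.uk"))
```

Validating the ASIN before building the URL avoids sending requests for obviously malformed identifiers.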

Parsing Product Title

To parse and get the product title from an Amazon product page:

# Reuses asin and country from the sales rank example above.
html_content = Amazon.slurp(what='product_page', asin=asin, country=country)
title = Amazon.parse_product_title(html_content)
print(title)
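
To illustrate what title parsing involves, here is a minimal standard-library sketch that pulls the text of one element out of an HTML string. It is not the code behind parse_product_title: the class and function names are hypothetical, and the element id productTitle is an assumption about the page markup that may not hold for every page or region.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "wbr"}


class TitleExtractor(HTMLParser):
    """Collect the text inside the first element whose id equals target_id."""

    def __init__(self, target_id="productTitle"):
        super().__init__()
        self.target_id = target_id
        self._depth = 0  # greater than 0 while inside the target element
        self._chunks = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self._depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.done = True

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace; return None when nothing was found.
        text = " ".join("".join(self._chunks).split())
        return text or None


def extract_title(html, target_id="productTitle"):
    parser = TitleExtractor(target_id)
    parser.feed(html)
    parser.close()
    return parser.title


sample = '<div><span id="productTitle">  Example Widget, <b>Blue</b>  </span></div>'
print(extract_title(sample))
```

In a project that already depends on BeautifulSoup, the same lookup is usually shorter with that library; the standard-library parser is used here only to keep the sketch dependency-free.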

Getting Number of Reviews

To retrieve the number of customer reviews for a product:

# Reuses asin and country from the sales rank example above.
number_of_reviews = Amazon.get_number_of_reviews(asin=asin, country=country)
print(number_of_reviews)
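
Review counts on a page typically appear as a text label rather than a bare number, and the thousands separator differs between regions. The sketch below shows one way such a label could be turned into an integer. It is an assumption-laden illustration, not how get_number_of_reviews works: parse_review_count is a hypothetical helper, the sample labels are made up, and the input is assumed to contain only the count (not a star rating such as 4.5).

```python
import re

# A digit followed by further digits or common thousands separators
# (comma, period, space, non-breaking space).
_NUMBER = re.compile(r"\d[\d.,\u00a0 ]*")


def parse_review_count(label):
    """Return the first number in a count label as an int, or None if absent."""
    match = _NUMBER.search(label)
    if match is None:
        return None
    # Drop every non-digit so '1,234' and '1.234' both become 1234.
    return int(re.sub(r"\D", "", match.group()))


for label in ["1,234 ratings", "1.234 Sternebewertungen", "No customer reviews yet"]:
    print(label, "->", parse_review_count(label))
```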

Contributing

Contributions to this library are welcome. Please send pull requests with improvements or bug fixes.

Download files

Source Distribution

an-0.0.7.tar.gz (10.4 kB)

Uploaded Source

Built Distribution

an-0.0.7-py3-none-any.whl (10.6 kB)

Uploaded Python 3

File details

Details for the file an-0.0.7.tar.gz.

File metadata

  • Download URL: an-0.0.7.tar.gz
  • Upload date:
  • Size: 10.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for an-0.0.7.tar.gz
Algorithm Hash digest
SHA256 a2fef95f8701526d40903677084ff0a01a1896dceb9ddc44d6f4d97011fb17e8
MD5 a699a8651f485b0b4904063a44a8c355
BLAKE2b-256 6c0974f678b979b6a2a1ccb7626d2a82b0c1cdba49b1440469c56d9df1ad1b7d

File details

Details for the file an-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: an-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 10.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for an-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 ff692ec27977f09002eacc844923580464afd45d6254819002538153821f3cd2
MD5 5c295c19afd01d6db220b19764f37897
BLAKE2b-256 2f61f4e1b32732ff9b1c313f300db0c5e8ef8894a2b2a28396cb2ce478327d81
