
A Python package that returns Amazon product and search data as JSON. It requires no API keys because it works by scraping Amazon pages.

Project description

amazondata



Reference: How To Scrape Amazon Product Details and Pricing using Python

Install

pip install amazondata

Usage

To get Amazon product details from a product URL, use the following function.

get_product_from_url(url)

from amazondata.product_details_extractor import ProductDetailsExtractor

product_details_extractor = ProductDetailsExtractor()

data = product_details_extractor.get_product_from_url('https://www.amazon.in/dp/B09JSYVNZ2')

print(data)
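The description says the data comes back "in json form", but the exact return type (a JSON string or a parsed Python object) is not stated. As a minimal sketch under that uncertainty, the hypothetical helper `to_pretty_json` below accepts either and returns an indented string that is easier to read or write to a file. It is not part of amazondata.

```python
import json


def to_pretty_json(data):
    """Return an indented JSON string for `data`.

    Hypothetical helper, not part of amazondata. It accepts either a
    JSON string or an already-parsed Python object, because the exact
    return type of the extractor is an assumption here.
    """
    if isinstance(data, (str, bytes)):
        # Parse first so the output is re-indented consistently.
        data = json.loads(data)
    # ensure_ascii=False keeps non-ASCII characters (e.g. the rupee sign) readable.
    return json.dumps(data, indent=2, ensure_ascii=False)
```

For example, `print(to_pretty_json(data))` would replace the plain `print(data)` call. If `data` is a string that is not valid JSON, `json.loads` raises an error.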

To get Amazon product details from the ASIN (Amazon Standard Identification Number) code, use the following function.

get_product_from_asin_code(asin_code)

from amazondata.product_details_extractor import ProductDetailsExtractor

product_details_extractor = ProductDetailsExtractor()

data = product_details_extractor.get_product_from_asin_code('B09JSYVNZ2')

print(data)
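If you only have a product URL and want the ASIN for `get_product_from_asin_code`, the ASIN can usually be read from the URL path. The sketch below assumes the common `/dp/<ASIN>` and `/gp/product/<ASIN>` URL shapes and a 10-character alphanumeric ASIN; `asin_from_url` is a hypothetical helper, not part of amazondata, and other URL shapes return `None`.

```python
import re
from urllib.parse import urlparse

# Assumption: the ASIN is the 10-character uppercase alphanumeric segment
# that follows /dp/ or /gp/product/ in the URL path.
_ASIN_PATTERN = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:/|$)")


def asin_from_url(url):
    """Return the ASIN found in an Amazon product URL, or None."""
    path = urlparse(url).path  # drops the query string and fragment
    match = _ASIN_PATTERN.search(path)
    return match.group(1) if match else None
```

With the URL used earlier, `asin_from_url('https://www.amazon.in/dp/B09JSYVNZ2')` yields the string passed to `get_product_from_asin_code` above.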

To get a list of products for a search query, use the following function.

search(query, page)

from amazondata.search_result_extractor import SearchResultExtractor

search_result_extractor = SearchResultExtractor()

data = search_result_extractor.search('perfume for men', 3)

print(data)
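Since `search(query, page)` takes a page argument, several result pages can be collected in a loop. The sketch below is one way to do that; `search_pages` and its parameters are hypothetical, and it makes no assumption about the shape of each page's result, so it simply stores them by page number. The pause between requests is a courtesy guess, not a value documented by the package.

```python
import time


def search_pages(extractor, query, first_page=1, last_page=3, delay_seconds=2.0):
    """Call extractor.search(query, page) for each page in the range.

    Returns a dict mapping page number to whatever search() returned.
    Hypothetical helper, not part of amazondata.
    """
    results = {}
    for page in range(first_page, last_page + 1):
        results[page] = extractor.search(query, page)
        if page < last_page:
            # Pause between requests to reduce the chance of being blocked.
            time.sleep(delay_seconds)
    return results
```

Usage would look like `search_pages(SearchResultExtractor(), 'perfume for men', 1, 3)`.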

NOTE: Optionally, you can pass custom headers to all these functions. The default headers value is:

headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Sec-Fetch-Site": "none",
    "Host": "www.amazon.in",
    "Accept-Language": "en-IN,en-GB;q=0.9,en;q=0.8",
    "Sec-Fetch-Mode": "navigate",
    "Accept-Encoding": "gzip, deflate, br",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
    "Sec-Fetch-Dest": "document",
    "Priority": "u=0, i",
}
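When only one or two headers need to change, it can be convenient to start from the defaults and override individual entries. The sketch below shows one way to do that; `merge_headers` is a hypothetical helper, and the `headers=` keyword in the usage comment is an assumption, since the text above does not say how the custom headers are passed.

```python
def merge_headers(defaults, overrides):
    """Return a copy of `defaults` with `overrides` applied.

    Header names are compared case-insensitively, as in HTTP, so
    overriding "user-agent" replaces an existing "User-Agent" entry.
    Hypothetical helper, not part of amazondata.
    """
    merged = dict(defaults)  # leave the caller's dict untouched
    by_lower = {name.lower(): name for name in merged}
    for name, value in overrides.items():
        existing = by_lower.get(name.lower())
        if existing is not None:
            del merged[existing]
        merged[name] = value
        by_lower[name.lower()] = name
    return merged


# Possible usage (the `headers` keyword name is an assumption):
# custom = merge_headers(headers, {"User-Agent": "my-agent/1.0"})
# data = product_details_extractor.get_product_from_url(url, headers=custom)
```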

If Amazon blocks the scraper, you can fetch the HTML with Selenium and pass it to one of the following functions:

data = extract_search_results(html_code)
data = extract_product_details(html_code)
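Fetching the page source with Selenium might look like the sketch below. It assumes Selenium 4 and a local Chrome installation; `fetch_html` and its `driver_factory` parameter are hypothetical names introduced for illustration. Selenium is imported lazily, so the helper can be defined, and exercised with a stand-in driver, without a browser installed.

```python
def fetch_html(url, driver_factory=None):
    """Load `url` in a browser and return the rendered page source.

    `driver_factory` is any zero-argument callable returning an object
    with get(), page_source and quit(); it defaults to Selenium's Chrome
    driver. Hypothetical helper, not part of amazondata.
    """
    if driver_factory is None:
        # Imported here so Selenium is only required when actually used.
        from selenium import webdriver
        driver_factory = webdriver.Chrome
    driver = driver_factory()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

The returned string is the `html_code` argument for `extract_product_details` or `extract_search_results`. The text above does not say which module or class exposes those two functions, so check the package source before calling them.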
