A Parser for Amazon Pages

Project description

AmazonParser

Python Library for Parsing Amazon Pages

Description

AmazonParser is a Python library for parsing product information from Amazon product pages. It extracts data such as the product title, price, and ratings. Extraction relies mostly on XPath and regular expressions, which keeps the parser modular and configurable.
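
To illustrate the XPath-plus-regex approach described above, here is a minimal sketch. It is not the library's implementation: the snippet, the FIELD_RULES table, and the extract_fields helper are all hypothetical, and it uses the limited XPath subset in the standard library's xml.etree.ElementTree on a well-formed snippet so that it runs without extra dependencies. Real product pages are rarely well-formed XML, so lxml would be needed for them.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a product page.
SNIPPET = """
<div id="product">
  <span id="productTitle"> Example Screen Protector, 3-Pack </span>
  <span class="price">AED 30.99</span>
</div>
"""

# Hypothetical configuration: field name -> (XPath, optional regex).
# Keeping the rules in data rather than code is what makes the
# approach easy to adjust when the page layout changes.
FIELD_RULES = {
    "title": (".//span[@id='productTitle']", None),
    "price": (
        ".//span[@class='price']",
        r"(?P<currency>[A-Z]{3})\s*(?P<value>\d+(?:\.\d+)?)",
    ),
}


def extract_fields(markup, rules):
    """Locate each field with XPath, then refine its text with a regex."""
    root = ET.fromstring(markup)
    result = {}
    for name, (xpath, pattern) in rules.items():
        text = root.findtext(xpath)
        if text is None:
            # The XPath matched nothing on this page.
            result[name] = None
            continue
        text = text.strip()
        if pattern is None:
            # No regex configured: keep the cleaned text as-is.
            result[name] = text
            continue
        match = re.search(pattern, text)
        result[name] = match.groupdict() if match else None
    return result


fields = extract_fields(SNIPPET, FIELD_RULES)
print(fields)
```

In this sketch the price value stays a string; converting it to a number is left to the caller.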

Prerequisites

  • Python 3.6 or higher
  • lxml library: pip install lxml

Installation

You can install the library using pip:

pip install AmazonParser

Usage

Here is an example of how to use the AmazonParser module:

# Assumption: AmazonAEProductPageParser is importable from the top-level
# package; adjust the import if the class lives in a submodule.
from amazonparser import AmazonAEProductPageParser

# Load a saved product page from disk
path = 'tests/archives/page-ASIN.html'
html = AmazonAEProductPageParser.get_html_from_file(path)

# Create a parser for the page
parser = AmazonAEProductPageParser(html=html, base_url="https://www.amazon.ae/")

# Print the parsed data
print(parser.get_product_details())

Example Output

The get_product_details method returns a dictionary with the following structure:

{'best_sellers_rank': [{'category': 'Mobile Phones & Communication Products',
                        'category_url': 'https://...',
                        'rank': 5},
                       {'category': 'Mobile Phone Screen Protectors',
                        'category_url': 'https://...',
                        'rank': 2}],
 'bought_past_mounth': '500+',
 'brand': 'JETech',
 'bullet_points': 'STRING',
 'customers_reviews': {'count': 21049, 'rate': 4.3},
 'date_first_available': datetime.date(2024, 8, 6),
 'image': 'https://m.media-amazon.com/images/I/71B7WFLtovL._AC_SL1500_.jpg',
 'price': {'currency': 'AED', 'value': 30.99},
 'product_bundles': {'B09BVR4LFY': 'iPhone 13/13 Pro 6.1-Inch',
                     'B09BZ2YD6F': 'iPhone 13 Pro Max 6.7-Inch',
                     'B0B2L6R586': 'iPhone 12/12 Pro 6.1-Inch',
                     'B0B2RQP8MK': 'iPhone 12 Pro Max 6.7-Inch',
                     'B0DBZNC8DL': 'iPhone 16 Pro 6.3-Inch',
                     'B0DBZPXJRH': 'iPhone 16 Pro Max 6.9-Inch',
                     'B0DBZQ2WR3': 'iPhone 16 Plus 6.7-Inch',
                     'B0DBZR3TX7': 'iPhone 16 6.1-Inch'},
 'seller_detail': {'seller_id': 'A11TDSN2MJL3GW',
                   'seller_name': 'JE Products AE',
                   'seller_profile_url': 'https://www.amazon.ae/sp/?seller=A11TDSN2MJL3GW'},
 'stock_availability': {'quantity': 50, 'status': True},
 'title': 'JETech Screen Protector for iPhone 16 Pro Max 6.9-Inch, Tempered '
          'Glass Film with Easy Installation Tool, Case-Friendly, HD Clear, '
          '3-Pack'}
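
Because the returned dictionary contains a datetime.date value, passing it straight to json.dumps raises a TypeError. The following sketch shows one way to serialize it, using an abbreviated copy of the example output above as stand-in data. The encode_extra helper is a hypothetical name and not part of AmazonParser.

```python
import datetime
import json

# Abbreviated copy of the example output, used as stand-in data.
product_details = {
    "brand": "JETech",
    "customers_reviews": {"count": 21049, "rate": 4.3},
    "date_first_available": datetime.date(2024, 8, 6),
    "price": {"currency": "AED", "value": 30.99},
    "stock_availability": {"quantity": 50, "status": True},
}


def encode_extra(value):
    """Convert values that the json module cannot handle on its own."""
    if isinstance(value, datetime.date):
        # ISO 8601 keeps the date sortable and unambiguous.
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


serialized = json.dumps(product_details, default=encode_extra, sort_keys=True)
restored = json.loads(serialized)
print(restored["date_first_available"])  # 2024-08-06
```

After the round trip the date is a plain string, so code that reads the JSON back needs to parse it again, for example with datetime.date.fromisoformat.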

Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

Download files

Source Distribution

amazonparser-0.1.6.tar.gz (6.8 kB)

Uploaded Source

Built Distribution

AmazonParser-0.1.6-py3-none-any.whl (8.1 kB)

Uploaded Python 3

File details

Details for the file amazonparser-0.1.6.tar.gz.

File metadata

  • Download URL: amazonparser-0.1.6.tar.gz
  • Upload date:
  • Size: 6.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for amazonparser-0.1.6.tar.gz
  • SHA256: 4b192b83de7695ab734608dc93d2683b93657cc9bd6315763d2478b7958a0a25
  • MD5: 83288826bbc89213f328df3da93ca3fc
  • BLAKE2b-256: 0c55f02267fb6622d4009c5a597c10185ae3d7416ca19ddf15a2fc963be0d456
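
A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch: the helper names sha256_of_file and matches_published_digest are hypothetical, and the usage comment assumes the archive has been saved in the current directory.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's digest with the published one."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


# Usage, assuming the source distribution was downloaded locally:
# matches_published_digest(
#     "amazonparser-0.1.6.tar.gz",
#     "4b192b83de7695ab734608dc93d2683b93657cc9bd6315763d2478b7958a0a25",
# )
```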

Provenance

The following attestation bundles were made for amazonparser-0.1.6.tar.gz:

Publisher: python-publish.yml on a4fr/AmazonParser

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file AmazonParser-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: AmazonParser-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 8.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for AmazonParser-0.1.6-py3-none-any.whl
  • SHA256: 2f501543cda775512d740847a4c17bd78c13e9c1bd40b30f5db630635ad2895d
  • MD5: 8cbd54173f280d62be760d9e17772d85
  • BLAKE2b-256: dfbf9d89c50f307b5cf8d319b08bda63c3a0841fe3553d56f887f36eb1a6b6e1

Provenance

The following attestation bundles were made for AmazonParser-0.1.6-py3-none-any.whl:

Publisher: python-publish.yml on a4fr/AmazonParser
