Free Trial Amazon Scraper API for extracting search, product, offer listing, reviews, questions and answers, best sellers, and sellers data.

Project description

Amazon Scraper

Oxylabs' Amazon Scraper API lets you scrape publicly available data from any page on Amazon, such as reviews, pricing, and product information. To test it, you can sign up for a free trial on the Oxylabs website.

Overview

Below is a quick overview of all the available data source values we support with Amazon.

| Source | Description | Structured data |
|---|---|---|
| amazon | Submit any Amazon URL you like. | Depends on the URL. |
| amazon_bestsellers | List of best seller items in a taxonomy node of your choice. | Yes |
| amazon_pricing | List of offers available for an ASIN of your choice. | Yes |
| amazon_product | Product page of an ASIN of your choice. | Yes |
| amazon_questions | Q&A page of an ASIN of your choice. | Yes |
| amazon_reviews | Reviews page of an ASIN of your choice. | Yes |
| amazon_search | Search results for a search term of your choice. | Yes |
| amazon_sellers | Seller information of a seller of your choice. | Yes |

URL

The amazon source is designed to retrieve the content from various Amazon URLs. Instead of sending multiple parameters, you can provide us with a direct URL to the required Amazon page. We do not strip any parameters or alter your URLs in any way.
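
Because the URL is sent as given, any query string you need has to be part of the url value itself. The sketch below is only an illustration of assembling such a URL with the standard library before placing it in the payload. The helper name build_url_payload is hypothetical, and the th and psc query parameters are used purely as sample values.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_url_payload(base_url, params=None, parse=True):
    """Build a payload for the 'amazon' source (hypothetical helper).

    Extra query parameters are encoded into the URL itself, because the
    URL is documented as being passed on without modification.
    """
    parts = urlsplit(base_url)
    query = parts.query
    if params:
        extra = urlencode(params)
        # Append to an existing query string, or start a new one.
        query = f'{query}&{extra}' if query else extra
    url = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )
    return {'source': 'amazon', 'url': url, 'parse': parse}


payload = build_url_payload(
    'https://www.amazon.co.uk/dp/B0BDJ279KF',
    params={'th': 1, 'psc': 1},
)
print(payload['url'])
# https://www.amazon.co.uk/dp/B0BDJ279KF?th=1&psc=1
```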

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | N/A |
| url | Direct URL (link) to Amazon page | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data, as long as the URL submitted is for one of the page types we can parse. | false |

- required parameter

Python code example

In the code example below, we make a request to retrieve the Amazon product page for B0BDJ279KF on the amazon.co.uk marketplace.

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon',
    'url': 'https://www.amazon.co.uk/dp/B0BDJ279KF',
    'parse': True
}

# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('YOUR_USERNAME', 'YOUR_PASSWORD'),  # Your credentials go here
    json=payload,
)

# Instead of response with job status and results url, this will return the
# JSON response with results.
pprint(response.json())

To see the response example with retrieved data, download this sample output in JSON format.
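
All examples on this page hard-code placeholder credentials in the auth tuple. One common alternative, sketched below, is to read them from environment variables so they stay out of source control. The variable names OXYLABS_USERNAME and OXYLABS_PASSWORD and the helper load_credentials are assumptions made for this example; the API itself does not define them.

```python
import os


def load_credentials(environ=None):
    """Return a (username, password) tuple read from environment variables.

    OXYLABS_USERNAME and OXYLABS_PASSWORD are hypothetical names chosen
    for this sketch. Pass a dict as `environ` to override os.environ.
    """
    env = os.environ if environ is None else environ
    try:
        return env['OXYLABS_USERNAME'], env['OXYLABS_PASSWORD']
    except KeyError as missing:
        raise RuntimeError(
            f'Missing environment variable: {missing.args[0]}'
        ) from None


# Demonstration with an explicit mapping instead of the real environment.
auth = load_credentials(
    {'OXYLABS_USERNAME': 'YOUR_USERNAME', 'OXYLABS_PASSWORD': 'YOUR_PASSWORD'}
)
```

The resulting tuple can be passed as auth=load_credentials() in place of the literal tuple in the request call.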

Search

The amazon_search source is designed to retrieve Amazon search result pages.

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | amazon_search |
| domain | Domain localization for Amazon. The full list of available domains can be found here. | com |
| query | UTF-encoded keyword | - |
| start_page | Starting page number | 1 |
| pages | Number of pages to retrieve | 1 |
| geo_location | The Deliver to location. See our guide to using this parameter here. | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data. | - |
| context: category_id | Search for items in a particular browse node (product category). | - |
| context: merchant_id | Search for items sold by a particular seller. | - |

- required parameter

Python code example

In the code example below, we make a request to retrieve search results for the term adidas on the amazon.nl marketplace. We ask for 10 pages starting from page 11, and limit the search to items in category ID 16391843031 that are sold by the seller with ID 3AA17D2BRD4YMT0X.

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon_search',
    'domain': 'nl',
    'query': 'adidas',
    'start_page': 11,
    'pages': 10,
    'parse': True,
    'context': [
        {'key': 'category_id', 'value': 16391843031},
        {'key': 'merchant_id', 'value': '3AA17D2BRD4YMT0X'},
    ],
}


# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())

To see the response example with retrieved data, download this sample output file in JSON format.
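
The start_page and pages parameters together select a run of result pages. Assuming the pages are consecutive (so start_page 11 with pages 10 would cover pages 11 to 20, which this page does not state explicitly), a large range can be split into several smaller requests. The sketch below shows the arithmetic; the helper names and the batch size are illustrative.

```python
def split_page_range(start_page, pages, batch_size):
    """Split a page range into (start_page, pages) batches.

    Assumes the requested pages are consecutive, starting at start_page.
    """
    if start_page < 1 or pages < 1 or batch_size < 1:
        raise ValueError('start_page, pages and batch_size must be >= 1')
    batches = []
    end = start_page + pages  # first page number NOT included
    current = start_page
    while current < end:
        size = min(batch_size, end - current)
        batches.append((current, size))
        current += size
    return batches


def search_payloads(query, domain, start_page, pages, batch_size=5):
    """Build one amazon_search payload per batch (hypothetical helper)."""
    return [
        {
            'source': 'amazon_search',
            'domain': domain,
            'query': query,
            'start_page': first,
            'pages': count,
            'parse': True,
        }
        for first, count in split_page_range(start_page, pages, batch_size)
    ]


for item in search_payloads('adidas', 'nl', start_page=11, pages=10, batch_size=4):
    print(item['start_page'], item['pages'])
# 11 4
# 15 4
# 19 2
```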

Product

The amazon_product data source is designed to retrieve Amazon product pages.

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | amazon_product |
| domain | Domain localization for Amazon. The full list of available domains can be found here. | com |
| query | 10-symbol ASIN code | - |
| geo_location | The Deliver to location. See our guide to using this parameter here. | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data. | - |
| context: autoselect_variant | To get accurate pricing/buybox data, set this parameter to true (which tells us to append the th=1&psc=1 URL parameters to the end of the product URL). To get an accurate representation of the parent ASIN's product page, omit this parameter or set it to false. | false |

- required parameter

Python code example

In the code example below, we make a request to retrieve the product page for ASIN B09RX4KS1G on the amazon.nl marketplace. If the ASIN provided is a parent ASIN, we ask Amazon to return the product page of an automatically selected variation.

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon_product',
    'domain': 'nl',
    'query': 'B09RX4KS1G',
    'parse': True,
    'context': [
        {'key': 'autoselect_variant', 'value': True},
    ],
}


# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())

To see the response example with retrieved data, download this sample output file in JSON format.
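
To make the effect of autoselect_variant concrete, the sketch below reconstructs the kind of product URL that the parameter description implies. It is an illustration only: the /dp/<ASIN> path is borrowed from the URL example earlier on this page, and how the service actually builds its URLs internally is an assumption.

```python
def product_url(asin, domain='com', autoselect_variant=False):
    """Return an illustrative Amazon product URL (hypothetical helper).

    With autoselect_variant=True, the th=1&psc=1 parameters named in the
    parameter description are appended to the end of the URL.
    """
    url = f'https://www.amazon.{domain}/dp/{asin}'
    if autoselect_variant:
        url += '?th=1&psc=1'
    return url


print(product_url('B09RX4KS1G', domain='nl'))
# https://www.amazon.nl/dp/B09RX4KS1G
print(product_url('B09RX4KS1G', domain='nl', autoselect_variant=True))
# https://www.amazon.nl/dp/B09RX4KS1G?th=1&psc=1
```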

Offer listing

The amazon_pricing data source is designed to retrieve Amazon product offer listings.

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | amazon_pricing |
| domain | Domain localization for Amazon. The full list of available domains can be found here. | com |
| query | 10-symbol ASIN code | - |
| start_page | Starting page number | 1 |
| pages | Number of pages to retrieve | 1 |
| geo_location | The Deliver to location. See our guide to using this parameter here. | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data. | - |

- required parameter

Python code example

In the code example below, we make a request to retrieve the product offer listing page for ASIN B09RX4KS1G on the amazon.nl marketplace.

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon_pricing',
    'domain': 'nl',
    'query': 'B09RX4KS1G',
    'parse': True,
}


# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())

To see what the parsed output looks like, download this JSON file.
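
Since query must be a 10-symbol ASIN, a cheap local check can catch typos before a request is sent. The sketch below assumes that ASINs consist only of uppercase letters and digits, which the parameter table does not state; is_valid_asin and build_pricing_payload are hypothetical helpers.

```python
import re

# Assumption: an ASIN is exactly 10 uppercase letters or digits.
ASIN_PATTERN = re.compile(r'[A-Z0-9]{10}')


def is_valid_asin(value):
    """Return True if value looks like a 10-symbol ASIN."""
    return isinstance(value, str) and ASIN_PATTERN.fullmatch(value) is not None


def build_pricing_payload(asin, domain='com', start_page=1, pages=1):
    """Build an amazon_pricing payload after validating the ASIN."""
    if not is_valid_asin(asin):
        raise ValueError(f'Not a 10-symbol ASIN: {asin!r}')
    return {
        'source': 'amazon_pricing',
        'domain': domain,
        'query': asin,
        'start_page': start_page,
        'pages': pages,
        'parse': True,
    }


print(is_valid_asin('B09RX4KS1G'))    # True
print(is_valid_asin('B09RX4KS1Gon'))  # False
```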

Reviews

The amazon_reviews data source is designed to retrieve Amazon product review pages of an ASIN of your choice.

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | amazon_reviews |
| domain | Domain localization for Amazon. The full list of available domains can be found here. | com |
| query | 10-symbol ASIN code | - |
| geo_location | The Deliver to location. See our guide to using this parameter here. | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| start_page | Starting page number | 1 |
| pages | Number of pages to retrieve | 1 |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data. | - |

- required parameter

Python code example

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon_reviews',
    'domain': 'nl',
    'query': 'B09RX4KS1G',
    'parse': True,
}


# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())

To see the response example with retrieved data, download this sample output file in JSON format.
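
The examples on this page call response.json() directly. In practice a request can fail or hang, so a thin wrapper with a timeout and an HTTP status check is often worthwhile. The sketch below is one way to do that with the requests library; the function name run_query, the 60-second timeout, and the optional session argument are choices made for illustration and are not part of the API.

```python
import requests

API_URL = 'https://realtime.oxylabs.io/v1/queries'


def run_query(payload, auth, session=None, timeout=60):
    """POST a payload and return the decoded JSON body.

    Raises requests.HTTPError for 4xx/5xx responses and other
    requests.RequestException subclasses for connection problems
    or timeouts.
    """
    http = session if session is not None else requests.Session()
    response = http.post(API_URL, auth=auth, json=payload, timeout=timeout)
    response.raise_for_status()
    return response.json()


# Payload taken from the reviews example above; nothing is sent here.
reviews_payload = {
    'source': 'amazon_reviews',
    'domain': 'nl',
    'query': 'B09RX4KS1G',
    'parse': True,
}
```

A call such as run_query(reviews_payload, auth=('user', 'pass1')) would normally sit inside a try/except requests.RequestException block, so that failures can be logged or retried instead of surfacing as an unhandled error.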

Questions & Answers

The amazon_questions data source is designed to retrieve any particular product's Questions & Answers pages.

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | amazon_questions |
| domain | Domain localization for Amazon. The full list of available domains can be found here. | com |
| query | 10-symbol ASIN code | - |
| geo_location | The Deliver to location. See our guide to using this parameter here. | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data. | - |

- required parameter

Python code example

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon_questions',
    'domain': 'nl',
    'query': 'B09RX4KS1G',
    'parse': True,
}


# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())

To see the response example with retrieved data, download this sample output file in JSON format.
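
Each amazon_questions request covers one ASIN, so collecting Q&A data for several products means building one payload per ASIN. The sketch below shows one way to prepare such a batch; questions_payloads is a hypothetical helper, and sending the payloads is left to the request code shown above.

```python
def questions_payloads(asins, domain='com', parse=True):
    """Build one amazon_questions payload per ASIN, skipping duplicates.

    The input order of the ASINs is preserved.
    """
    seen = set()
    payloads = []
    for asin in asins:
        if asin in seen:
            continue
        seen.add(asin)
        payloads.append({
            'source': 'amazon_questions',
            'domain': domain,
            'query': asin,
            'parse': parse,
        })
    return payloads


batch = questions_payloads(['B09RX4KS1G', 'B0BDJ279KF', 'B09RX4KS1G'], domain='nl')
print([item['query'] for item in batch])
# ['B09RX4KS1G', 'B0BDJ279KF']
```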

Best Sellers

The amazon_bestsellers data source is designed to retrieve Amazon Best Sellers pages.

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | amazon_bestsellers |
| domain | Domain localization for Amazon. The full list of available domains can be found here. | com |
| query | Department name. Example: Clothing, Shoes & Jewelry | - |
| start_page | Starting page number | 1 |
| pages | Number of pages to retrieve | 1 |
| geo_location | The Deliver to location. See our guide to using this parameter here. | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data. | - |
| context: category_id | Search for items in a particular browse node (product category). | - |

- required parameter

Python code example

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon_bestsellers',
    'domain': 'de',
    'query': 'automotive',
    'start_page': 2,
    'parse': True,
    'context': [
        {'key': 'category_id', 'value': 82400031},
    ],
}


# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())

To see the response example with retrieved data, download this sample output file in JSON format.
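
The context parameter is written as a list of key/value dictionaries, which is easy to mistype by hand. As a small convenience, the sketch below converts keyword arguments into that format; build_context is a hypothetical helper, not something the package is known to provide.

```python
def build_context(**options):
    """Convert keyword arguments into the list-of-dicts context format.

    Options set to None are left out, so optional values can be passed
    through without extra checks.
    """
    return [
        {'key': key, 'value': value}
        for key, value in options.items()
        if value is not None
    ]


bestsellers_payload = {
    'source': 'amazon_bestsellers',
    'domain': 'de',
    'query': 'automotive',
    'start_page': 2,
    'parse': True,
    'context': build_context(category_id=82400031),
}
print(bestsellers_payload['context'])
# [{'key': 'category_id', 'value': 82400031}]
```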

Sellers

The amazon_sellers data source is designed to retrieve Amazon Sellers pages.

Query parameters

| Parameter | Description | Default value |
|---|---|---|
| source | Data source. More info. | amazon_sellers |
| domain | Domain localization for Amazon. The full list of available domains can be found here. | com |
| query | 13-character seller ID | - |
| geo_location | The Deliver to location. See our guide to using this parameter here. | - |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| render | Enables JavaScript rendering. More info. | - |
| callback_url | URL to your callback endpoint. More info. | - |
| parse | true will return structured data. Please note that right now we only support parsed output for desktop device type. However, there is no apparent reason to get sellers pages with any other device type, as seller data is going to be exactly the same across all devices. | - |

- required parameter

Python code example

In the code example below, we make a request to retrieve the seller page for seller ID ABNP0A7Y0QWBN on the amazon.de marketplace.

import requests
from pprint import pprint


# Structure payload.
payload = {
    'source': 'amazon_sellers',
    'domain': 'de',
    'query': 'ABNP0A7Y0QWBN',
    'parse': True
}


# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())
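
Because parsed seller output is only supported for the desktop device type, it can help to reject other combinations before sending a request. The sketch below is one possible guard; build_sellers_payload is a hypothetical helper, and 'mobile' in the usage note is only a stand-in for any non-desktop user_agent_type value.

```python
def build_sellers_payload(seller_id, domain='com', parse=True,
                          user_agent_type='desktop'):
    """Build an amazon_sellers payload (hypothetical helper).

    Raises ValueError when parsing is requested together with a
    non-desktop device type.
    """
    if parse and user_agent_type != 'desktop':
        raise ValueError(
            'Parsed seller output is only supported for the desktop device type.'
        )
    return {
        'source': 'amazon_sellers',
        'domain': domain,
        'query': seller_id,
        'user_agent_type': user_agent_type,
        'parse': parse,
    }


sellers_payload = build_sellers_payload('ABNP0A7Y0QWBN', domain='de')
print(sellers_payload['user_agent_type'])
# desktop
```

Calling build_sellers_payload('ABNP0A7Y0QWBN', user_agent_type='mobile') raises ValueError, while the same call with parse=False is accepted.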
