ScraperAPI Python SDK
Install
pip install scraperapi-sdk
Usage
from scraperapi_sdk import ScraperAPIClient
client = ScraperAPIClient("<API-KEY>")
# regular get request
content = client.get('https://amazon.com/')
# get request with premium
content = client.get('https://amazon.com/', params={'premium': True})
# post request
content = client.post('https://webhook.site/403e44ce-5835-4ce9-a648-188a51d9395d', headers={'Content-Type': 'application/x-www-form-urlencoded'}, data={'field1': 'data1'})
# put request
content = client.put('https://webhook.site/403e44ce-5835-4ce9-a648-188a51d9395d', headers={'Content-Type': 'application/json'}, data={'field1': 'data1'})
The content variable will contain the scraped page.
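Other ScraperAPI parameters can be passed through the params dict in the same way as premium above. A minimal sketch, assuming the render and country_code parameters listed in the scrapyGet section below are accepted here as well:

# render the page in a headless browser and route the request through a US proxy
content = client.get('https://example.com/', params={'render': True, 'country_code': 'us'})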
If you want to get the Response object instead of the content, you can use make_request.
response = client.make_request(url='https://webhook.site/403e44ce-5835-4ce9-a648-188a51d9395d', headers={'Content-Type': 'application/json'}, data={'field1': 'data1'})
# response will be <Response [200]>
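Because make_request returns the response object itself, you can inspect the status and headers before using the body. A minimal sketch, assuming a standard requests.Response as the <Response [200]> repr suggests:

# assumes `response` is a requests.Response returned by make_request
if response.status_code == 200:
    print(response.headers.get('Content-Type'))
    print(response.text[:200])  # first 200 characters of the body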
Exception
from scraperapi_sdk import ScraperAPIClient
from scraperapi_sdk.exceptions import ScraperAPIException
client = ScraperAPIClient(
    api_key=api_key,
)
try:
    result = client.post('https://webhook.site/403e44ce-5835-4ce9-a648-188a51d9395d', headers={'Content-Type': 'application/x-www-form-urlencoded'}, data={'field1': 'data1'})
    _ = result
except ScraperAPIException as e:
    print(e.original_exception)  # you can access the original exception via the `.original_exception` property
scrapyGet
To prepare a URL for Scrapy you can use the client.scrapyGet method.
client.scrapyGet(
    url,
    headers={"header1": "value1"},
    country_code="us",
    premium=False,
    render=True,
    session_number=2772728518147,
    autoparse=None,
)
All of the parameters except url are optional.
Full example:
import scrapy
import os
from pathlib import Path
from scraperapi_sdk import ScraperAPIClient
client = ScraperAPIClient(
    api_key=os.getenv("SCRAPERAPI_API_KEY"),
)

class ExampleSpider(scrapy.Spider):
    # generic placeholder spider name
    name = "example"

    async def start(self):
        urls = [
            "https://example.com/",
        ]
        for url in urls:
            # wrap the target URL so the request is routed through ScraperAPI
            yield scrapy.Request(url=client.scrapyGet(url, render=True), callback=self.parse)

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = f"example-{page}.html"
        Path(filename).write_bytes(response.body)
        self.log(f"Saved file {filename}")
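To run the spider, save it to a file (for example example_spider.py, a name chosen here for illustration) and start it with scrapy runspider example_spider.py, making sure the SCRAPERAPI_API_KEY environment variable is set.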
Structured Data Collection Methods
Amazon Endpoints
Amazon Product Page API
This method will retrieve product data from an Amazon product page and transform it into usable JSON.
result = client.amazon.product("<ASIN>")
result = client.amazon.product("<ASIN>", country="us", tld="com")
Read more in docs: Amazon Product Page API
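The structured endpoints return the parsed result rather than raw HTML, so a quick way to explore the available fields is to pretty-print it. A minimal sketch, assuming the return value is a parsed dict as "usable JSON" suggests:

import json

result = client.amazon.product("<ASIN>")
# pretty-print the parsed product data to inspect its fields
print(json.dumps(result, indent=2))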
Amazon Search API
This method will retrieve products for a specified search term from an Amazon search page and transform it into usable JSON.
result = client.amazon.search("<QUERY>")
result = client.amazon.search("<QUERY>", country="us", tld="com")
Read more in docs: Amazon Search API
Amazon Offers API
This method will retrieve offers for a specified product from an Amazon offers page and transform it into usable JSON.
result = client.amazon.offers("<ASIN>")
result = client.amazon.offers("<ASIN>", country="us", tld="com")
Read more in docs: Amazon Offers API
Amazon Prices API
This method will retrieve product prices for the given ASINs and transform it into usable JSON.
result = client.amazon.prices(['<ASIN1>'])
result = client.amazon.prices(['<ASIN1>', '<ASIN2>'])
result = client.amazon.prices(['<ASIN1>', '<ASIN2>'], country="us", tld="com")
Read more in docs: Amazon Prices API
Ebay API
Ebay Search API
This endpoint will retrieve products for a specified search term from an Ebay search page and transform it into usable JSON.
result = client.ebay.search("<QUERY>")
result = client.ebay.search2("<QUERY>", country="us", tld="com")  # newest version with additional data
Read more in docs: Ebay Search API
Ebay Product API
This endpoint will retrieve product data from an Ebay product page (/itm/) and transform it into usable JSON.
result = client.ebay.product("<PRODUCT_ID>")
result = client.ebay.product("<PRODUCT_ID>", country="us", tld="com")
Read more in docs: Ebay Product API
Google API
Google SERP API
This method will retrieve search result data from a Google search results page and transform it into usable JSON.
result = client.google.search('free hosting')
result = client.google.search('free hosting', country="us", tld="com")
Read more in docs: Google SERP API
Google News API
This method will retrieve news data from a Google News results page and transform it into usable JSON.
result = client.google.news('tornado')
result = client.google.news('tornado', country="us", tld="com")
Read more in docs: Google News API
Google Jobs API
This method will retrieve jobs data from a Google Jobs results page and transform it into usable JSON.
result = client.google.jobs('Senior Software Developer')
result = client.google.jobs('Senior Software Developer', country="us", tld="com")
Read more in docs: Google Jobs API
Google Shopping API
This method will retrieve shopping data from a Google Shopping results page and transform it into usable JSON.
result = client.google.shopping('macbook air')
result = client.google.shopping('macbook air', country="us", tld="com")
Read more in docs: Google Shopping API
Redfin API
Redfin Agent Details API
This endpoint retrieves information and details from a Redfin Agent's page or a Redfin Partner Agents page and transforms it into usable JSON.
result = client.redfin.agent("<URL>")
result = client.redfin.agent("<URL>", country="us", tld="com")
Read more in docs: Redfin Agent Details API
Redfin 'For Rent' Listings API
This endpoint will retrieve listing information from a single 'For Rent' property listing page and transform it into usable JSON.
result = client.redfin.forrent("<URL>")
result = client.redfin.forrent("<URL>", country="us", tld="com")
Read more in docs: Redfin For Rent Listings API
Redfin 'For Sale' Listings API
This endpoint will retrieve listing information from a single 'For Sale' property listing page and transform it into usable JSON.
result = client.redfin.forsale("<URL>")
result = client.redfin.forsale("<URL>", country="us", tld="com")
Read more in docs: Redfin For Sale Listings API
Redfin Listing Search API
This endpoint will return the search results from a listing search page and transform it into usable JSON.
result = client.redfin.search("<URL>")
result = client.redfin.search("<URL>", country="us", tld="com")
Read more in docs: Redfin Listing Search API
Walmart API
Walmart Search API
This method will retrieve product list data from a Walmart search.
result = client.walmart.search('hoodie')
result = client.walmart.search('hoodie', page=2)
Read more in docs: Walmart Search API
Walmart Category API
This method will retrieve a Walmart product list for a specified product category.
result = client.walmart.category('5438_7712430_8775031_5315201_3279226')
result = client.walmart.category('5438_7712430_8775031_5315201_3279226', page=2)
Read more in docs: Walmart Category API
Walmart Product API
This method will retrieve Walmart product details for one product.
result = client.walmart.product('5053452213')
Read more in docs: Walmart Product API
Async Scraping
Basic scraping:
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient(api_key)
request_id = None
# request async scraping
try:
    job = client.create('https://example.com')
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
# if the job was submitted successfully we can request the result of scraping
if request_id:
    result = client.get(request_id)
Read more in docs: How to use Async Scraping
Webhook Callback
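Passing a webhook_url to client.create tells ScraperAPI to deliver the result to that URL once scraping finishes, so you do not have to poll for it; you can still fetch the result by id as shown below.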
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient(api_key)
request_id = None
# request async scraping
try:
    job = client.create('https://example.com', webhook_url="https://webhook.site/#!/view/c4facc6e-c028-4d9c-9f58-b14c92a381fe")
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
# if the job was submitted successfully we can request the result of scraping
if request_id:
    result = client.get(request_id)
Wait for results
You can use the client.wait method, which will poll ScraperAPI for the result until it is ready.
Arguments:
request_id (required): ID returned from client.create call
cooldown (optional, default=5): number of seconds between retries
max_retries (optional, default=10): Maximum number of retries
raise_for_exceeding_max_retries (optional, default=False): if True, raises an exception when max_retries is reached; otherwise returns the response from the API.
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient(api_key)
request_id = None
# request async scraping
try:
    job = client.create('https://example.com')
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
# if the job was submitted successfully we can request the result of scraping
if request_id:
    result = client.wait(
        request_id,
        cooldown=5,
        max_retries=10,
        raise_for_exceeding_max_retries=False,
    )
Amazon Async Scraping
Amazon Product
Scrape a single Amazon Product asynchronously:
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
request_id = None
try:
    job = client.amazon.product('B0CHVR5K7C')
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
result = client.get(request_id)
Single Product with params:
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
request_id = None
try:
    job = client.amazon.product('B0B5PLT7FZ', api_params=dict(country_code='uk', tld='co.uk'))
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
result = client.get(request_id)
Scrape multiple Amazon Products asynchronously with params:
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
try:
    jobs = client.amazon.products(['B0B5PLT7FZ', 'B00CL6353A'], api_params=dict(country_code='uk', tld='co.uk'))
except ScraperAPIException as e:
    print(e.original_exception)
# one job is created per ASIN, so fetch each result by its id
for job in jobs:
    result = client.get(job.get('id'))
Read more in docs: Async Amazon Product Scraping
Amazon Search
Search Amazon asynchronously
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
request_id = None
try:
    job = client.amazon.search('usb c microphone')
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
result = client.get(request_id)
Search Amazon asynchronously with api_params
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
request_id = None
try:
    job = client.amazon.search('usb c microphone', api_params=dict(country_code='uk', tld='co.uk'))
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
result = client.get(request_id)
Read more in docs: Async Amazon Search Scraping
Amazon Offers for a Product
Scrape Amazon offers for a single product
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
request_id = None
try:
    job = client.amazon.offers('B0CHVR5K7C')
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
result = client.get(request_id)
Scrape Amazon offers for multiple products
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
try:
    # passing a list of ASINs here mirrors the amazon.products example; check the docs for the exact signature
    jobs = client.amazon.offers(['B0CHVR5K7C', 'B0B5PLT7FZ'])
except ScraperAPIException as e:
    print(e.original_exception)
for job in jobs:
    result = client.get(job.get('id'))
Amazon Reviews
Scrape Reviews for a single product asynchronously:
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
request_id = None
try:
    job = client.amazon.product('B0B5PLT7FZ', api_params=dict(country_code='uk', tld='co.uk'))
    request_id = job.get('id')
except ScraperAPIException as e:
    print(e.original_exception)
result = client.get(request_id)
Scrape reviews for multiple products asynchronously:
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
try:
    jobs = client.amazon.products(['B0B5PLT7FZ', 'B00CL6353A'], api_params=dict(country_code='uk', tld='co.uk'))
except ScraperAPIException as e:
    print(e.original_exception)
for job in jobs:
    result = client.get(job.get('id'))
Read more in docs: Amazon Review Scraping Async
Google Async Scraping
Google Async Search Scraping
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
try:
    jobs = client.google.search('solar eclipse')
except ScraperAPIException as e:
    print(e.original_exception)
for job in jobs:
    result = client.get(job.get('id'))
Read more in docs: Google Search API (Async)
Google Async News Scraping
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
try:
    jobs = client.google.news('solar eclipse')
except ScraperAPIException as e:
    print(e.original_exception)
for job in jobs:
    result = client.get(job.get('id'))
Read more in docs: Google News API (Async)
Google Async Jobs Scraping
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
try:
    jobs = client.google.jobs('senior software developer')
except ScraperAPIException as e:
    print(e.original_exception)
for job in jobs:
    result = client.get(job.get('id'))
Read more in docs: Google Jobs API (Async)
Google Async Shopping Scraping
from scraperapi_sdk import ScraperAPIAsyncClient, ScraperAPIException
client = ScraperAPIAsyncClient('<api_key>')
try:
    jobs = client.google.shopping('usb c microphone')
except ScraperAPIException as e:
    print(e.original_exception)
for job in jobs:
    result = client.get(job.get('id'))
Read more in docs: Google Shopping API (Async)