Python API to DATASHAKE reviews

Project description

datashakereviewsapi: python API to DATASHAKE reviews

Python API to DATASHAKE reviews (https://www.datashake.com/review-scraper-api).

This module makes it easier to schedule jobs and fetch the results.

Official web API documentation: https://api.datashake.com/#reviews

You need a DATASHAKE API key to use this module.

Installation

Through cloning this repository only [at the moment].

Usage examples

Initiate API instance

from datashakereviewsapi.datashakereviewsapi import DatashakeReviewAPI

# Initiate API instance with your API key from DATASHAKE
api = DatashakeReviewAPI('your_datashake_reviews_scraper_api_key')

Schedule a single job with the URL of a review page. The DATASHAKE API takes several hours to crawl the page and collect the results.

response = api.schedule_job('https://uk.trustpilot.com/review/store.playstation.com')
# save job_id for querying the results later
first_job_id = response['job_id']
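
Because the crawl can take several hours, the job ID usually has to outlive the Python session that scheduled it. The following is a minimal sketch of one way to keep it in a small JSON file. The file name `datashake_jobs.json` and the helpers `save_job_id` and `load_job_id` are hypothetical and are not part of datashakereviewsapi.

```python
import json
from pathlib import Path


def save_job_id(url, job_id, path="datashake_jobs.json"):
    """Record the latest job ID for a review-page URL in a JSON file (hypothetical helper)."""
    store_path = Path(path)
    # Load the existing mapping if the file is already there, otherwise start empty.
    if store_path.exists():
        store = json.loads(store_path.read_text(encoding="utf-8"))
    else:
        store = {}
    store[url] = job_id
    store_path.write_text(json.dumps(store, indent=2), encoding="utf-8")


def load_job_id(url, path="datashake_jobs.json"):
    """Return the stored job ID for a URL, or None if nothing was saved (hypothetical helper)."""
    store_path = Path(path)
    if not store_path.exists():
        return None
    return json.loads(store_path.read_text(encoding="utf-8")).get(url)
```

After scheduling, something like `save_job_id(url, response['job_id'])` would store the ID, and a later session could pass `load_job_id(url)` to `api.get_job_reviews`.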

Get the job results - reviews

reviews = api.get_job_reviews(first_job_id)
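
Since results only become available some hours after scheduling, a script may need to retry the fetch. The sketch below is a generic retry loop and not part of datashakereviewsapi. The names `poll_until_ready`, `fetch` and `is_ready` are hypothetical. What `get_job_reviews` returns for an unfinished job is not specified here, so the readiness check is left to the caller.

```python
import time


def poll_until_ready(fetch, is_ready, interval_seconds=1800, max_attempts=12, sleep=time.sleep):
    """Call fetch() repeatedly until is_ready(result) is true or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch()
        if is_ready(result):
            return result
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"result not ready after {max_attempts} attempts")
```

A call might look like `poll_until_ready(lambda: api.get_job_reviews(first_job_id), is_ready=my_check)`, where `my_check` is a function you write after inspecting what the API returns for a job that is still running.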

Schedule another job with a reference to the first one - get delta (new reviews) only

response2 = api.schedule_job('https://uk.trustpilot.com/review/store.playstation.com',
                              previous_job_id=first_job_id)

Create a job list (one row in the example) and schedule jobs for all the urls from the list

import pandas as pd

jobs_list = pd.DataFrame(columns=['Website', 'url', 'latest_job_id', 'status', 'last_crawl',
                                  'latest_schedule_message'])
jobs_list['url'] = ['https://uk.trustpilot.com/review/store.playstation.com']
updated_job_list = api.schedule_job_list(jobs_list)

Finally, fetch the reviews, save them to a CSV file, and reschedule all jobs in the jobs list

# Plug-and-play block to schedule/update jobs and get/save results
# Prerequisite: two CSV files with the following structure must already exist:
# jobs_list.csv columns: ['Website', 'url', 'latest_job_id', 'status', 'last_crawl', 'latest_schedule_message']
# reviews_list.csv columns: ['job_id', 'source_name', 'id', 'name', 'date', 'rating_value',
#                           'review_text', 'url', 'profile_picture', 'location', 'review_title',
#                           'verified_order', 'reviewer_title', 'language_code', 'meta_data']


# Code block to refresh review jobs and review results
import pandas as pd

jobs_list_filepath = 'jobs_list.csv'
reviews_list_filepath = 'reviews_list.csv'

df_jobs = pd.read_csv(jobs_list_filepath, index_col='id')
df_reviews = pd.read_csv(reviews_list_filepath, index_col='unique_id')

df_jobs_new, df_reviews_new = api.get_job_list_reviews(df_jobs, df_reviews)

df_jobs_new.to_csv(jobs_list_filepath, encoding='utf-8-sig')
df_reviews_new.to_csv(reviews_list_filepath, encoding='utf-8-sig')


# Code block to reschedule review jobs
df_jobs = pd.read_csv(jobs_list_filepath, index_col='id')
df_jobs_new = api.schedule_job_list(df_jobs)
df_jobs_new.to_csv(jobs_list_filepath, encoding='utf-8-sig')
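
The snippet above expects both CSV files to exist before the first run. A minimal sketch of creating header-only files with the standard library follows. The helper `create_csv_if_missing` is hypothetical. Prepending the index columns `id` and `unique_id` is an assumption, based on `pd.read_csv(..., index_col=...)` requiring a column with that name in the file.

```python
import csv
from pathlib import Path

# Column lists from the comments above, with the index column prepended (assumption).
JOBS_COLUMNS = ['id', 'Website', 'url', 'latest_job_id', 'status', 'last_crawl',
                'latest_schedule_message']
REVIEWS_COLUMNS = ['unique_id', 'job_id', 'source_name', 'id', 'name', 'date', 'rating_value',
                   'review_text', 'url', 'profile_picture', 'location', 'review_title',
                   'verified_order', 'reviewer_title', 'language_code', 'meta_data']


def create_csv_if_missing(path, columns):
    """Write a header-only CSV unless the file already exists; return True if created."""
    target = Path(path)
    if target.exists():
        return False
    # utf-8-sig matches the encoding used by the to_csv calls above.
    with target.open("w", newline="", encoding="utf-8-sig") as handle:
        csv.writer(handle).writerow(columns)
    return True
```

Calling `create_csv_if_missing('jobs_list.csv', JOBS_COLUMNS)` and `create_csv_if_missing('reviews_list.csv', REVIEWS_COLUMNS)` once would set up the files without overwriting existing data. The jobs file still needs at least one row with a `url` before scheduling does anything useful.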

Download files

Source Distribution

datashakereviewsapi-1.1.tar.gz (6.9 kB view details)

Uploaded Source

File details

Details for the file datashakereviewsapi-1.1.tar.gz.

File metadata

  • Download URL: datashakereviewsapi-1.1.tar.gz
  • Upload date:
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.4

File hashes

Hashes for datashakereviewsapi-1.1.tar.gz
Algorithm     Hash digest
SHA256        efec620c9663c70ef1ac0239f834549df7bb6d6ce72dad9593ff0e65c0e6ceca
MD5           1fc8946621c41dd870d4a89ff11d69dc
BLAKE2b-256   a2edf1d15f2cca6d3b703a186fe75a1c9ca3c63b8a74e8f4da7de8a1dc43e8d3
