Scrape application data from the Google Play store.

google-play-scraper-py

Python wrapper around the Node.js module to scrape application data from the Google Play store. Please refer to the module's page for the most up-to-date information about the API.

Any questions, suggestions, or issues with the underlying Node.js functionality should be directed to facundoolano, who is the developer of the original module.

Installation

First, ensure that you have the latest version of Node.js installed on your machine.

Then, install the library:

pip install google-play-scraper-py

This will also include the required Node dependencies.

Example

import scraper

result = scraper.app(appId='com.google.android.apps.translate')
print(result)

Results:

{
  'adSupported': False,
  'androidVersion': 'VARY',
  'androidVersionText': 'Varies with device',
  'appId': 'com.google.android.apps.translate',
  'comments': [...],
  'contentRating': 'Everyone',
  'currency': 'USD',
  'description': ...,
  'descriptionHTML': ...,
  'developer': 'Google LLC',
  'developerAddress': '1600 Amphitheatre Parkway, Mountain View 94043',
  'developerEmail': 'translate-mobile-support@google.com',
  'developerId': '5700313618786177705',
  'developerInternalID': '5700313618786177705',
  'developerWebsite': 'http://support.google.com/translate',
  'editorsChoice': False,
  'free': True,
  'genre': 'Tools',
  'genreId': 'TOOLS',
  'headerImage': 'https://play-lh.googleusercontent.com/e4Sfy0cOmqpike76V6N6n-tDVbtbmt6MxbnbkKBZ_7hPHZRfsCeZhMBZK8eFDoDa1Vf-',
  'histogram': {'1': 458414,
               '2': 158469,
               '3': 420198,
               '4': 941431,
               '5': 5851958},
  'icon': 'https://play-lh.googleusercontent.com/ZrNeuKthBirZN7rrXPN1JmUbaG8ICy3kZSHt-WgSnREsJzo2txzCzjIoChlevMIQEA',
  'installs': '1,000,000,000+',
  'maxInstalls': 1004629541,
  'minInstalls': 1000000000,
  'offersIAP': False,
  'price': 0,
  'priceText': 'Free',
  'privacyPolicy': 'http://www.google.com/policies/privacy/',
  'ratings': 7830472,
  'recentChanges': 'Bug fixes and improvements',
  'reviews': 1924772,
  'score': 4.477567,
  'scoreText': '4.5',
  'screenshots': [...],
  'size': 'Varies with device',
  'summary': 'The world is closer than ever with over 100 languages',
  'title': 'Google Translate',
  'updated': 1616099487000,
  'url': 'https://play.google.com/store/apps/details?id=com.google.android.apps.translate&hl=en&gl=us',
  'version': 'Varies with device'
}
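A few of the fields above are easier to use after a little post-processing. The following is a minimal sketch, not part of the library: summarize_app is a hypothetical helper, and it assumes that 'updated' is a timestamp in milliseconds since the Unix epoch (the magnitude of the sample value suggests this, but the source does not state it).

```python
from datetime import datetime, timezone


def summarize_app(result):
    """Derive a few readable values from an app() result dict."""
    histogram = result['histogram']
    total = sum(histogram.values())
    # Weighted mean of the star values; histogram keys are the strings '1'..'5'.
    mean_score = sum(int(stars) * count for stars, count in histogram.items()) / total
    # Assumption: 'updated' is milliseconds since the Unix epoch.
    updated = datetime.fromtimestamp(result['updated'] / 1000, tz=timezone.utc)
    return {'total_ratings': total, 'mean_score': mean_score, 'updated': updated}


# Values copied from the sample result above.
sample = {
    'histogram': {'1': 458414, '2': 158469, '3': 420198, '4': 941431, '5': 5851958},
    'updated': 1616099487000,
}
summary = summarize_app(sample)
print(round(summary['mean_score'], 2), summary['updated'].date())
```

For the sample data, the mean computed from the histogram agrees closely with the reported 'score' field.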

Usage

Available methods:

  • app: Retrieves the full detail of an application.
  • list: Retrieves a list of applications from one of the collections at Google Play.
  • search: Retrieves a list of apps that result from searching for the given term.
  • developer: Returns the list of applications by the given developer name.
  • suggest: Given a string, returns up to five suggestions to complete a search query term.
  • reviews: Retrieves a page of reviews for a specific application.
  • similar: Returns a list of similar apps to the one specified.
  • permissions: Returns the list of permissions an app has access to.
  • categories: Retrieves the full list of categories present in the dropdown menu on Google Play.

app

Retrieves the full detail of an application. Options:

  • appId: the Google Play id of the application (the ?id= parameter on the url).
  • lang (optional, defaults to 'en'): the two letter language code in which to fetch the app page.
  • country (optional, defaults to 'us'): the two letter country code used to retrieve the applications. Needed when the app is available only in some countries.
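Since appId is the ?id= parameter of a Play store URL, it can be pulled out of a URL with the standard library. This is a small sketch under that assumption; app_id_from_url is a hypothetical helper and is not part of the library.

```python
from urllib.parse import urlparse, parse_qs


def app_id_from_url(url):
    """Return the value of the ?id= query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get('id')
    return values[0] if values else None


url = ('https://play.google.com/store/apps/details'
       '?id=com.google.android.apps.translate&hl=en&gl=us')
app_id = app_id_from_url(url)
print(app_id)
# The id could then be passed on, e.g. scraper.app(appId=app_id, lang='en', country='us')
```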

list

Retrieves a list of applications from one of the collections at Google Play. Options:

  • collection (optional, defaults to collection.TOP_FREE): the Google Play collection that will be retrieved. Available options can be found here.
  • category (optional, defaults to no category): the app category to filter by. Available options can be found here.
  • age (optional, defaults to no age filter): the age range to filter the apps (only for FAMILY and its subcategories). Available options are age.FIVE_UNDER, age.SIX_EIGHT, age.NINE_UP.
  • num (optional, defaults to 500): the number of apps to retrieve.
  • lang (optional, defaults to 'en'): the two letter language code used to retrieve the applications.
  • country (optional, defaults to 'us'): the two letter country code used to retrieve the applications.
  • fullDetail (optional, defaults to false): if true, an extra request will be made for every resulting app to fetch its full detail.

search

Retrieves a list of apps that result from searching for the given term. Options:

  • term: the term to search by.
  • num (optional, defaults to 20, max is 250): the number of apps to retrieve.
  • lang (optional, defaults to 'en'): the two letter language code used to retrieve the applications.
  • country (optional, defaults to 'us'): the two letter country code used to retrieve the applications.
  • fullDetail (optional, defaults to false): if true, an extra request will be made for every resulting app to fetch its full detail.
  • price (optional, defaults to all): controls whether the resulting apps are free, paid, or both.
    • all: Free and paid
    • free: Free apps only
    • paid: Paid apps only
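Because num has a documented maximum and price accepts only three values, it can be convenient to check the options locally before making a request. The sketch below is illustrative only: build_search_options is a hypothetical helper, and it assumes the price values are passed as the plain strings 'all', 'free' and 'paid'.

```python
ALLOWED_PRICES = ('all', 'free', 'paid')
MAX_SEARCH_RESULTS = 250  # documented maximum for num


def build_search_options(term, num=20, lang='en', country='us',
                         full_detail=False, price='all'):
    """Validate search options locally and return them as keyword arguments."""
    if not term:
        raise ValueError('term must be a non-empty string')
    if not 1 <= num <= MAX_SEARCH_RESULTS:
        raise ValueError(f'num must be between 1 and {MAX_SEARCH_RESULTS}')
    if price not in ALLOWED_PRICES:
        raise ValueError(f'price must be one of {ALLOWED_PRICES}')
    return {'term': term, 'num': num, 'lang': lang, 'country': country,
            'fullDetail': full_detail, 'price': price}


options = build_search_options('panda', num=50, price='free')
print(options)
# These could then be passed as scraper.search(**options)
```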

developer

Returns the list of applications by the given developer name. Options:

  • devId: the name of the developer.
  • lang (optional, defaults to 'en'): the two letter language code in which to fetch the app list.
  • country (optional, defaults to 'us'): the two letter country code used to retrieve the applications. Needed when the app is available only in some countries.
  • num (optional, defaults to 60): the number of apps to retrieve.
  • fullDetail (optional, defaults to false): if true, an extra request will be made for every resulting app to fetch its full detail.

suggest

Given a string, returns up to five suggestions to complete a search query term. Options:

  • term: the term to get suggestions for.
  • lang (optional, defaults to 'en'): the two letter language code used to retrieve the suggestions.
  • country (optional, defaults to 'us'): the two letter country code used to retrieve the suggestions.

reviews

Retrieves a page of reviews for a specific application.

Note that this method returns reviews in a specific language (English by default), so you need to try different languages to get more reviews. Also, the counter displayed on the Google Play page refers to the total number of 1-5 star ratings the application has, not the number of written reviews. So if the app has 100k ratings, don't expect to get 100k reviews by using this method.

You can fetch reviews in a single call by passing the num parameter (e.g. 5000), or fetch them page by page (150 per page) by setting the paginate parameter to true.

You'll have to choose which method is better for your use case.

If you set both num and paginate, the num parameter is ignored and you receive a paginated response instead.

Options:

  • appId: Unique application id for Google Play. (e.g. id=com.mojang.minecraftpe maps to Minecraft: Pocket Edition game).
  • lang (optional, defaults to 'en'): the two letter language code in which to fetch the reviews.
  • country (optional, defaults to 'us'): the two letter country code in which to fetch the reviews.
  • sort (optional, defaults to sort.NEWEST): The way the reviews are going to be sorted. Accepted values are: sort.NEWEST, sort.RATING and sort.HELPFULNESS.
  • num (optional, defaults to 100): Quantity of reviews to be captured.
  • paginate (optional, defaults to false): Defines whether the result will be paginated.
  • nextPaginationToken (optional, defaults to null): The token used to fetch the next page.
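The paginated mode can be driven by a simple loop that feeds each page's token into the next call. The sketch below is an assumption-heavy illustration: it assumes each paginated response is a dict with a 'data' list and a 'nextPaginationToken' entry, which this page does not confirm for the Python wrapper. collect_reviews and fake_reviews are hypothetical names; fake_reviews stands in for scraper.reviews so the loop can be tried without network access.

```python
def collect_reviews(fetch_page, app_id, max_pages=3):
    """Follow pagination tokens until they run out or max_pages is reached."""
    reviews, token = [], None
    for _ in range(max_pages):
        page = fetch_page(appId=app_id, paginate=True, nextPaginationToken=token)
        reviews.extend(page['data'])             # assumed response key
        token = page.get('nextPaginationToken')  # assumed response key
        if not token:
            break
    return reviews


def fake_reviews(appId, paginate, nextPaginationToken):
    """Stand-in for scraper.reviews that serves two fixed pages."""
    pages = {None: (['r1', 'r2'], 'tok1'), 'tok1': (['r3'], None)}
    data, next_token = pages[nextPaginationToken]
    return {'data': data, 'nextPaginationToken': next_token}


all_reviews = collect_reviews(fake_reviews, 'com.mojang.minecraftpe')
print(all_reviews)
```

Capping the loop with max_pages keeps the number of requests bounded, which matters given the throttling behaviour described later on.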

similar

Returns a list of similar apps to the one specified. Options:

  • appId: the Google Play id of the application to get similar apps for.
  • lang (optional, defaults to 'en'): the two letter language code used to retrieve the applications.
  • country (optional, defaults to 'us'): the two letter country code used to retrieve the applications.
  • fullDetail (optional, defaults to false): if true, an extra request will be made for every resulting app to fetch its full detail.

permissions

Returns the list of permissions an app has access to.

  • appId: the Google Play id of the application to get permissions for.
  • lang (optional, defaults to 'en'): the two letter language code in which to fetch the permissions.
  • short (optional, defaults to false): if true, the permission names will be returned instead of permission/description objects.

categories

Retrieves the full list of categories present in the dropdown menu on Google Play.

  • this method has no options

Throttling

All methods on the scraper have to access the Google Play server in one form or another. When making too many requests in a short period of time (especially when using the fullDetail option), it is common to hit Google Play's throttling limit. Requests then start receiving status 503 responses with a captcha to verify that the requesting entity is a human (which it is not :P). In those cases the requesting IP can be banned from making further requests for a while (usually around an hour).

To avoid this situation, all methods support a throttle option, which sets an upper bound on the number of requests that will be attempted per second. Once that limit is reached, further requests are held until the second passes.

import scraper

scraper.search(term='panda', throttle=10)

By default, no throttling is applied.
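To get a feel for what a given throttle value means, the described behaviour can be modelled with simple arithmetic. This is only a model of the description above (at most throttle requests per one-second window), not the library's actual implementation; dispatch_second is a hypothetical helper.

```python
def dispatch_second(request_index, throttle):
    """Earliest one-second window (0-based) in which a request can be sent."""
    if throttle <= 0:
        raise ValueError('throttle must be a positive number of requests per second')
    return request_index // throttle


# 25 requests at throttle=10: ten in second 0, ten in second 1, five in second 2.
windows = [dispatch_second(i, 10) for i in range(25)]
print(windows.count(0), windows.count(1), windows.count(2))
```

Under this model, a call with fullDetail that triggers many extra requests is spread over roughly (number of requests / throttle) seconds.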

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT
