
Marketing toolset using Pandas & Google Sheets API, with classes for a variety of other Google APIs

Project description

Read and write Google Sheets as Pandas dataframes, useful when ‘prospecting’ in analytics work and hacking.


Setup

  1. Create or select a project in Google’s developer console

    • Also, you will need to enable the APIs you plan to use

  2. Get a client_secrets.json credentials file from the credentials section

  3. Load the prospecting module in a Python session to initialize the ~/.prospecting/ folder in your home directory

  4. Place the client_secrets.json file in the ~/.prospecting/credentials/ directory

  5. Load an API class in a Python session, then run apiclass.authenticate() and follow the prompts

    • You only need to set up authentication once per API unless your credentials change
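
Before running authenticate(), it can help to confirm that the credentials file is where steps 3 and 4 expect it. The sketch below uses only the standard library. The helper name find_client_secrets is hypothetical and not part of prospecting; the path it checks is the one named in the steps above.

```python
from pathlib import Path
from typing import Optional


def find_client_secrets(home: Optional[Path] = None) -> Optional[Path]:
    """Return the path to client_secrets.json if present, else None.

    Hypothetical helper, not part of the prospecting package. It only
    checks the location named in the setup steps:
    ~/.prospecting/credentials/client_secrets.json
    """
    # Allow a custom base directory so the check can be tried in a sandbox.
    base = Path.home() if home is None else Path(home)
    candidate = base / ".prospecting" / "credentials" / "client_secrets.json"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_client_secrets()
    if found is None:
        print("client_secrets.json not found; see setup steps 3 and 4")
    else:
        print(f"Found credentials file: {found}")
```

If the helper returns None, repeat steps 3 and 4 before calling authenticate().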

Examples:

import prospecting as p

Use a stats sheet to store summary statistics (scopelist defaults to read-only, so pass scopes for writing):

ss_stats = p.SheetsApi(spreadsheetid = 'PASTE_GOOGLE_SHEETID_HERE',
                       scopelist=['https://www.googleapis.com/auth/spreadsheets',
                                  'https://www.googleapis.com/auth/drive.metadata'])
ss_stats.authenticate()
ss_stats.update('Sheet1', somedataframe)

Use a reference sheet to provide a named entity list (or stopwords, vocabulary) for NLP preprocessing:

ss_reference = p.SheetsApi(spreadsheetid = 'PASTE_GOOGLE_SHEETID_HERE',
                           scopelist=['https://www.googleapis.com/auth/spreadsheets',
                                      'https://www.googleapis.com/auth/drive.metadata'])
ss_reference.authenticate()
named_entity_list = list(ss_reference.get('ne!A:B').iloc[:,0].values)

Get a keywords sheet as a dataframe, filter it, take a sampled subset, and upload the new dataframe to another tab in the spreadsheet:

ss_kw = p.SheetsApi(spreadsheetid = 'PASTE_GOOGLE_SHEETID_HERE',
                    scopelist=['https://www.googleapis.com/auth/spreadsheets',
                               'https://www.googleapis.com/auth/drive.metadata'])
ss_kw.authenticate()

#  Get data using spreadsheet syntax like ('sheetname') or ('sheetname!A:B25')
df_query = ss_kw.get('queries')
df_query_subset = df_query[(df_query['raw_len'] > 1) &
                           (df_query['reject'] != 1)]

#  Take a subsample of data
df_query_subset_sample = df_query_subset.sample(frac=0.5)
df_query_subset_sample.reset_index(drop=True, inplace=True)

#  Update 'sheetname' with dataframe object
ss_kw.update('queries_shuffled', df_query_subset_sample)
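
The filter and sampling steps can be tried without a spreadsheet. This is a minimal sketch, assuming a small made-up dataframe in place of the result of ss_kw.get('queries'). The column names raw_len and reject come from the example above; the query column and all values are invented for illustration. random_state is passed only to make the sample repeatable.

```python
import pandas as pd

# Made-up stand-in for ss_kw.get('queries'); values are illustrative only.
df_query = pd.DataFrame({
    "query": ["buy shoes", "a", "red shoes", "cheap boots",
              "spam", "hiking boots", "trail running shoes"],
    "raw_len": [2, 1, 2, 2, 1, 2, 3],
    "reject": [0, 0, 1, 0, 0, 0, 0],
})

# Keep rows where raw_len is greater than 1 and reject is not 1.
df_query_subset = df_query[(df_query["raw_len"] > 1) &
                           (df_query["reject"] != 1)]

# Sample half of the remaining rows and renumber the index from 0.
df_query_subset_sample = (df_query_subset
                          .sample(frac=0.5, random_state=0)
                          .reset_index(drop=True))

print(df_query_subset_sample)
```

Chaining reset_index(drop=True) after sample() gives the same result as the inplace call above without modifying a dataframe in place.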

Key changes from 0.1.2 to 0.1.4:

  • Switched order of input arguments for ss.update() function:

    From
       ss.update(dataframe, 'sheetname')
    To
       ss.update('sheetname', dataframe)
  • Removed Docker files to simplify
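
Code written against 0.1.2 passes the arguments to update() in the old order. One way to bridge both call styles during a migration is a small wrapper that works out which argument is the sheet name. This is a sketch under assumptions: update_sheet is a hypothetical helper, not part of prospecting, and it relies only on the sheet name being a str, as in the examples above.

```python
def update_sheet(ss, first, second):
    """Call ss.update(sheetname, dataframe) regardless of argument order.

    Hypothetical migration helper, not part of prospecting. Exactly one
    of `first` and `second` must be a str (the sheet name).
    """
    if isinstance(first, str) and not isinstance(second, str):
        sheetname, dataframe = first, second
    elif isinstance(second, str) and not isinstance(first, str):
        sheetname, dataframe = second, first
    else:
        raise TypeError("expected exactly one str argument (the sheet name)")
    # Always forward in the 0.1.4 order: sheet name first, data second.
    return ss.update(sheetname, dataframe)
```

Once all call sites use the new order, the wrapper can be dropped in favour of calling ss.update('sheetname', dataframe) directly.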



