Marketing toolset using Pandas & Google Sheets API, with classes for a variety of other Google APIs

Project description

Read and write Google Sheets as Pandas dataframes. Useful when ‘prospecting’ in analytics work and hacking.


Setup

  1. Create or select a project in Google’s developer console
    • Enable the APIs you plan to use for that project
  2. Download a client_secrets.json credentials file from the console's credentials section
  3. Load the prospecting module in a Python session to initialize the ~/.prospecting/ folder in your home directory
  4. Place the client_secrets.json file in the ~/.prospecting/credentials/ directory
  5. Load an API class in a Python session, then run apiclass.authenticate() and follow the prompts
    • You only need to set up authentication once per API, unless the credentials change
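
Before running authenticate(), it can help to confirm that the credentials file is where step 4 puts it. The following is a minimal sketch using only the standard library. The helper name find_client_secrets is hypothetical and not part of prospecting; the directory layout is taken from the steps above.

```python
from pathlib import Path
from typing import Optional


def find_client_secrets(base_dir: Optional[Path] = None) -> Optional[Path]:
    """Return the path to client_secrets.json if it exists, else None.

    Hypothetical helper, not part of the prospecting package. The default
    location mirrors the ~/.prospecting/credentials/ directory from the
    setup steps.
    """
    if base_dir is None:
        base_dir = Path.home() / ".prospecting"
    candidate = base_dir / "credentials" / "client_secrets.json"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    path = find_client_secrets()
    if path is None:
        print("client_secrets.json not found; see setup step 4")
    else:
        print(f"Found credentials at {path}")
```

If the file is missing, repeat steps 2 to 4 before calling authenticate().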

Examples:

import prospecting as p

Use a stats sheet to store summary statistics (scopelist defaults to read-only, so pass scopes for writing):

ss_stats = p.SheetsApi(spreadsheetid='PASTE_GOOGLE_SHEETID_HERE',
                       scopelist=['https://www.googleapis.com/auth/spreadsheets',
                                  'https://www.googleapis.com/auth/drive.metadata'])
ss_stats.authenticate()
ss_stats.update('Sheet1', somedataframe)
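
Since writing needs broader scopes than reading, one way to keep the choice explicit is a small helper that returns the scope list for each case. This is only a sketch: build_scopelist is a hypothetical name, and the read-only URL is Google's standard Sheets read-only scope. Whether prospecting's own default uses exactly that URL is an assumption.

```python
from typing import List

# Scope URLs defined by Google for the Sheets and Drive APIs.
SHEETS_READONLY = "https://www.googleapis.com/auth/spreadsheets.readonly"
SHEETS_READWRITE = "https://www.googleapis.com/auth/spreadsheets"
DRIVE_METADATA = "https://www.googleapis.com/auth/drive.metadata"


def build_scopelist(write: bool = False) -> List[str]:
    """Return scopes for reading only, or for reading and writing.

    Hypothetical helper, not part of prospecting. The write case matches
    the scopelist passed in the examples on this page.
    """
    if write:
        return [SHEETS_READWRITE, DRIVE_METADATA]
    return [SHEETS_READONLY]
```

The result could then be passed as the scopelist argument, for example scopelist=build_scopelist(write=True).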

Use a reference sheet to provide a named entity list (or stopwords, vocabulary) for NLP preprocessing:

ss_reference = p.SheetsApi(spreadsheetid='PASTE_GOOGLE_SHEETID_HERE',
                           scopelist=['https://www.googleapis.com/auth/spreadsheets',
                                      'https://www.googleapis.com/auth/drive.metadata'])
ss_reference.authenticate()
named_entity_list = list(ss_reference.get('ne!A:B').iloc[:,0].values)
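
Once the reference list is a plain Python list, it can be used in a preprocessing step. The sketch below filters tokens against such a list, as one might with stopwords. The function name drop_listed_tokens and the sample data are made up for illustration; they stand in for values read from the sheet.

```python
from typing import Iterable, List


def drop_listed_tokens(tokens: Iterable[str], reference: Iterable[str]) -> List[str]:
    """Remove tokens that appear in the reference list (case-insensitive)."""
    # Normalize the reference entries once so each lookup is a set membership test.
    blocked = {str(item).strip().lower() for item in reference}
    return [tok for tok in tokens if tok.lower() not in blocked]


# Stand-in for the list read from the reference sheet.
reference_list = ["Google", "Pandas"]
tokens = "google sheets with pandas dataframes".split()
print(drop_listed_tokens(tokens, reference_list))
# prints ['sheets', 'with', 'dataframes']
```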

Get a keywords sheet as a dataframe, filter it, take a random sample, and upload the new dataframe to another tab in the same spreadsheet:

ss_kw = p.SheetsApi(spreadsheetid='PASTE_GOOGLE_SHEETID_HERE',
                    scopelist=['https://www.googleapis.com/auth/spreadsheets',
                               'https://www.googleapis.com/auth/drive.metadata'])
ss_kw.authenticate()

#  Get data using spreadsheet syntax like ('sheetname') or ('sheetname!A:B25')
df_query = ss_kw.get('queries')
df_query_subset = df_query[(df_query['raw_len'] > 1) &
                           (df_query['reject'] != 1)]

#  Take a subsample of data
df_query_subset_sample = df_query_subset.sample(frac=0.5)
df_query_subset_sample.reset_index(drop=True, inplace=True)

#  Update 'sheetname' with dataframe object
ss_kw.update('queries_shuffled', df_query_subset_sample)
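
The range strings passed to get() above follow spreadsheet syntax: a sheet name, optionally followed by "!" and a cell range. A minimal sketch of how such a string splits into its two parts is shown here. split_range is a hypothetical helper, not something prospecting provides, and it does not handle quoted sheet names that themselves contain "!".

```python
from typing import Optional, Tuple


def split_range(a1_range: str) -> Tuple[str, Optional[str]]:
    """Split 'sheetname!A:B25' into ('sheetname', 'A:B25').

    A bare 'sheetname' returns ('sheetname', None).
    """
    sheet, sep, cells = a1_range.partition("!")
    return sheet, (cells if sep else None)


print(split_range("queries"))          # prints ('queries', None)
print(split_range("sheetname!A:B25"))  # prints ('sheetname', 'A:B25')
```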

Key changes between 0.1.2 and 0.1.4:

  • Switched order of input arguments for ss.update() function:

    From
       ss.update(dataframe, 'sheetname')
    To
       ss.update('sheetname', dataframe)
    
  • Removed Docker files to simplify
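
Code written against the old ss.update() argument order will break after upgrading. One possible stopgap, sketched here under the assumption that the sheet name is always a string and the data object never is, is to normalize the two arguments before calling update(). The name normalize_update_args is hypothetical.

```python
from typing import Any, Tuple


def normalize_update_args(first: Any, second: Any) -> Tuple[str, Any]:
    """Return (sheetname, data) whichever order the caller used."""
    if isinstance(first, str) and not isinstance(second, str):
        return first, second  # already in the new order
    if isinstance(second, str) and not isinstance(first, str):
        return second, first  # old order: swap
    raise TypeError("expected exactly one sheet name (str) and one data object")
```

A call such as ss.update(*normalize_update_args(dataframe, 'sheetname')) would then work with either ordering at the call site. Updating the call sites directly is the cleaner long-term fix.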



Download files


Files for prospecting, version 0.1.5
Filename: prospecting-0.1.5-py3-none-any.whl (19.9 kB)
File type: Wheel
Python version: py3
