Skip to main content

Package for generating Google slide decks

Project description

logo

Table of contents

What does this do

This is a python application to generate Google Slide decks using Google Sheets for the data source. It is specifically tailored for the item data maintained by Artemis Book Sales.

Given a Google Sheet with the appropriate data, this application will create a Google Slide deck containing a title slide and one slide per item with the appropriate image and text data formatted on each slide.

Installation

You need Python 3.7 or greater and the pip package management tool installed. These instructions also assume you have Git installed for source code management.

Clone Repo

cd desired/development/location
git clone https://gitlab.com/johnduarte/artemis_slide_generator.git
cd artemis_slide_generator

Setup Python environment

Create a python virtual environment named pyvenv at the root of the repository checkout and then activate it. With the environment activated, install the package to enable the CLI.

On Windows, open the Command shell or PowerShell and execute the following commands to set up the virtual environment.

In Windows Command Shell:

cd Development\Location\artemis_slide_generator
python3 -m venv .\pyvenv
pyvenv\Scripts\activate.bat
python -m pip install --upgrade pip
pip install --editable .

In Windows PowerShell:

cd Development\Location\artemis_slide_generator
python3 -m venv .\pyvenv
.\pyvenv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install --editable .

On *Nix, open the shell of your choice (the commands below use bash) and execute the following commands to set up the virtual environment.

cd development/location/artemis_slide_generator
python3 -m venv ./pyvenv
source pyvenv/bin/activate
python -m pip install --upgrade pip
pip install --editable .

Setup PowerShell activation script

Assumptions:

  • Python is installed
  • PowerShell is installed

Enable Local Scripts

PowerShell execution policy needs to be updated in order to add scripts to be run. We will set the policy to allow locally created scripts but deny execution for scripts downloaded from the internet.

See Microsoft PowerShell Exectution Policies.

PS> Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned

Ensure Profile

See if a PowerShell profile is defined:

PS> $profile

The above will return false or a file path. false means the profile is not defined. If a file path is returned it may not exist yet. Check to see if the profile path exists.

PS> type $profile

If the above returns a type: cannot find path '<PATH>' because it does not exist error, then the profile file does not exist.

Only if the profile is not defined or the profile file does not exist, Create the file.

PS> ni -Force $profile

Create Profile content

Find the full path to your artemis_slide_generator python virtual environment.

Open the profile file in notepad

PS> notepad $profile

Add the following content to your profile

# Microsoft.PowerShell_profile.ps1

Set-Alias python-sys  $(python -c 'import sys; print(sys.executable)')

function activate-artemis-pyvenv {
& 'C:\Path\to\artemis_slide_generator\pyvenv\Scripts\Activate.ps1'
}

Restart your PowerShell session to load the updated profile.

Activate the artemis Python Virtual Environment

activate-artemis-pyvenv

Google API

The Google API client library for Python is required. See the Google Slides API Quickstart for instructions on setting up API authorization.

Important: The setup above will produce a credentials.json file needed to authenticate to the Google Slides and Sheet API. You will need to copy/move it to the root of the repository checkout prior to running any commands.

  • Save the credentials.json file to the project directory.
  • Modify the file permissions on the token to make it READ-ONLY.

Google Cloud

Using Google Cloud Storage requires a private service account key as an authentication mechanism. Signed URLs will be generated to put the images into the slides. This requires a service account and service account key file.

Enable Google Cloud Storage API

  • In the Google Cloud account dashboard, click 'APIs & Services' from the navigation menu.
  • Click '+ ENABLE APIS AND SERVICES'
  • Search for 'cloud storage'
  • Enable Google Cloud Storage JSON API
  • Enable Google Cloud Storage

Create Service Account

  • In the Google Cloud account dashboard, select 'APIs & Services' from the navigation menu. Then click 'Credentials' in the sub-menu.
  • Click '+ CREATE CREDENTIALS'
  • Choose 'Service Account' from the drop-down menu.
  • Name the account and click 'Create'.
  • Choose 'Owner' from the 'Basic' roll in the 'Grant access' section.
  • Click 'Continue'
  • Click 'Done'

Create Service Account Key

  • Open the service account from the Credentials list in the Google Cloud account.
  • Click 'KEYS'
  • Click 'ADD KEY'
  • Click 'CREATE NEW KEY'
  • Choose JSON format
  • Click 'CREATE'
  • Save the key to the project directory.
  • Modify the file permissions on the key to make it READ-ONLY.

Important: The application will not work as expected unless the Google Cloud key is present in the root of the repository checkout.

Development Note: Authentication from within python

Ensure that your python venv environment is setup and that the google-cloud-storage library is installed. Also, ensure that the Google Cloud service account key is stored in the project directory.

Putting the following lines in a python file will authenticate to Google Cloud Storage.

import os
from google.cloud import storage

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '<NAME-OF-KEY-FILE>'
client = storage.Client()

Refs:

Create .env for needed environment variables

The .env.example file contains all of the environment variables needed by the application. In order for the application to see them it needs to be copied to .env.

Copy the provided .env.example to .env in the root of the repository checkout.

You then need to update the values as necessary to work for your situation. The values supplied are as follows:

ASG_UPLOAD_SOURCE=downloaded_images
GOOGLE_CLOUD_BUCKET=my_bucket
GOOGLE_CLOUD_BUCKET_PREFIX=my_bucket_prefix
GOOGLE_CLOUD_KEY_FILE=my_cloud_key.json
ASG_SCRAPED_DATAFILE=scraped_items.json
ASG_SAVED_IMAGES_DIR=downloaded_images
ASG_VENDOR_DATAFILE=vendors.json
# For pytest integration testing
ASG_TEST_SHEET_ID="GOOGLE_SHEET_ID_HERE"
ASG_TEST_SHEET_TAB="GOOGLE_SHEET_TAB_HERE"

The environment variables starting with "GOOGLE_CLOUD_" must be changed to reflect the Google Cloud bucket, prefix, and key file associated with the Google Cloud Storage API service account key. The environment variables starting with "ASG_" can be left as they are. However, if you prefer other names for these, you may change them.

Final Setup Checklist

Once you have completed all of the above setup steps, run through the following checklist to make sure you are ready to proceed.

  • Google API credentials.json is located at root of checkout.
  • Google Cloud Service key is located at root of checkout. .env has been updated with valid values for the following:
    • GOOGLE_CLOUD_BUCKET - This should be set to a valid cloud bucket.
    • GOOGLE_CLOUD_BUCKET_PREFIX - This should be set to a valid cloud bucket prefix.
    • GOOGLE_CLOUD_KEY_FILE - This should be set to the name of your Service key file.
  • artemis_sg --help - results in usage text showing the defined subcommands.

How do I run it

For each work session, you will need to activate the python virtual environment prior to executing any commands. Once the environment is activated, you can execute the Slide Generator Workflow as outlined below or run any of the commands independently as needed.

Session Setup

Session setup comprises the following steps:

  • Change directories to the root of the artemis_slide_generator repository checkout.
  • Activate the previously created python virtual environment.
  • Pull the latest changes with Git.
  • Install the new version with pip.

On Windows, open the Command shell or PowerShell (the commands below use the Command shell) and execute the following commands to set up the session.

In Windows Command Shell:

cd Development\Location\artemis_slide_generator
pyvenv\Scripts\activate.bat
git pull
pip install --editable .

In Windows PowerShell:

cd Development\Location\artemis_slide_generator
.\pyvenv\Scripts\Activate.ps1
git pull
pip install --editable .

On *Nix, open the shell of your choice (the commands below use bash) and execute the following commands to set up the session.

cd development/location/artemis_slide_generator
source pyvenv/bin/activate
git pull
pip install --editable .

Slide Generator Workflow

In order to produce a slide deck, the following workflow is prescribed. These elements are broken into separate components so that they might be executed without a defined pipeline if needed.

The package includes a set of subcommands under the unified CLI command artemis_sg once it is installed to facilitate this workflow. See the complete CLI Command Reference for more detail on each of these commands.

Steps in the workflow that are a manual task not handled by the software are highlighted with the Manual tag.

Create Spreadsheet

Manual

Create spreadsheet that includes the field titles in row 1 and the desired slide records in subsequent rows. The spreadsheet must include a column for ISBN numbers. The ISBN numbers are assumed to be in the ISBN-13 format. Make a note of the location of this spreadsheet in your file system and the name of the worksheet used. You may want to use this information in the spreadsheet images workflow.

Upload Spreadsheet to Google Drive

Manual

Upload the spreadsheet to Google Drive using a web browser to access the Google Drive interface.

The spreadsheet will be uploaded to the root of your Google Drive.

Convert Spreadsheet to Google Drive

Manual

Convert the spreadsheet to a Google Sheets document using a web browser to access the Google Drive interface.

A Google Sheets document will be created and opened in a new tab. This document will also be saved to the root of your Google Drive. Feel free to move the Google Sheets and/or the uploaded spreadsheet document wherever you like in your Google Drive file system. We will only be referencing them by ID going forward.

Make a note of the google ID for the Google Sheets document as well as the name of sheet used for the data. The ID can be found in the URI for the document. The ID occurs between the /d/ and the /edit segments of the URI. Here is an example URI for reference.

https://docs.google.com/spreadsheets/d/MY_GOOGLE_SHEET_ID_IS_HERE/edit#gid=267687802

The name of the sheet used for the data within the document is shown toward the lower right corner when working in the Google Sheet document. By default, it is usually named "Sheet1".

Add/Update Vendor

Manual

If this is your first time using this application, copy the vendors.json.template to vendors.json. This will provide you with the format for the database.

On *Nix

cp vendors.json.template vendors.json

On Windows

copy vendors.json.template vendors.json

Open vendors.json in your favorite text editor. If there is not an existing record for the vendor, add one with the following pattern, including the field key used for ISBN numbers.

If there is an existing record, update the spreadsheet id and sheet name.

The format is as follows:

{
  "sample_vendor": {
    "name": "Sample Vendor",
    "isbn_key": "ISBN-13"
  }
}

Scrape Data

Run the artemis_sg scrape command to save the item descriptions and image URLs for each record in the spreadsheet to the file defined by the environment variable ASG_SCRAPED_DATAFILE. The command requires a valid vendor code argument to map to the applicable vendor record updated in the file referenced by the ASG_VENDOR_DATAFILE in the workflow step above. The command also requires a valid Google Sheet ID and the name of the Google Sheet tab in which the item data resides.

Example:

artemis_sg --verbose scrape sample_vendor MY_GOOGLE_SHEET_ID MY_GOOGLE_SHEET_TAB

Download Images

Download images from the scraped data using the artemis_sg download command.

Example:

artemis_sg --verbose download

Upload Images to Google Cloud

Run the artemis_sg upload command to upload previously download images to Google Cloud.

Example:

artemis_sg --verbose upload

Generate Slide Deck

Run the artemis_sg generate command to create a Google Slide deck of the items in the spreadsheet including the description and images gathered by the scrape workflow step. You should set a desired slide title using the --title flag. The command requires a valid vendor code, a valid Google Sheet ID, and the name of the Google Sheet tab in which the item data resides.

Example:

artemis_sg generate --title "Badass presentation" sample_vendor MY_GOOGLE_SHEET_ID MY_GOOGLE_SHEET_TAB

Spreadsheet Images Workflow

In order to produce a spreadsheet with thumbnail images added for items, the following workflow is suggested.

The following steps are shared with the slide generator workflow. These steps are linked to the appropriate step in that workflow rather then duplicating them here.

The unique steps for this workflow are:

Create Thumbnails

Create thumbnail images from previously downloaded images using the artemis_sg mkthumbs command.

Example:

artemis_sg --verbose mkthumbs

Add Thumbnails to Spreadsheet

Create a new Excel spreadsheet containing thumbnail images column and populated with available thumbnails using the artemis_sg sheet-image command. You will need to supply a vendor code. You will also need to supply the local path to the spreadsheet file as well as the name of the worksheet that the data is located in. This information can be noted from the Create Spreadsheet step.

By default, the new Excel file is saved as "out.xlsx" in the directory from which the command was run (typically the root of the repository checkout). The --output option can be used to specify a desired output file. The specification for the --output file can include either an absolute or relative path location for the file as well.

Example:

artemis_sg --verbose sheet-image sample_vendor "C:\Users\john\Documents\spreadsheets\my_spreadsheet.xlsx" "Products"

Example, specifying output file with an absolute file path:

artemis_sg sheet-image ^
--output "C:\Users\john\Documents\spreadsheets\my_new_spreadsheet.xlsx" ^
sample_vendor ^
"C:\Users\john\Documents\spreadsheets\my_spreadsheet.xlsx" ^
"Products"

Example, specifying output file with a relative file path:

artemis_sg sheet-image ^
--output "..\..\my_new_spreadsheet.xlsx" ^
sample_vendor ^
"C:\Users\john\Documents\spreadsheets\my_spreadsheet.xlsx" ^
"Products"

CLI Command Reference

The Artemis Slide Generator consists of a single artemis_sg command with many subcommands.

Artemis_sg usage: artemis_sg [OPTIONS] COMMAND [ARGS]...

The artemis_sg command provides optional --verbose and --debug options to increase the level of feedback provided during execution.

A --help option is available to display command usage and available subcommands.

artemis_sg --help

All artemis_sg subcommands also provide a --help option to display usage for the given subcommand.

Example:

artemis_sg scrape --help

Artemis_sg scrape

The artemis_sg scrape command iterates over the item rows in the spreadsheet provided by the SHEET_ID and SHEET_TAB arguments. For each ISBN, it searches for item descriptions and images in a web browser. It collects this information and stores it in the file defined by the environment variable ASG_SCRAPED_DATAFILE. If data for an ISBN already exists in the datafile, the ISBN is skipped and does not result in re-scraping data for that record.

Artemis_sg scrape usage: artemis_sg scrape [OPTIONS] VENDOR SHEET_ID SHEET_TAB

The command requires a VENDOR argument be passed to it. This argument is a vendor key that should match a vendor key in the datafile referenced by the environement variable ASG_VENDOR_DATAFILE. This key is used to look up the vendor record in the ASG_VENDOR_DATAFILE to find the appropriate ISBN_key associated with the vendor's data in their spreadsheets.

The command also requires SHEET_ID and SHEET_TAB arguments. These reference the Google Sheet ID and workbook within it to access the item rows on which to conduct its work.

The command utilizes environment variables stored in .env to set the vendor database from ASG_VENDOR_DATAFILE and scraped items database from ASG_SCRAPED_DATAFILE.

Artemis_sg download

The artemis_sg download command iterates over the data records in the file defined by the environment variable ASG_SCRAPED_DATAFILE. For each record, it downloads the image files associated with the record to a local directory as defined by the environment variable ASG_SAVED_IMAGES_DIR.

Artemis_sg download usage: artemis_sg download [OPTIONS]

The command takes no arguments. Other then --help, no additional options are available.

Artemis_sg upload

The artemis_sg upload command uploads the files in the directory defined by the environment variable ASG_UPLOAD_SOURCE to the Google Cloud bucket defined by the environment variable GOOGLE_CLOUD_BUCKET. Only the first level of the source directory is uploaded. Subdirectories of the source directory are not traversed for the upload. All uploaded files are prefixed with value defined by the environment variable GOOGLE_CLOUD_BUCKET_PREFIX.

Artemis_sg upload usage: artemis_sg upload [OPTIONS]

The command takes no arguments. Other then --help, no additional options are available.

Artemis_sg generate

The artemis_sg generate command generates a Google Slides document.

The slide deck will be given a title based on the values supplied for VENDOR and --title. The title slide will be in the following format:

Artemis Book Sales Presents...
Vendor Name, Title

The artemis_sg generate command then iterates over the item rows in the spreadsheet/tab provided as SHEET_ID and SHEET_TAB arguments. For each row in the spreadsheet for which it has image data it creates a slide containing the spreadsheet data, the description saved in the file defined by the environment variable ASG_SCRAPED_DATAFILE, and the images saved in the GOOGLE_CLOUD_BUCKET. The Google sheet will be saved to the root of the Google Drive associated with the credentials created during initial installation.

Artemis_sg generate usage: artemis_sg generate [OPTIONS] VENDOR SHEET_ID SHEET_TAB

The command requires a VENDOR argument be passed to it. This argument is a vendor key that should match a key in the datafile referenced by the environment variable ASG_VENDOR_DATAFILE. This key is used to look up the vendor record in the ASG_VENDOR_DATAFILE to find the appropriate ISBN_key associated with the vendor's data as well as the vendor's name.

The command requires a SHEET_ID argument to be passed to it. This argument is Google Sheet ID whose ISBN column label matches that of the vendor's ISBN_key.

The command requires a SHEET_TAB argument to be passed to it. This argument is the workbook tab label in the given SHEET_ID that contains the data that the command should access to conduct its work.

The command utilizes environment variables stored in .env to set the vendor database from ASG_VENDOR_DATAFILE and scraped items database from ASG_SCRAPED_DATAFILE.

Options

The command provides optional options for overriding default values used during execution.

  • -t or --title: Sets the slide deck title used on the first slide in the deck. This option should normally be set to prevent the default value of "New Arrivals" from being used.

Artemis_sg mkthumbs

The artemis_sg mkthumbs command creates thumbnail images from images located in a given directory. These thumbnail images are saved to a thumbnails subdirectory in the original image directory. These files are given the same names as their originals. By default, the command will use the directory defined by the environment variable ASG_SAVED_IMAGE_DIR. All thumbnails are 130 pixels x 130 pixels in size.

For example, given the default value of downloaded_images for ASG_SAVED_IMAGE_DIR with a single image in it, running the command would result in the following layout.

downloaded_images/
├── 9780228101208.jpg
└── thumbnails
    └── 9780228101208.jpg

Artemis_sg mkthumbs usage: artemis_sg mkthumbs [OPTIONS]

Options

The command provides optional options for overriding default values used during execution.

  • --image-directory: An alternate image directory to use for generating thumbnails should you wish to use it on a set of images other than those in the ASG_SAVED_IMAGE_DIR.

Artemis_sg sheet-image

The artemis_sg sheet-image command modifies a local Excel spreadsheet file to include thumbnail images in the second column for items in which local thumbnail image files are available and saves it as a new file. By default, the thumbnail images are assumed to be in $ASG_SAVED_IMAGE_DIR/thumbails. By default, the new Excel file is saved as out.xlsx.

Artemis_sg sheet-image usage: artemis_sg sheet-image [OPTIONS] VENDOR WORKBOOK WORKSHEET

The command takes required VENDOR, WORKBOOK, and WORKSHEET arguments in that order. The VENDOR is needed to look up the proper ISBN field name. The WORKBOOK argument is the path to the Excel spreadsheet document to which you want to add images. The WORKSHEET argument is the name of the worksheet within the Excel WORKBOOK file which contains the data to which you want to add images.

Options

The command provides optional options for overriding default values used during execution.

  • --output FILE: Write the modified Excel file to a specified output file instead of the default out.xlsx.

Example:

artemis_sg --debug sheet-image --output Spreadsheet_with_images.xlsx sample_vendor Test_Spreadsheet.xlsx "To Slides"

Artemis_sg order

The artemis_sg order command populates the website cart for a given vendor with items from a local Excel spreadsheet file. The browser instance with the populated cart is left open for the user to review and manually complete the order. The user will be asked to manually login during the execution of this command.

Artemis_sg sheet-image usage: artemis_sg order [OPTIONS] VENDOR WORKBOOK WORKSHEET

The command takes required VENDOR, WORKBOOK, and WORKSHEET arguments in that order. The VENDOR is needed to invoke the proper website for adding items to, as well as to look up the proper ISBN field name from the spreadsheet. The WORKBOOK argument is the path to the Excel spreadsheet document in which the items to be ordered are located. The WORKSHEET argument is the name of the worksheet within the Excel WORKBOOK file which contains the order data.

Options

The command provides optional options for overriding default values used during execution.

  • --email EMAIL: Use the provided email to impersonate a TB customer.

Example:

artemis_sg order --email foo@example.org sample_vendor Spreadsheet.xlsx Sheet1

NOTE: The WORKBOOK is expected to contain a column named Order which contains the quantity of the items to add to the cart.

NOTE: The browser opened by this command is controlled by this command. The browser will automatically close and the session will be terminated at the end of the defined waiting period. If the web order has not been completed by the end of the waiting period, the cart will be lost.

Testing

Pytest is used for testing. All tests are in the tests directory. The full test suite can be run with the following command.

pytest

Some of the tests are full integration tests that assume connections to the internet and a Google Cloud account. The full integration tests need access to a Google Sheet. The sheet for these tests should be defined in .env using the following variables. The Google sheet should also have a small number of records in it and the ISBN column should use the heading "ISBN-13". These tests will generate a slide deck on the authenticated account. Such slide decks will need to be manually deleted since the application does not have permission to do so.

ASG_TEST_SHEET_ID="GOOGLE_SHEET_ID_HERE"
ASG_TEST_SHEET_TAB="GOOGLE_SHEET_TAB_HERE"

The full integration tests are time consuming and can be skipped using the following command.

pytest -m 'not integration'

Release Steps (WIP)

The release steps are executed using the hatch Python project manager.

Hatch will dynamically create the version number.

Lint

Ensure that the code passes the lint checks with the following command.

hatch run lint

Test

All "not integration" tests must pass before building a package release candidate. Run the following command and verify that ALL tests pass.

hatch run test

Build

Build source and built distribution packages and remove previous builds with the following command.

hatch build -c

Publish

This assumes publishing to pypi.org. An API token will be necessary to authorize publication.

Publish the build using the following command. You can set the user and auth via the environment variables HATCH_INDEX_USER and HATCH_INDEX_AUTH respectively if you don't want to pass them on the command line.

hatch publish --user __token__ --auth pypi-<SECRET_TOKEN>

If publishing to test.pypi.org add the --repo test option to the publish command.

hatch publish -r test -u __token__ -a pypi-<SECRET_TOKEN>

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

artemis_sg-0.2.5.tar.gz (68.0 kB view hashes)

Uploaded Source

Built Distribution

artemis_sg-0.2.5-py3-none-any.whl (47.6 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page