Summit Public Schools Snowflake warehouse

These details have not been verified by PyPI

Project links

Homepage

Project description

spswarehouse

1.X Changes

Major Changes

table_utils.upload_to_warehouse has been deprecated. Instead the Warehouse class now has a varietey of native upload_<data> functions that do not require reflecting the target table.
The available Warehouse upload functions are
- Warehouse.upload_df
- Warehouse.upload_google_drive_csv
- Warehouse.upload_google_sheet
- Warehouse.upload_local_csv
Differences between the new upload functions and the old upload_to_warehouse
- These functions take table and schema as two separate arguments instead of reflected_table.
- upload_df does not acccept the sep argument (since the sep argument didn't do anything with dataframes anyways)
- Each function only accepts its listed data type (e.g., upload_df only accepts dataframe, and does not accept csv_filename)
- All other arguments are the same (batch_size, encoding, etc.)
When using table_utils.create_table_stmt and the Warehouse.upload_<data> functions, first, symbols are sanitized to underscores. Then consecutive underscores are collapsed into a single underscore.
- Additionally, it now uses re.compile('[\W_]+') as the basis for replacement, rather than a custom list of symbols (making it consistent with how Summit's Airflow server behaves)
Upgrade to gspread 6.X. The 6.X update of gspread reversed the default argument order of several functions. It's advised that you name arguments instead of relying on position.
The update to SQLAlchemy 2.X significanatly changes how the Warehouse class queries things. The Warehouse class abstracts most of this, but if you were calling Warehouse.engine or Warehouse.conn directly, your code may break.
The helper_ functions were removed from selenium.py. Simply remove the helper_ prefix from the function name to fix.

Minor changes

Most likely, you were upgraded to numpy>=2.0.0 as part of the package update, which may break some minor things.
You know longer have to manually install requirements; the requirements install is built into the spswarehouse install

Prerequisites

Anaconda & Python 3
Jupyter Notebook

Installation

To install, run: pip install spswarehouse
- This can be done from Anaconda Prompt from the Start Menu.
Locate the install directory by running: pip show pip | grep "Location:" | cut -d " " -f2
- If this doesn't work, run pip show pip, then look at the line "Location:".
- Take note of the install directory for the "Set up credentials" step.

The files referred to in this README are in <install-directory>/spswarehouse/.

Updating to new version

When a new version of spswarehouse is released, there are two steps:

pip install --upgrade spswarehouse
Redo the "Set up dependencies" section.

Set up credentials

Navigate to the install directory.
- The default directory where this module is installed is Users\<your name>\Anaconda3\Lib\site-packages\spswarehouse.
- If you are using a custom environment, the directory will probably be Users\<your name>\Anaconda3\envs\<env name>\Lib\site-packages\spswarehouse.
Copy the credentials.py.template file to credentials.py.
Fill in credentials.py with the warehouse information and the Google Service Account information provided by your technical admin.

Snowflake

This allows you to access the Snowflake data warehouse.

Fill in your Snowflake user and password credentials between quotation marks.

Google

This allows you to access your Google Apps.

Fill in all the blank fields in google_config.service_account. See Developer Notes below if you need to generate credentials.

Usage

Snowflake

Your Snowflake connection is configured in credentials.py (see above).

Snowflake access is implemented in by Warehouse. You can:

Read data using read_sql()
Reflect a table using reflect_table()
Run a SQL command using execute()

Uploading data

From Jupyter Notebook, open snowflake-upload-example.ipynb for a basic example.

Column types

create_table_stmt() will try to guess column types when given a DataFrame, CSV file, or Google Sheet.

If you want to explicitly name and type your columns, you can pass in the columns argument instead.

Alternatively, if you want to force all columns to be strings, pass force_string=True. This works for both create_table_stmt() and Warehouse.upload_<data_type>. This does not work if you pass a dataframe.

See the documentation for guess_col_types() for best practices for types.

Google Functions

GoogleDrive, GoogleSheets, GoogleSlides

Make sure you've set up credentials.py first and shared your spreadsheet with the Google service account email. You can also get the email by running any of the following:

GoogleSheets.get_google_service_account_email()
GoogleDrive.get_google_service_account_email()
GoogleSlides..get_google_service_account_email()

GoogleSheets is really an instance of gspread.Client, so you can use the entire gspread Python API.

GoogleDrive is an instance of pydrive2.GoogleDrive, so you can use the PyDrive2 Python API.

GoogleSlides builds directly on the Google Slides API (https://developers.google.com/resources/api-libraries/documentation/slides/v1/python/latest/)

Accessing data

From Jupyter Notebook, open and run googlesheets-example.ipynb for a basic example on loading a spreadsheet and reading sheet data into pandas.DataFrame.

googledrive-example.ipynb contains basic examples of exploring Drive via Jupyter. Note that this class only handles files uploaded to Drive; it's not useful for handling Google Sheets, Google Docs, etc.

Uploading to warehouse

From Jupyter Notebook open and run snowflake-upload-example.ipynb for a basic example on uploading Google Sheet data to the Snowflake warehouse.

Column types

Developer notes

Google service account key

This lets us use the Google Sheets API to access sheet data. It only has to be done once and added to credentials.py.template.

Use an existing Google Developer project, or create a new one: https://console.cloud.google.com
Enable the Google Sheets API
- Go to API & Services for the project, then Libraries.
- Search for "Google Sheets" and select the result.
- Click Enable.
Create the OAuth client credentials
- Go to API & Services for the project, then Credentials.
- Under Create credentials, select Service account key
- Choose an existing service account or create a new one to associate this key with.
- Create the key and download the key as a JSON file.
Copy OAuth client credentials to credentials.py.template in google_client under service-account.
Delete the private_key and leave just the quotation marks when you check in credentials.template.py.
You will need to distribute the private key securely so it can be added to credentials.py.

PyPI

We use PyPI to distribute the spswarehouse module and Test PyPI for testing.

The spswarehouse project is here.

Set up

Create PyPI and Test PyPI accounts to test and upload packages.

Packaging

See https://packaging.python.org/tutorials/packaging-projects/ for an overview and walk-through of PyPI packaging.

Specifics for spswarehouse:

Only build the sdist package. Otherwise, credentials.py and potentially passwords will get distributed in the binary distribution.
If you need to include non-Python files, add them to MANIFEST.in.

Testing

Update version number in setup.py.
Create the package: python setup.py sdist
Upload to Test PyPI: python -m twine upload --repository-url https://test.pypi.org/legacy/ dist/*
Install on local machine to test: pip install spswarehouse==<insert version number> --no-build-isolation -i https://test.pypi.org/simple/
TODO: Figure out how to do this without needing the --no-build-isolation flag

Pushing a new package

Make sure all of your changes are checked into the GitHub repository and your local repository is up-to-date before you do this.

The steps are the same as in the above section, omitting the test.pypi URLs.

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

This version

1.1.0

Apr 23, 2026

1.0.0

Mar 31, 2026

0.8.0

Apr 27, 2026

0.7.6

Jan 6, 2025

0.7.5

Dec 20, 2024

0.7.4

Jul 29, 2024

0.7.3

Jul 16, 2024

0.7.2

Jul 10, 2024

0.7.1

Jun 17, 2024

0.7.0

Jun 12, 2024

0.6.3

Apr 8, 2024

0.6.2

Apr 3, 2024

0.6.1

Nov 14, 2023

0.6.0

Nov 7, 2023

0.5.2

Oct 24, 2023

0.5.1

Oct 23, 2023

0.5.0

Sep 21, 2023

0.4.7

Sep 7, 2023

0.4.6

Sep 6, 2023

0.4.5

Aug 29, 2023

0.4.4

Aug 21, 2023

0.4.3

Aug 14, 2023

0.4.2

Aug 13, 2023

0.4.1

Aug 13, 2023

0.4.0

Aug 13, 2023

0.3.3

Jun 15, 2023

0.3.2

Jun 5, 2023

0.3.1

May 25, 2023

0.3.0

May 25, 2023

0.2.0

May 17, 2023

0.1.1

May 9, 2023

0.1.0

May 8, 2023

0.0.9.0

Jan 4, 2023

0.0.8.0

Mar 9, 2021

0.0.7.6

May 7, 2020

0.0.7.5

May 7, 2020

0.0.7.4

May 7, 2020

0.0.7.3

Feb 26, 2020

0.0.7

Feb 20, 2020

0.0.6

Nov 20, 2019

0.0.4

Nov 14, 2019

0.0.3

May 23, 2019

0.0.2

May 23, 2019

0.0.1

May 3, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spswarehouse-1.1.0.tar.gz (62.9 kB view details)

Uploaded Apr 23, 2026 Source

File details

Details for the file spswarehouse-1.1.0.tar.gz.

File metadata

Download URL: spswarehouse-1.1.0.tar.gz
Upload date: Apr 23, 2026
Size: 62.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for spswarehouse-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`4719eed3217a5edc957b4ad78bfce8937b42c04d829a5b0e3857db77e61d559e`
MD5	`ebb6717a881b4d3794be616a04da6211`
BLAKE2b-256	`2175bf3e5804aabc4bd577b71f294e84fae3b88581b57d9639eebf7dc86c2e96`

See more details on using hashes here.

spswarehouse 1.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

spswarehouse

1.X Changes

Major Changes

Minor changes

Prerequisites

Installation

Updating to new version

Set up credentials

Snowflake

Google

Usage

Snowflake

Uploading data

Column types

Google Functions

GoogleDrive, GoogleSheets, GoogleSlides

Accessing data

Uploading to warehouse

Column types

Developer notes

Google service account key

PyPI

Set up

Packaging

Testing

Pushing a new package

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

File details

File metadata

File hashes