
A wrapper for Google's existing google-cloud Python packages that aims to make using Python inside the Google Cloud framework more intuitive.


ABOUT

gcloudy is a wrapper for Google's GCP Python packages that aims to make interacting with GCP and its services more intuitive, especially for new GCP users. To that end, it adheres to pandas-like syntax for function/method calls.

The gcloudy package is not meant as a replacement for GCP power users' existing workflows, but rather as an alternative for GCP users who are interested in using Python in GCP to deploy Cloud Functions and interact with certain GCP services, especially BigQuery and Google Cloud Storage.

The gcloudy package is built on top of the canonical Google Python packages, without any alteration to Google's base code.

INSTALL, IMPORT, & INITIALIZE

  • gcloudy is installed using pip with the terminal command:

$ pip install gcloudy

  • Once installed, the BigQuery class can be imported from the main GoogleCloud module with:

from gcloudy.GoogleCloud import BigQuery

  • Then, the bq object is initialized with the following (where "gcp-project-name" is your GCP Project ID / Name):

bq = BigQuery("gcp-project-name")

  • NOTE: It is important to also import the Pandas package:

import pandas as pd
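
Putting the pieces above together, a minimal setup script looks like the following (where "gcp-project-name" is a placeholder for your own Project ID):

import pandas as pd
from gcloudy.GoogleCloud import BigQuery

# initialize the BigQuery client for your GCP project
bq = BigQuery("gcp-project-name")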

METHODS

The following sections describe each available method and its usage.

-----------

bq.read_bigquery

- Read an existing BigQuery table into a DataFrame.

read_bigquery(bq_dataset_dot_table = None, date_cols = [], preview_top = None, to_verbose = True)

  • bq_dataset_dot_table : the "dataset-name.table-name" path of the existing BigQuery table
  • date_cols : [optional] column(s) passed inside a list that should be parsed as dates
  • preview_top : [optional] only read in the top N rows
  • to_verbose : should info be printed? defaults to True

EX:

my_table = bq.read_bigquery("my_bq_dataset.my_bq_table")
my_table = bq.read_bigquery("my_bq_dataset.my_bq_table", date_cols = ['date'])
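
The preview_top argument can be added to either call to inspect only the first N rows of a large table, e.g.:

# preview only the top 100 rows
my_preview = bq.read_bigquery("my_bq_dataset.my_bq_table", preview_top = 100)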

-----------

bq.write_bigquery

- Write a DataFrame to a BigQuery table.

write_bigquery(df, bq_dataset_dot_table = None, use_schema = None, append_to_existing = False, to_verbose = True)

  • df : the DataFrame to be written to a BigQuery table
  • bq_dataset_dot_table : the "dataset-name.table-name" path of the BigQuery table to be written to
  • use_schema : [optional] a custom schema for the BigQuery table. NOTE: see bq.guess_schema below
  • append_to_existing : should the DataFrame be appended to an existing BigQuery table? defaults to False (create new / overwrite)
  • to_verbose : should info be printed? defaults to True

EX:

bq.write_bigquery(my_data, "my_bq_dataset.my_data")
bq.write_bigquery(my_data, "my_bq_dataset.my_data", append_to_existing = True)

-----------

bq.guess_schema

- A helper for bq.write_bigquery, passed to its use_schema arg. Creates a custom schema based on the dtypes of a DataFrame.

guess_schema(df, bq_type_default = "STRING")

  • df : the DataFrame to be written to a BigQuery table
  • bq_type_default : the default BigQuery type assigned to columns with dtype 'object'; defaults to "STRING"

EX:

bq.write_bigquery(my_data, "my_bq_dataset.my_data", use_schema = bq.guess_schema(my_data))
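
Since guess_schema builds the schema from the DataFrame's dtypes, it can also be called on its own first, which makes the bq_type_default override easier to see (a short sketch; the "DATE" override is illustrative):

# build a schema up front; 'object' columns default to STRING
my_schema = bq.guess_schema(my_data)

# or assign 'object' columns a different BigQuery type
my_schema = bq.guess_schema(my_data, bq_type_default = "DATE")

bq.write_bigquery(my_data, "my_bq_dataset.my_data", use_schema = my_schema)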

-----------

bq.read_custom_query

- Read the results of a custom BigQuery SQL query into a DataFrame.

read_custom_query(custom_query, to_verbose = True)

  • custom_query : the custom BigQuery SQL query that will produce a table to be read into a DataFrame
  • to_verbose : should info be printed? defaults to True

EX:

my_custom_table = bq.read_custom_query("""
    SELECT
        date,
        sales,
        products
    FROM
        my_bq_project_id.my_bq_dataset.my_bq_table
    WHERE
        sales_month = 'June'
""")

-----------

bq.send_query

- Send a custom SQL query to BigQuery. Note: nothing is returned, as the process is carried out entirely within BigQuery.

send_query(que, to_verbose = True)

  • que : the custom SQL query to be sent and carried out within BigQuery
  • to_verbose : should info be printed? defaults to True

EX:

bq.send_query("""
    CREATE TABLE my_bq_project_id.my_bq_dataset.my_new_bq_table AS 
    (
        SELECT
            date,
            sales,
            products
        FROM
            my_bq_project_id.my_bq_dataset.my_bq_table
        WHERE
            sales_month = 'June'
    )
""")

-----------

bq.read_gcs

- Read a CSV file stored within a Google Cloud Storage (GCS) Bucket into a DataFrame.

read_gcs(gsutil_uri, date_cols = None, to_verbose = True)

  • gsutil_uri : the GCS Bucket path of the existing CSV file
  • date_cols : [optional] column(s) passed inside a list that should be parsed as dates
  • to_verbose : should info be printed? defaults to True

EX:

my_table = bq.read_gcs("gs://my-bucket/my_data.csv")
my_table = bq.read_gcs("gs://my-bucket/my_data.csv", date_cols = ['date'])

-----------

bq.write_gcs

- Write a Pandas DataFrame to a Google Cloud Storage (GCS) Bucket as a CSV.

write_gcs(pandas_df, gsutil_uri, keep_index = False, to_verbose = True)

  • pandas_df : the Pandas DataFrame to be written to a Google Cloud Storage (GCS) Bucket as a CSV
  • gsutil_uri : the GCS Bucket path
  • keep_index : should the DataFrame index be written as well? defaults to False
  • to_verbose : should info be printed? defaults to True

EX:

bq.write_gcs(my_data, "gs://my-bucket/my_data.csv")
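
Taken together, these methods support a simple pandas-style round trip: read a table from BigQuery, aggregate it with Pandas, and archive the result to GCS. A minimal sketch, reusing the date and sales columns from the examples above (the bucket and file names are placeholders):

# read, aggregate by month, and archive as a CSV
sales = bq.read_bigquery("my_bq_dataset.my_bq_table", date_cols = ['date'])
monthly = sales.groupby(sales['date'].dt.month)['sales'].sum().reset_index()
bq.write_gcs(monthly, "gs://my-bucket/monthly_sales.csv")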
