Skip to main content

Wrapper around Google Cloud Platform BigQuery client to simplify management of the BigQuery tables.

Project description

GitHub GitHub Workflow Status (with branch) Coverage PyPI - Downloads PyPI

Introduction

This package is a wrapper around the Google Cloud BigQuery API to simplify management of the BigQuery tables. The specification of the BigQuery tables is realized with the help of YAML file and the package provides the functionality to:

  • instantiate BigQuery table based on YAML specification
  • create BigQuery based on created instance
  • drop BigQuery table
  • truncate BigQuery table
  • load data into BigQuery table from blob in GCS bucket

Installation

pip install surquest-GCP-bq-grid

Getting started

Let's assume that we have a YAML file that specifies BigQuery table as follows:

name: users
desc: Table with all users
labels:
  company: surquest
  application: data-services
clustering_fields:
  - department
time_partitioning:
  field: created_at
  type: DAY
schema:
  - name: id
    desc: ID of the user
    mode: required
    type: INTEGER
  - name: name
    desc: First name and last name of the user
    mode: required
  - name: department
    desc: Description of the user
  - name: height
    desc: Height of the user in centimeters
    type: FLOAT
  - name: roles
    desc: List of roles of the user
    type: STRUCT
    mode: repeated
    fields:
      - name: role
        desc: Role of the user
        mode: required
      - name: description
        desc: Description of the role
  - name: last_login_at
    desc: Date and time when the user last logged in
    type: TIMESTAMP
    mode: REQUIRED
  - name: created_at
    desc: Date and time when the user was created
    type: TIMESTAMP
    mode: NULLABLE
    defaultValueExpression: CURRENT_TIMESTAMP()
  - name: created_by
    desc: User who created the user
    type: STRING
    mode: NULLABLE
    defaultValueExpression: SESSION_USER()
  - name: is_active
    desc: Indicates if the user record is active
    type: BOOLEAN
    mode: NULLABLE
    defaultValueExpression: true

Please note:

  • the default type of the column is string
  • the default mode is nullable
  • the desc is optional and can be omitted

More details about field specification can be fond here: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TableFieldSchema.

Table specified by the above YAML file can be created with the following code:

from surquest.GCP.bq.grid import Grid

# create instance of the Grid
grid = Grid.from_yaml(
    path="path/to/the/yaml/file",
    dataset="dataset_name"
)
grid.exit() # check if table exists
grid.create() # create table in BigQuery
grid.load(
    blob_uri="gs://bucket_name/blob_name",
    mode="WRITE_TRUNCATE",
    format="CSV"
) # load data into BigQuery table from blob in GCS bucket
grid.truncate() # truncate table in BigQuery
grid.drop() # drop table in BigQuery

Following python script create BigQuery table as shown in following screenshot:

BigQuery Table

Local development

You are more than welcome to contribute to this project. To make your start easier we have prepared a docker image with all the necessary tools to run it as interpreter for Pycharm or to run tests.

Build docker image

docker build `
     --tag surquest/gcp/bq/grid `
     --file package.base.dockerfile `
     --target test .

Run tests

docker run --rm -it `
 -v "${pwd}:/opt/project" `
 -e "GOOGLE_APPLICATION_CREDENTIALS=/opt/project/credentials/TEST/key.file.json" `
 -w "/opt/project/test" `
 surquest/gcp/bq/grid pytest

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

surquest_gcp_bq_grid-0.0.1rc7.tar.gz (81.4 kB view details)

Uploaded Source

Built Distribution

surquest_gcp_bq_grid-0.0.1rc7-py2.py3-none-any.whl (6.9 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file surquest_gcp_bq_grid-0.0.1rc7.tar.gz.

File metadata

File hashes

Hashes for surquest_gcp_bq_grid-0.0.1rc7.tar.gz
Algorithm Hash digest
SHA256 af8af20d7c289a44cc2ffb46a516f364fe9e205de4da69a96353ba5d565d4169
MD5 7223cd4a54974830e190e0149b529d2c
BLAKE2b-256 63214169b1a60da15d63af7b7f5fe2a657dad4743a7ddd090d25303dd6ca90d5

See more details on using hashes here.

File details

Details for the file surquest_gcp_bq_grid-0.0.1rc7-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for surquest_gcp_bq_grid-0.0.1rc7-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 a36d6782369af2067f983b6bc40543995d1cd7fc62517a872dbc4dac3e9abc37
MD5 5fb05673eace4cf1b6ce24139ff77240
BLAKE2b-256 bdf68b0cf36d0eb962a8ba7f60e6fc7e1dd1ead6504294710ac4267261ce731e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page