A wrapper around the Google Cloud Platform BigQuery client that simplifies the management of BigQuery tables.


Introduction

This package is a wrapper around the Google Cloud BigQuery API that simplifies the management of BigQuery tables. Each table is specified in a YAML file, and the package provides functionality to:

  • instantiate a BigQuery table object from a YAML specification
  • create the BigQuery table from that instance
  • drop the BigQuery table
  • truncate the BigQuery table
  • load data into the BigQuery table from a blob in a GCS bucket

Installation

pip install surquest-GCP-bq-grid

Getting started

Let's assume we have a YAML file that specifies a BigQuery table as follows:

name: users
desc: Table with all users
labels:
  company: surquest
  application: data-services
clustering_fields:
  - department
time_partitioning:
  field: created_at
  type: DAY
schema:
  - name: id
    desc: ID of the user
    mode: required
    type: INTEGER
  - name: name
    desc: First name and last name of the user
    mode: required
  - name: department
    desc: Department of the user
  - name: height
    desc: Height of the user in centimeters
    type: FLOAT
  - name: roles
    desc: List of roles of the user
    type: STRUCT
    mode: repeated
    fields:
      - name: role
        desc: Role of the user
        mode: required
      - name: description
        desc: Description of the role
  - name: last_login_at
    desc: Date and time when the user last logged in
    type: TIMESTAMP
    mode: REQUIRED
  - name: created_at
    desc: Date and time when the user was created
    type: TIMESTAMP
    mode: NULLABLE
    defaultValueExpression: CURRENT_TIMESTAMP()
  - name: created_by
    desc: User who created the user
    type: STRING
    mode: NULLABLE
    defaultValueExpression: SESSION_USER()
  - name: is_active
    desc: Indicates if the user record is active
    type: BOOLEAN
    mode: NULLABLE
    defaultValueExpression: true

Please note:

  • the default type of the column is string
  • the default mode is nullable
  • the desc is optional and can be omitted

More details about the field specification can be found here: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TableFieldSchema.
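The defaults listed above can be illustrated with a minimal sketch that turns one field entry from the YAML file into a dictionary shaped like the REST TableFieldSchema resource. The helper name `to_table_field_schema` is hypothetical and is not part of this package. The sketch also assumes that `mode` and `type` are case-insensitive (the YAML above mixes `required` and `REQUIRED`) and that `desc` maps to the REST `description` key. It covers only the name, type, mode, description and nested fields.

```python
from typing import Any, Dict


def to_table_field_schema(field: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one YAML field entry into a TableFieldSchema-style dict.

    Hypothetical helper for illustration; not the package's implementation.
    """
    schema: Dict[str, Any] = {
        "name": field["name"],
        # Defaults taken from the notes above: STRING type, NULLABLE mode.
        # Upper-casing is an assumption about case-insensitive input.
        "type": str(field.get("type", "STRING")).upper(),
        "mode": str(field.get("mode", "NULLABLE")).upper(),
    }
    # `desc` is optional; when present it becomes the REST `description`.
    if "desc" in field:
        schema["description"] = field["desc"]
    # Nested fields (e.g. for STRUCT columns) are converted recursively.
    if "fields" in field:
        schema["fields"] = [to_table_field_schema(f) for f in field["fields"]]
    return schema


# The `roles` column from the YAML above, written as a plain Python dict.
roles = {
    "name": "roles",
    "desc": "List of roles of the user",
    "type": "STRUCT",
    "mode": "repeated",
    "fields": [
        {"name": "role", "desc": "Role of the user", "mode": "required"},
        {"name": "description", "desc": "Description of the role"},
    ],
}

converted = to_table_field_schema(roles)
print(converted)
```

In this sketch the nested `role` field ends up with type `STRING` because no type was given, and the nested `description` field ends up with mode `NULLABLE` because no mode was given.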

The table specified by the YAML file above can be created and managed with the following code:

from surquest.GCP.bq.grid import Grid

# create instance of the Grid
grid = Grid.from_yaml(
    path="path/to/the/yaml/file",
    dataset="dataset_name"
)
grid.exit() # check if table exists
grid.create() # create table in BigQuery
grid.load(
    blob_uri="gs://bucket_name/blob_name",
    mode="WRITE_TRUNCATE",
    format="CSV"
) # load data into BigQuery table from blob in GCS bucket
grid.truncate() # truncate table in BigQuery
grid.drop() # drop table in BigQuery
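The `mode` and `format` values passed to `grid.load` above match the `writeDisposition` and `sourceFormat` values of a BigQuery load job. As a rough sketch, the corresponding REST load configuration could be assembled as shown below. The helper `build_load_config` and the project, dataset and table names are hypothetical, and how the package actually forwards these arguments to BigQuery is an assumption.

```python
from typing import Any, Dict

# Values accepted by BigQuery load jobs for these two settings.
WRITE_DISPOSITIONS = {"WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"}
SOURCE_FORMATS = {"CSV", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET", "ORC"}


def build_load_config(
    blob_uri: str,
    project: str,
    dataset: str,
    table: str,
    mode: str = "WRITE_TRUNCATE",
    source_format: str = "CSV",
) -> Dict[str, Any]:
    """Build a REST-style load job configuration (hypothetical helper)."""
    if not blob_uri.startswith("gs://"):
        raise ValueError(f"Expected a gs:// URI, got: {blob_uri}")
    if mode not in WRITE_DISPOSITIONS:
        raise ValueError(f"Unsupported write disposition: {mode}")
    if source_format not in SOURCE_FORMATS:
        raise ValueError(f"Unsupported source format: {source_format}")
    return {
        "load": {
            "sourceUris": [blob_uri],
            "destinationTable": {
                "projectId": project,
                "datasetId": dataset,
                "tableId": table,
            },
            # WRITE_TRUNCATE replaces the existing table data,
            # WRITE_APPEND adds rows, WRITE_EMPTY requires an empty table.
            "writeDisposition": mode,
            "sourceFormat": source_format,
        }
    }


config = build_load_config(
    blob_uri="gs://bucket_name/blob_name",
    project="my-project",  # hypothetical project ID
    dataset="dataset_name",
    table="users",
)
print(config)
```

With `WRITE_TRUNCATE`, each load overwrites the table contents, so rerunning the same load does not duplicate rows.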

The Python script above creates the BigQuery table shown in the following screenshot:

(Screenshot: BigQuery table)

Local development

You are more than welcome to contribute to this project. To make getting started easier, we have prepared a Docker image with all the necessary tools. You can use it as an interpreter for PyCharm or to run the tests.

Build docker image

docker build `
     --tag surquest/gcp/bq/grid `
     --file package.base.dockerfile `
     --target test .

Run tests

docker run --rm -it `
 -v "${pwd}:/opt/project" `
 -e "GOOGLE_APPLICATION_CREDENTIALS=/opt/project/credentials/TEST/key.file.json" `
 -w "/opt/project/test" `
 surquest/gcp/bq/grid pytest
