Wrapper around Google Cloud Platform BigQuery client to simplify management of the BigQuery tables.

Introduction

This package is a wrapper around the Google Cloud BigQuery API that simplifies the management of BigQuery tables. Tables are specified in a YAML file, and the package provides the functionality to:

  • instantiate BigQuery table based on YAML specification
  • create BigQuery table based on the created instance
  • drop BigQuery table
  • truncate BigQuery table
  • load data into BigQuery table from blob in GCS bucket

Installation

pip install surquest-GCP-bq-grid

Getting started

Let's assume that we have a YAML file that specifies a BigQuery table as follows:

name: users
desc: Table with all users
labels:
  company: surquest
  application: data-services
clustering_fields:
  - department
time_partitioning:
  field: created_at
  type: DAY
schema:
  - name: id
    desc: ID of the user
    mode: required
    type: INTEGER
  - name: name
    desc: First name and last name of the user
    mode: required
  - name: department
    desc: Department of the user
  - name: height
    desc: Height of the user in centimeters
    type: FLOAT
  - name: roles
    desc: List of roles of the user
    type: STRUCT
    mode: repeated
    fields:
      - name: role
        desc: Role of the user
        mode: required
      - name: description
        desc: Description of the role
  - name: last_login_at
    desc: Date and time when the user last logged in
    type: TIMESTAMP
    mode: REQUIRED
  - name: created_at
    desc: Date and time when the user was created
    type: TIMESTAMP
    mode: NULLABLE
    defaultValueExpression: CURRENT_TIMESTAMP()
  - name: created_by
    desc: User who created the user
    type: STRING
    mode: NULLABLE
    defaultValueExpression: SESSION_USER()
  - name: is_active
    desc: Indicates if the user record is active
    type: BOOLEAN
    mode: NULLABLE
    defaultValueExpression: true
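
The `clustering_fields` and `time_partitioning` entries refer to columns by name, so a quick consistency check before creating the table can catch typos. The sketch below is a hypothetical helper (`check_table_spec` is not part of the package) that works on the specification once it has been loaded into a plain dictionary. It relies on two BigQuery rules: a time-partitioning column must be a top-level DATE, TIMESTAMP or DATETIME column, and a table can have at most four clustering columns.

```python
PARTITION_TYPES = {"DATE", "TIMESTAMP", "DATETIME"}
MAX_CLUSTERING_FIELDS = 4


def check_table_spec(spec: dict) -> list:
    """Return a list of problems found in a table specification (empty if none)."""
    problems = []
    # Index top-level columns by name; a missing type is treated as STRING.
    columns = {
        field["name"]: str(field.get("type", "STRING")).upper()
        for field in spec.get("schema", [])
    }

    clustering = spec.get("clustering_fields") or []
    if len(clustering) > MAX_CLUSTERING_FIELDS:
        problems.append(f"too many clustering fields: {len(clustering)}")
    for name in clustering:
        if name not in columns:
            problems.append(f"clustering field '{name}' is not in the schema")

    # No partitioning field means there is no column to check.
    name = (spec.get("time_partitioning") or {}).get("field")
    if name is not None:
        if name not in columns:
            problems.append(f"partitioning field '{name}' is not in the schema")
        elif columns[name] not in PARTITION_TYPES:
            problems.append(f"partitioning field '{name}' has type {columns[name]}")
    return problems


# A shortened version of the specification above, as a dictionary.
spec = {
    "name": "users",
    "clustering_fields": ["department"],
    "time_partitioning": {"field": "created_at", "type": "DAY"},
    "schema": [
        {"name": "id", "type": "INTEGER", "mode": "required"},
        {"name": "department"},
        {"name": "created_at", "type": "TIMESTAMP"},
    ],
}
# The same specification with a misspelled partitioning column.
broken = dict(spec, time_partitioning={"field": "created", "type": "DAY"})

print(check_table_spec(spec))
print(check_table_spec(broken))
```

The helper returns messages instead of raising, so all problems in a specification can be reported at once.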

Please note:

  • the default column type is STRING
  • the default column mode is NULLABLE
  • the desc attribute is optional and can be omitted
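
To make these defaults concrete, here is a minimal sketch of how a field specification could be completed before use. The function `apply_defaults` is a hypothetical illustration and not the package's own code. Upper-casing `type` and `mode` is also an assumption, made because the YAML example mixes `required` and `REQUIRED`.

```python
from copy import deepcopy

# Defaults as stated in the notes above.
DEFAULT_TYPE = "STRING"
DEFAULT_MODE = "NULLABLE"


def apply_defaults(field: dict) -> dict:
    """Return a copy of a field spec with `type` and `mode` filled in."""
    result = deepcopy(field)
    result["type"] = str(result.get("type", DEFAULT_TYPE)).upper()
    result["mode"] = str(result.get("mode", DEFAULT_MODE)).upper()
    # Nested fields of a STRUCT column follow the same rules.
    if "fields" in result:
        result["fields"] = [apply_defaults(sub) for sub in result["fields"]]
    return result


# A column with neither type nor mode becomes a nullable string.
department = apply_defaults({"name": "department"})

# A repeated STRUCT column whose nested fields rely on the defaults.
roles = apply_defaults({
    "name": "roles",
    "type": "STRUCT",
    "mode": "repeated",
    "fields": [{"name": "role", "mode": "required"}, {"name": "description"}],
})

print(department["type"], department["mode"])
print(roles["fields"][1]["type"], roles["fields"][1]["mode"])
```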

More details about the field specification can be found here: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TableFieldSchema.
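
As an illustration of how the YAML keys relate to the linked reference, the sketch below translates one field specification into a dictionary shaped like a `TableFieldSchema` resource. The helper `to_field_schema` is hypothetical, and the mapping of `desc` to `description` is an assumption about how the package treats that key. The target keys `name`, `type`, `mode`, `description`, `fields` and `defaultValueExpression` come from the REST reference.

```python
def to_field_schema(field: dict) -> dict:
    """Translate one YAML field spec into a TableFieldSchema-shaped dict."""
    schema = {
        "name": field["name"],
        "type": str(field.get("type", "STRING")).upper(),
        "mode": str(field.get("mode", "NULLABLE")).upper(),
    }
    if "desc" in field:
        schema["description"] = field["desc"]
    if "defaultValueExpression" in field:
        default = field["defaultValueExpression"]
        # A YAML parser reads a bare `true` as a boolean,
        # while the REST resource holds the expression as SQL text.
        if isinstance(default, bool):
            default = "TRUE" if default else "FALSE"
        schema["defaultValueExpression"] = str(default)
    if "fields" in field:
        # STRUCT columns carry their nested columns in `fields`.
        schema["fields"] = [to_field_schema(sub) for sub in field["fields"]]
    return schema


is_active = to_field_schema({
    "name": "is_active",
    "desc": "Indicates if the user record is active",
    "type": "BOOLEAN",
    "mode": "NULLABLE",
    "defaultValueExpression": True,
})
roles = to_field_schema({
    "name": "roles",
    "type": "STRUCT",
    "mode": "repeated",
    "fields": [{"name": "role", "desc": "Role of the user", "mode": "required"}],
})

print(is_active)
print(roles)
```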

The table specified by the YAML file above can be created and managed with the following code:

from surquest.GCP.bq.grid import Grid

# create instance of the Grid
grid = Grid.from_yaml(
    path="path/to/the/yaml/file",
    dataset="dataset_name"
)
grid.exists() # check if table exists
grid.create() # create table in BigQuery
grid.load(
    blob_uri="gs://bucket_name/blob_name",
    mode="WRITE_TRUNCATE",
    format="CSV"
) # load data into BigQuery table from blob in GCS bucket
grid.truncate() # truncate table in BigQuery
grid.drop() # drop table in BigQuery
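
The `mode` and `format` values passed to `grid.load` carry the same names as BigQuery's write dispositions and source formats. The sketch below shows the load-job configuration these arguments would plausibly correspond to in the BigQuery REST API (`configuration.load` of a job resource). It is an assumption about the mapping, not the package's code. The function `build_load_configuration` and the project ID `my-project` are hypothetical.

```python
# Values accepted by the BigQuery REST API for a load job.
WRITE_DISPOSITIONS = {"WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"}
SOURCE_FORMATS = {
    "CSV", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET", "ORC", "DATASTORE_BACKUP",
}


def build_load_configuration(blob_uri, project, dataset, table,
                             mode="WRITE_TRUNCATE", source_format="CSV"):
    """Build the `configuration` part of a BigQuery load job as a plain dict."""
    if not blob_uri.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage URI: {blob_uri}")
    if mode not in WRITE_DISPOSITIONS:
        raise ValueError(f"unknown write disposition: {mode}")
    if source_format not in SOURCE_FORMATS:
        raise ValueError(f"unknown source format: {source_format}")
    return {
        "load": {
            "sourceUris": [blob_uri],
            "destinationTable": {
                "projectId": project,
                "datasetId": dataset,
                "tableId": table,
            },
            "writeDisposition": mode,
            "sourceFormat": source_format,
        }
    }


config = build_load_configuration(
    "gs://bucket_name/blob_name", "my-project", "dataset_name", "users"
)
print(config["load"]["writeDisposition"], config["load"]["sourceFormat"])
```

With `WRITE_TRUNCATE` the load replaces the existing rows, `WRITE_APPEND` adds to them, and `WRITE_EMPTY` succeeds only if the table is empty.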

The Python script above creates the BigQuery table shown in the following screenshot:

BigQuery Table

Local development

You are more than welcome to contribute to this project. To make getting started easier, we have prepared a Docker image with all the necessary tools; you can use it as an interpreter for PyCharm or to run the tests.

Build Docker image

docker build `
     --tag surquest/gcp/bq/grid `
     --file package.base.dockerfile `
     --target test .

Run tests

docker run --rm -it `
 -v "${pwd}:/opt/project" `
 -e "GOOGLE_APPLICATION_CREDENTIALS=/opt/project/credentials/TEST/key.file.json" `
 -w "/opt/project/test" `
 surquest/gcp/bq/grid pytest
