Skip to main content

Wrapper around Google Cloud Platform BigQuery client to simplify management of the BigQuery tables.

Project description

GitHub GitHub Workflow Status (with branch) Coverage PyPI - Downloads PyPI

Introduction

This package is a wrapper around the Google Cloud BigQuery API to simplify management of the BigQuery tables. The specification of the BigQuery tables is realized with the help of YAML file and the package provides the functionality to:

  • instantiate BigQuery table based on YAML specification
  • create BigQuery based on created instance
  • drop BigQuery table
  • truncate BigQuery table
  • load data into BigQuery table from blob in GCS bucket

Installation

pip install surquest-GCP-bq-grid

Getting started

Let's assume that we have a YAML file that specifies BigQuery table as follows:

name: users
desc: Table with all users
labels:
  company: surquest
  application: data-services
clustering_fields:
  - department
time_partitioning:
  field: created_at
  type: DAY
schema:
  - name: id
    desc: ID of the user
    mode: required
    type: INTEGER
  - name: name
    desc: First name and last name of the user
    mode: required
  - name: department
    desc: Description of the user
  - name: height
    desc: Height of the user in centimeters
    type: FLOAT
  - name: roles
    desc: List of roles of the user
    type: STRUCT
    mode: repeated
    fields:
      - name: role
        desc: Role of the user
        mode: required
      - name: description
        desc: Description of the role
  - name: last_login_at
    desc: Date and time when the user last logged in
    type: TIMESTAMP
    mode: REQUIRED
  - name: created_at
    desc: Date and time when the user was created
    type: TIMESTAMP
    mode: NULLABLE
    defaultValueExpression: CURRENT_TIMESTAMP()
  - name: created_by
    desc: User who created the user
    type: STRING
    mode: NULLABLE
    defaultValueExpression: SESSION_USER()
  - name: is_active
    desc: Indicates if the user record is active
    type: BOOLEAN
    mode: NULLABLE
    defaultValueExpression: true

Please note:

  • the default type of the column is string
  • the default mode is nullable
  • the desc is optional and can be omitted

More details about field specification can be fond here: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TableFieldSchema.

Table specified by the above YAML file can be created with the following code:

from surquest.GCP.bq.grid import Grid

# create instance of the Grid
grid = Grid.from_yaml(
    path="path/to/the/yaml/file",
    dataset="dataset_name"
)
grid.exit() # check if table exists
grid.create() # create table in BigQuery
grid.load(
    blob_uri="gs://bucket_name/blob_name",
    mode="WRITE_TRUNCATE",
    format="CSV"
) # load data into BigQuery table from blob in GCS bucket
grid.truncate() # truncate table in BigQuery
grid.drop() # drop table in BigQuery

Following python script create BigQuery table as shown in following screenshot:

BigQuery Table

Local development

You are more than welcome to contribute to this project. To make your start easier we have prepared a docker image with all the necessary tools to run it as interpreter for Pycharm or to run tests.

Build docker image

docker build `
     --tag surquest/gcp/bq/grid `
     --file package.base.dockerfile `
     --target test .

Run tests

docker run --rm -it `
 -v "${pwd}:/opt/project" `
 -e "GOOGLE_APPLICATION_CREDENTIALS=/opt/project/credentials/TEST/key.file.json" `
 -w "/opt/project/test" `
 surquest/gcp/bq/grid pytest

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

surquest_gcp_bq_grid-0.0.1rc6.tar.gz (81.4 kB view details)

Uploaded Source

Built Distribution

surquest_gcp_bq_grid-0.0.1rc6-py2.py3-none-any.whl (6.9 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file surquest_gcp_bq_grid-0.0.1rc6.tar.gz.

File metadata

File hashes

Hashes for surquest_gcp_bq_grid-0.0.1rc6.tar.gz
Algorithm Hash digest
SHA256 9c2e0b1757cf2438c855a1695e26b1114afaa3bbcea2bef9c72b04d84a92eff3
MD5 821fe016b4e0e773daaf73f30e21b189
BLAKE2b-256 f833ddf8b54713bc533cdc249521936abfdcd719fdf8678e542f33891b97eb03

See more details on using hashes here.

File details

Details for the file surquest_gcp_bq_grid-0.0.1rc6-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for surquest_gcp_bq_grid-0.0.1rc6-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 fb7b7376ba9ddee37f981742cb6de63c813cf56452d6c24fc5e39290669857f4
MD5 02ffc1d8be991cb4c5461ec3e178dc0b
BLAKE2b-256 4a9f5fc3b270aa460b23daddf8acf64a95b57c87285943a1ce845e22e5687bd9

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page