A wrapper around the Google Cloud Platform BigQuery client that simplifies the management of BigQuery tables.
Introduction
This package is a wrapper around the Google Cloud BigQuery API that simplifies the management of BigQuery tables. Tables are specified in a YAML file, and the package provides functionality to:
- instantiate a BigQuery table based on a YAML specification
- create a BigQuery table based on the created instance
- drop a BigQuery table
- truncate a BigQuery table
- load data into a BigQuery table from a blob in a GCS bucket
Installation
pip install surquest-GCP-bq-grid
Getting started
Let's assume we have a YAML file that specifies a BigQuery table as follows:
name: users
desc: Table with all users
labels:
  company: surquest
  application: data-services
clustering_fields:
  - department
time_partitioning:
  field: created_at
  type: DAY
schema:
  - name: id
    desc: ID of the user
    mode: required
    type: INTEGER
  - name: name
    desc: First name and last name of the user
    mode: required
  - name: department
    desc: Description of the user
  - name: height
    desc: Height of the user in centimeters
    type: FLOAT
  - name: roles
    desc: List of roles of the user
    type: STRUCT
    mode: repeated
    fields:
      - name: role
        desc: Role of the user
        mode: required
      - name: description
        desc: Description of the role
  - name: last_login_at
    desc: Date and time when the user last logged in
    type: TIMESTAMP
    mode: REQUIRED
  - name: created_at
    desc: Date and time when the user was created
    type: TIMESTAMP
    mode: NULLABLE
    defaultValueExpression: CURRENT_TIMESTAMP()
  - name: created_by
    desc: User who created the user
    type: STRING
    mode: NULLABLE
    defaultValueExpression: SESSION_USER()
  - name: is_active
    desc: Indicates if the user record is active
    type: BOOLEAN
    mode: NULLABLE
    defaultValueExpression: true
Please note:
- the default `type` of the column is `string`
- the default `mode` is `nullable`
- the `desc` is optional and can be omitted
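The defaults listed above can be illustrated with a small sketch. The helper `apply_field_defaults` is hypothetical and not part of the package. It only assumes the field keys used in the YAML example (`type`, `mode`, `fields`) and that values are case-insensitive, as the mixed `required`/`REQUIRED` spelling in the example suggests.

```python
from copy import deepcopy

DEFAULT_TYPE = "STRING"
DEFAULT_MODE = "NULLABLE"


def apply_field_defaults(field):
    """Return a copy of a field spec with the documented defaults filled in.

    Hypothetical helper for illustration; not the package's own code.
    """
    result = deepcopy(field)
    # A missing type falls back to STRING, a missing mode to NULLABLE.
    result["type"] = str(result.get("type", DEFAULT_TYPE)).upper()
    result["mode"] = str(result.get("mode", DEFAULT_MODE)).upper()
    # Nested fields (for example inside a STRUCT) get the same treatment.
    if "fields" in result:
        result["fields"] = [apply_field_defaults(f) for f in result["fields"]]
    return result


# The `department` column from the example specifies neither type nor mode.
print(apply_field_defaults({"name": "department"}))
```

Under these assumptions, a column such as `department` ends up as a nullable string, while `roles` keeps its explicit `STRUCT` type and `repeated` mode.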
More details about the field specification can be found here: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TableFieldSchema.
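To make the relationship to `TableFieldSchema` concrete, here is a minimal sketch that maps one field from the YAML example to a dictionary shaped like that REST resource. The function `to_table_field_schema` is hypothetical, and the assumption that the YAML key `desc` corresponds to the REST key `description` is inferred from the example rather than confirmed by the package.

```python
def to_table_field_schema(field):
    """Map one YAML field spec to a TableFieldSchema-style dict.

    Hypothetical helper for illustration; not the package's own code.
    """
    schema = {"name": field["name"]}
    if "type" in field:
        schema["type"] = str(field["type"]).upper()
    if "mode" in field:
        schema["mode"] = str(field["mode"]).upper()
    if "desc" in field:
        # Assumption: `desc` in the YAML file maps to `description`.
        schema["description"] = field["desc"]
    if "defaultValueExpression" in field:
        schema["defaultValueExpression"] = field["defaultValueExpression"]
    if "fields" in field:
        # Nested columns of a STRUCT are converted recursively.
        schema["fields"] = [to_table_field_schema(f) for f in field["fields"]]
    return schema


roles_spec = {
    "name": "roles",
    "desc": "List of roles of the user",
    "type": "STRUCT",
    "mode": "repeated",
    "fields": [
        {"name": "role", "desc": "Role of the user", "mode": "required"},
        {"name": "description", "desc": "Description of the role"},
    ],
}
print(to_table_field_schema(roles_spec))
```

The sketch deliberately leaves out keys that are absent from the specification, so it does not decide how missing values are filled in.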
The table specified by the YAML file above can be created with the following code:
from surquest.GCP.bq.grid import Grid

# create instance of the Grid
grid = Grid.from_yaml(
    path="path/to/the/yaml/file",
    dataset="dataset_name"
)

grid.exit()  # check if table exists
grid.create()  # create table in BigQuery
grid.load(
    blob_uri="gs://bucket_name/blob_name",
    mode="WRITE_TRUNCATE",
    format="CSV"
)  # load data into BigQuery table from blob in GCS bucket
grid.truncate()  # truncate table in BigQuery
grid.drop()  # drop table in BigQuery
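The `mode` and `format` values passed to `grid.load` match the names BigQuery uses for a load job's write disposition and source format. The sketch below shows what such a load configuration could look like as a REST-style dictionary. The helper `build_load_configuration`, the `my-project` project ID, and the idea that the package forwards these values unchanged are all assumptions made for illustration.

```python
# Values BigQuery load jobs accept for `writeDisposition` and `sourceFormat`.
WRITE_DISPOSITIONS = {"WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"}
SOURCE_FORMATS = {
    "CSV", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET", "ORC", "DATASTORE_BACKUP",
}


def build_load_configuration(blob_uri, project, dataset, table, mode, source_format):
    """Build a load-job configuration dict.

    Hypothetical helper for illustration; not the package's own code.
    """
    if not blob_uri.startswith("gs://"):
        raise ValueError(f"Expected a gs:// URI, got: {blob_uri!r}")
    if mode not in WRITE_DISPOSITIONS:
        raise ValueError(f"mode must be one of {sorted(WRITE_DISPOSITIONS)}")
    if source_format not in SOURCE_FORMATS:
        raise ValueError(f"source_format must be one of {sorted(SOURCE_FORMATS)}")
    return {
        "sourceUris": [blob_uri],
        "destinationTable": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        "writeDisposition": mode,
        "sourceFormat": source_format,
    }


config = build_load_configuration(
    blob_uri="gs://bucket_name/blob_name",
    project="my-project",  # placeholder project ID, not from the original
    dataset="dataset_name",
    table="users",
    mode="WRITE_TRUNCATE",
    source_format="CSV",
)
print(config)
```

With `WRITE_TRUNCATE`, BigQuery replaces the existing table contents with the loaded data, whereas `WRITE_APPEND` adds rows and `WRITE_EMPTY` only succeeds if the table is empty.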
Running `grid.create()` with the YAML specification above creates the BigQuery table shown in the following screenshot:
Local development
You are more than welcome to contribute to this project. To make getting started easier, we have prepared a Docker image with all the necessary tools; you can use it as an interpreter for PyCharm or to run the tests.
Build docker image
docker build `
--tag surquest/gcp/bq/grid `
--file package.base.dockerfile `
--target test .
Run tests
docker run --rm -it `
-v "${pwd}:/opt/project" `
-e "GOOGLE_APPLICATION_CREDENTIALS=/opt/project/credentials/TEST/key.file.json" `
-w "/opt/project/test" `
surquest/gcp/bq/grid pytest