BigQuery Views Manager
Project description
BigQuery Views Manager
Utility project to maintain BigQuery views. The main interface is the CLI.
Main features:
- Synchronize BigQuery Views between GCP and local file system
- Materialize Views (by running a view and saving it to a table)
Pre-requisites
- Python 3
- Google Cloud SDK for gcloud
Install
pip install bigquery-views-manager
Configuration
Views SQL files
SQL code of the view queries is assumed to be in files with the .sql
files. By default they will be in the views
directory.
The files can contain placeholders (surrounded by curly brackets, e.g. {placeholder}
) for the following variables:
name | description |
---|---|
project | The GCP project |
dataset | The BigQuery dataset |
Using the placeholders allows you to deploy the views to for example test, staging and production separately.
Example:
SELECT *
FROM `{project}.{dataset}.view1`
views.lst
The views.lst
file contains the list of views that should be processed. It is important that the list of views are in the correct insert order. i.e. if v_view2
depends on v_view1
then v_view1
should appear first.
The format is: view_name[,optional_dataset_name]
Where the view_name
is the name of the view to be created and the optional_dataset_name
is the name of the dataset inside which the view should be created. If the optional_dataset_name
is not given, then, the default dataset specified in the --dataset
argument of the command will be used.
Example:
v_view1
v_view2
Or:
v_view1
v_view2,target_dataset_1
materialized-views.lst
The materialized-views.lst
file contains the list of views that should be materialized. By default it will materialize the view to a table with the m
prefix, in the same dataset as the view.
The format is view_name[,optional_dataset_name.optional_materialized_view_name]
Where the view_name
is the name of the view to be created and the optional_dataset_name
is the name of the dataset inside which the view should be created. if the optional_dataset_name
is not given, then, the default dataset specified in the --dataset
CLI argument of the command will be used. The optional_materialized_view_name
is the name of the materialized view to be created if specified, otherwise 'm' + view_name
.
Adding the --disable-view-name-mapping
CLI argument will disable the use mapping of views to dataset and materialized view specified in the views/views.lst
and views/materialized-views.lst
, and will used the default.
Example (only materialize v_view1
to mv_view1
):
v_view1
Or (only materialize v_view1
to target_table1
within the target_dataset_1
dataset):
view1,target_dataset_1.target_table1
Config Tables
Config tables are tables loaded from CSV. They are meant to assist views with configuration data, rather than loading large data. Config tables are generally used by views to avoid having to hard-code certain values in the views.
It is assumed that the filename is target table name with the .csv
file extension. By default in the ./config-tables/tables
directory. A BigQuery table schema can be specified via a file with the _schema.json
in the ./config-tables/schema
directory.
Another directory can be specified via the --config-tables-base-dir
CLI argument.
Example Data
See example-data.
BigQuery Views Manager CLI
To get the command help:
python -m bigquery_views_manager --help
Or:
python -m bigquery_views_manager <sub-command> --help
Create or Replace Views
python -m bigquery_views_manager \
create-or-replace-views \
--dataset=my_dataset \
[--view-list-file=/path/to/views.lst] \
[<view name> [<other view name> ...]]
Adding the --materialize
flag will additionally materialize the views that are in the views/materialized-views.lst
file. In that case views will be materialized immediately after updating a view.
Materialize Views
python -m bigquery_views_manager \
materialize-views \
--dataset=my_dataset \
[--view-list-file=/path/to/views.lst] \
[--materialized-view-list-file=/path/to/materialized-views.lst] \
[<view name> [<other view name> ...]]
Diff Views
Show differences between local views and views within BigQuery.
python -m bigquery_views_manager \
diff-views \
--dataset=my_dataset \
[--view-list-file=/path/to/views.lst] \
[<view name> [<other view name> ...]]
Get Views
Copy views from BigQuery to the local file system.
To get all of the files listed in views/views.lst
:
python -m bigquery_views_manager \
get-views \
--dataset=my_dataset \
[--view-list-file=/path/to/views.lst]
To get a particular view or views:
python -m bigquery_views_manager \
get-views \
--dataset=my_dataset \
[--view-list-file=/path/to/views.lst] \
<view name> [<other view name> ...]
When views are retrieved, the project name and dataset are replaced with placeholders.
Update Config Tables
Copy config tables (CSV) to BigQuery. The config tables are by default stored in ./config-tables
.
python -m bigquery_views_manager \
create-or-replace-config-tables \
--dataset=my_dataset \
[--config-tables-base-dir=/path/to/config-tables] \
[<table name> ...]
Adding a View
Add the view to the views
directory with the view name and .sql
file extension.
The view name also needs to be added to views/views.lst
in the correct order (i.e. if a view depends on another view, the other view should appear first).
Cleanup Sub Commands
The CLI also supports additional sub commands to delete views etc. Those are in particular use-ful in a CI environment.
The following commands are supported:
delete-config-tables
delete-views
delete-materialized-tables
Similar Projects
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Hashes for bigquery_views_manager-0.0.2-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 52f2eb4b2239d43cb5fb367eaa78b64ef0386a25145afcbb4474dc38e81e9d5b |
|
MD5 | d4ad28f8f477bad818105a35bae39b92 |
|
BLAKE2b-256 | 9f477f389267d6a844163b76ed2816fd3b810220fa93f27079b53d43bf1b64ef |