Easily integrate data in BigQuery
PyGBQ
Example
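from pygbq import Client
import requests

# Create a client and read an API token from Secret Manager
client = Client()
token = client.get_secret('secret_name')
headers = {'Authorization': f'Bearer {token}'}
url = ...
# Fetch the data to load
data = requests.get(url, headers=headers).json()
# Merge (upsert) the rows into the table, matching on the id column
response = client.update(data=data, table_id='mydataset.mytable', how=['id'])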
This snippet gets some data from a URL and [merges](https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax#merge_statement) ([upserts](https://en.wikipedia.org/wiki/Merge_(SQL))) it on the id column into the table mytable in the dataset mydataset.
Install and set up
pip install pygbq
Set up the authentication.
How it works
how=['column1', 'column2', ...]
PyGBQ generates one or more temporary tables that are merged into the target table. During the merge, all the columns of the target table are updated. Here's what it looks like:
- Split data into batches.
- For every batch, create mydataset.mytable_tmp_SOMERANDOMPOSTFIX, load the batch into it, and run
MERGE myproject.mydataset.mytable T
USING myproject.mydataset.mytable_tmp_SOMERANDOMPOSTFIX S
ON T.column1 = S.column1 AND T.column2 = S.column2
WHEN NOT MATCHED THEN
INSERT ROW
WHEN MATCHED THEN
UPDATE SET
column1 = S.column1,
column2 = S.column2,
column3 = S.column3,
column4 = S.column4
...
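To make the shape of that statement concrete, here is a minimal sketch that assembles a MERGE like the one above from a list of key columns and a list of all columns. The function name `build_merge_sql` and its parameters are hypothetical and are not part of PyGBQ; the library's internal query builder may well differ.

```python
from typing import List


def build_merge_sql(target: str, source: str, keys: List[str], columns: List[str]) -> str:
    """Build a MERGE statement shaped like the one shown above.

    Hypothetical helper for illustration only, not PyGBQ's own code.
    """
    if not keys:
        raise ValueError("at least one key column is required")
    # Rows match when every key column is equal in target (T) and source (S).
    on_clause = " AND ".join(f"T.{k} = S.{k}" for k in keys)
    # Every listed column of the target is overwritten from the source row.
    set_clause = ",\n".join(f"{c} = S.{c}" for c in columns)
    return (
        f"MERGE {target} T\n"
        f"USING {source} S\n"
        f"ON {on_clause}\n"
        "WHEN NOT MATCHED THEN\n"
        "INSERT ROW\n"
        "WHEN MATCHED THEN\n"
        "UPDATE SET\n"
        f"{set_clause}"
    )


sql = build_merge_sql(
    target="myproject.mydataset.mytable",
    source="myproject.mydataset.mytable_tmp_SOMERANDOMPOSTFIX",
    keys=["column1", "column2"],
    columns=["column1", "column2", "column3", "column4"],
)
print(sql)
```

The key columns passed in `how` end up in the ON clause, which is why they should uniquely identify a row in the target table.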
how='replace'
- Creates a table mydataset.mytable with a schema automatically generated by bigquery-schema-generator.
- Splits data into batches and inserts them into mydataset.mytable.
how='fail'
Identical to how='replace' except that it fails if mydataset.mytable exists.
how='insert'
Splits data into batches and inserts (appends) it to mydataset.mytable.
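Every mode above starts by splitting the data into batches. A minimal sketch of that step might look like the following; the helper name `split_into_batches` is hypothetical, and the default of 4000 rows simply mirrors the documented `max_insert_num_rows` default rather than PyGBQ's actual implementation.

```python
from typing import Iterator, List


def split_into_batches(rows: List[dict], max_rows: int = 4000) -> Iterator[List[dict]]:
    """Yield consecutive slices of rows, each at most max_rows long.

    Hypothetical helper for illustration only, not PyGBQ's own code.
    """
    if max_rows < 1:
        raise ValueError("max_rows must be positive")
    for start in range(0, len(rows), max_rows):
        yield rows[start:start + max_rows]


rows = [{"id": i} for i in range(10)]
batches = list(split_into_batches(rows, max_rows=4))
print([len(b) for b in batches])  # → [4, 4, 2]
```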
For more details, see the Documentation section.
Documentation
Here's the documentation with default parameters.
Client
init
from pygbq import Client
client = Client(default_dataset=None, path_to_key=None)
Initializes a client. You can specify:
- default_dataset - (str) default dataset that the client will use to reference tables
- path_to_key - (str) By default PyGBQ uses from google.auth import default to get credentials, but you can specify this parameter if you wish to use from google.auth import load_credentials_from_file instead.
update
client.update(data, table_id, how, schema: Union[str, List[dict]] = None, expiration=1, max_insert_num_rows=4000)
Updates table.
- data - list of dict
- table_id - (str) Table id, which can take one of the following forms:
  - table_name if default_dataset is set
  - dataset_name.table_name
  - project_id.dataset_name.table_name
- how - (str or List[str]) See the How it works section
- expiration - (float) expiration time of the temporary tables, in hours
- max_insert_num_rows - (int) maximum number of rows inserted per temporary table
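The three accepted forms of table_id can be illustrated with a small resolver. This is only a sketch under assumptions: `resolve_table_id` and its `default_project` argument are hypothetical names, and PyGBQ may resolve ids differently internally.

```python
from typing import Optional, Tuple


def resolve_table_id(
    table_id: str,
    default_project: str,
    default_dataset: Optional[str] = None,
) -> Tuple[str, str, str]:
    """Return (project, dataset, table) for any of the three table_id forms.

    Hypothetical helper for illustration only, not PyGBQ's own code.
    """
    parts = table_id.split(".")
    if len(parts) == 3:
        # project_id.dataset_name.table_name
        project, dataset, table = parts
    elif len(parts) == 2:
        # dataset_name.table_name
        project = default_project
        dataset, table = parts
    elif len(parts) == 1:
        # table_name, valid only when a default dataset is configured
        if default_dataset is None:
            raise ValueError("a bare table name requires default_dataset")
        project, dataset, table = default_project, default_dataset, parts[0]
    else:
        raise ValueError(f"unexpected table id: {table_id!r}")
    return project, dataset, table


print(resolve_table_id("mytable", "myproject", default_dataset="mydataset"))
print(resolve_table_id("mydataset.mytable", "myproject"))
print(resolve_table_id("otherproject.mydataset.mytable", "myproject"))
```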
query
client.query(query)
Execute a query in BigQuery.
- query - (str) BigQuery query
get_secret
client.get_secret(secret_id, version="latest")
Get a secret stored in Secret Manager.
- secret_id - (str) Secret name
- version - Secret version
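Secret Manager addresses a secret version by a resource name of the form projects/PROJECT/secrets/SECRET/versions/VERSION, where "latest" is an accepted alias for the newest version. The sketch below only builds that name from the two parameters above; the function `secret_version_name` and the `project_id` argument are hypothetical, and how PyGBQ determines the project is not stated here.

```python
def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the Secret Manager resource name for one version of a secret.

    Hypothetical helper for illustration only, not PyGBQ's own code.
    """
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


name = secret_version_name("myproject", "secret_name")
print(name)  # → projects/myproject/secrets/secret_name/versions/latest
```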
add_secret
client.add_secret(secret_id, data)
Adds a new secret version in Secret Manager.
- secret_id - (str) Secret name
- data - (str) Secret value
read_jsonl
from pygbq import read_jsonl
read_jsonl(name: str = "data.jsonl")
Reads a newline-delimited JSON file.
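For readers unfamiliar with the format, a newline-delimited JSON file holds one JSON object per line. The following is a minimal sketch of such a reader using only the standard library. The name `read_jsonl_sketch` is hypothetical, and the behavior shown (skipping blank lines, returning a list of dicts) is an assumption rather than a description of PyGBQ's read_jsonl.

```python
import json
import os
import tempfile
from typing import List


def read_jsonl_sketch(name: str = "data.jsonl") -> List[dict]:
    """Parse a newline-delimited JSON file into a list of dicts.

    Hypothetical stand-in for illustration only, not PyGBQ's read_jsonl.
    """
    rows = []
    with open(name, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                rows.append(json.loads(line))
    return rows


# Write a small sample file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n')
    rows = read_jsonl_sketch(path)

print(rows)
```

A list of dicts like this is the shape that the data parameter of client.update expects.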