A Google BigQuery target for the singer.io framework.
Project description
target-bigquery
ANELEN's implementation of target-bigquery.
This is a "lab" stage project with limited documentatioin and support. For other open-source projects by Anelen, please see https://anelen.co/open-source.html
What it does
Load data into Google BigQuery tables.
This is a Singer target that consumes JSON-formatted data following the Singer spec.
This target:
- Reads Singer-formatted messages from standard input, typically piped from a Singer tap (see the example below).
- Writes the records to a Google BigQuery table, using either batch loads or streaming inserts (see the stream option below).
- Optionally loads into a partitioned table (see the partition_by option below).
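For illustration only, here is a minimal sketch of the kind of Singer message stream a tap might emit and this target would load; the stream name and fields are hypothetical, and the actual input depends entirely on the tap you pipe in:

{"type": "SCHEMA", "stream": "rates", "schema": {"properties": {"date": {"type": "string", "format": "date-time"}, "rate": {"type": "number"}}}, "key_properties": ["date"]}
{"type": "RECORD", "stream": "rates", "record": {"date": "2021-01-01T00:00:00Z", "rate": 1.1012}}
{"type": "STATE", "value": {"start_date": "2021-01-01"}}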
Installation
Step 0: Acknowledge LICENSE and TERMS
Please especially note that the author(s) of target-bigquery are not responsible for any costs (including but not limited to BigQuery costs) incurred by running this program.
Step 1: Activate the Google BigQuery API
(originally found in the Google API docs)
- Use this wizard to create or select a project in the Google Developers Console and activate the BigQuery API. Click Continue, then Go to credentials.
- On the Add credentials to your project page, click the Cancel button.
- At the top of the page, select the OAuth consent screen tab. Select an Email address, enter a Product name if not already set, and click the Save button.
- Select the Credentials tab, click the Create credentials button and select OAuth client ID.
- Select the application type Other, enter the name "Singer BigQuery Tap", and click the Create button.
- Click OK to dismiss the resulting dialog.
- Click the Download button to the right of the client ID.
- Move this file to your working directory and rename it client_secrets.json.
Export the location of the secret file:
export GOOGLE_APPLICATION_CREDENTIALS="./client_secrets.json"
For other authentication methods, please see the Authentication section.
Step 2: Install
First, make sure Python 3 is installed on your system or follow these installation instructions for Mac or Ubuntu.
This program has not yet been released via PyPI, so install the relatively stable version from GitHub:
pip install --no-cache-dir https://github.com/anelendata/target-bigquery/archive/71b51aa8128d7b50a8155f6d9974308cd1d4c2d4.tar.gz#egg=target-bigquery
Note: 71b51aa8128d7b50a8155f6d9974308cd1d4c2d4 in the URL is the commit hash.
Or you can install the latest development version:
pip install --no-cache-dir https://github.com/anelendata/target-bigquery/archive/master.tar.gz#egg=target-bigquery
Run
Step 1: Configure
Create a file called target_config.json in your working directory, following config.sample.json:
{
"project_id": "your-gcp-project-id",
"dataset_id": "your-bigquery-dataset",
"table_id": "your-table-name",
"stream": false,
}
Notes:
- stream: Set this to true to run streaming inserts to BigQuery. Note that batch loading performs better, so keep this option false unless you need streaming updates.
- Optionally, you can define "partition_by": "<some-timestamp-column-name>" to create a partitioned table (see the example below). Many production-quality taps implement an ingestion timestamp, and it is recommended to use that column here to partition the table. Partitioning increases query performance and lowers BigQuery costs.
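As a sketch, a config that uses partition_by might look like the following; the column name extracted_at is hypothetical, so substitute whatever timestamp column your tap actually emits:

{
  "project_id": "your-gcp-project-id",
  "dataset_id": "your-bigquery-dataset",
  "table_id": "your-table-name",
  "stream": false,
  "partition_by": "extracted_at"
}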
Step 2: Run
target-bigquery can be run with any Singer tap. As an example, let's use tap-exchangeratesapi.
pip install tap-exchangeratesapi
Run:
tap-exchangeratesapi | target-bigquery -c target_config.json
Authentication
It is recommended to use target-bigquery with a service account.
- Download the client_secrets.json file for your service account, and place it on the machine where target-bigquery will be executed.
- Set a GOOGLE_APPLICATION_CREDENTIALS environment variable on the machine, where the value is the fully qualified path to client_secrets.json.
In a testing environment, you can also manually authenticate before running the target. In this case you do not need GOOGLE_APPLICATION_CREDENTIALS defined:
gcloud auth application-default login
You may also have to set the project:
gcloud config set project <project-id>
Though not tested, it should also be possible to use the OAuth flow to authenticate to GCP as well:
- target-bigquery will attempt to open a new window or tab in your default browser. If this fails, copy the URL from the console and manually open it in your browser.
- If you are not already logged into your Google account, you will be prompted to log in.
- If you are logged into multiple Google accounts, you will be asked to select one account to use for the authorization.
- Click the Accept button to allow target-bigquery to access your Google BigQuery table.
- You can close the tab after the signup flow is complete.
Original repo
https://github.com/anelendata/target-bigquery
Copyright © 2020- Anelen Data
File details
Details for the file target-bigquery-partition-0.1.0.tar.gz.
File metadata
- Download URL: target-bigquery-partition-0.1.0.tar.gz
- Upload date:
- Size: 8.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.55.1 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8d32ab2c3d59c1538b0fce73ad7d9cd8a7a51a4b18a003dae6a947e15755b73c
MD5 | d41ff024e09efdda2a148ac604cc48a2
BLAKE2b-256 | 1230337e23e84138abbd87b192165cf4dc86edd16c644fde515584a4ce07302e
File details
Details for the file target_bigquery_partition-0.1.0-py3-none-any.whl.
File metadata
- Download URL: target_bigquery_partition-0.1.0-py3-none-any.whl
- Upload date:
- Size: 13.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.55.1 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4f393059cd6cf9c19ceda6b2333fdd893a8440cfee0846f4b9fb5917a6063395
MD5 | 1d1d4ffec31d958cf0c4e791d0945ea3
BLAKE2b-256 | 8b089471bb6a9a37c14196b99ad5020dc116eb88c3c62ed4058b35d354d1b663