Effortlessly validate and test your Google BigQuery queries with the power of pandas DataFrames in Python.
BQuest
We would like to thank Mike Czech, the original inventor of BQuest!
Warning
This library is a work in progress!
Breaking changes should be expected until a 1.0 release, so version pinning is recommended.
Overview
Use BQuest in combination with your favorite testing framework (e.g. pytest).
Create temporary test tables from JSON or pandas DataFrame.
Run BQ configurations and plain SQL queries on your test tables and check the result.
Installation
Via PyPI (standard):
pip install bquest
Via GitHub (most recent):
pip install git+https://github.com/ottogroup/bquest
BQuest also requires a dedicated BigQuery dataset for storing test tables, e.g.
resource "google_bigquery_dataset" "bquest" {
  dataset_id                  = "bquest"
  friendly_name               = "bquest"
  description                 = "Source tables for bquest tests"
  location                    = "EU"
  default_table_expiration_ms = 3600000
}
We recommend setting an expiration time for tables in the bquest dataset to ensure that test tables are removed after test execution.
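The value `3600000` in the dataset definition above is one hour expressed in milliseconds. As a small sketch of how other expiration windows could be derived (the helper name `expiration_ms` is hypothetical and not part of bquest):

```python
from datetime import timedelta


def expiration_ms(window: timedelta) -> int:
    """Convert a time window to whole milliseconds (illustrative helper)."""
    return int(window.total_seconds() * 1000)


# One hour, matching default_table_expiration_ms in the dataset definition.
one_hour = expiration_ms(timedelta(hours=1))
print(one_hour)  # 3600000
```

A shorter window keeps the dataset cleaner, but it has to outlast the longest test run that reads from the temporary tables.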
Example
Given a pandas DataFrame
| foo | weight | prediction_date |
|---|---|---|
| bar | 23 | 20190301 |
| my | 42 | 20190301 |
and its table definition
from bquest.tables import BQTableDefinitionBuilder
table_def_builder = BQTableDefinitionBuilder(GOOGLE_PROJECT_ID, dataset="bquest", location="EU")
table_definition = table_def_builder.from_df("abc.feed_latest", df)
you can use the config file ./abc/config.py
{
    "query": """
        SELECT
            foo,
            PARSE_DATE('%Y%m%d', prediction_date)
        FROM
            `{source_table}`
        WHERE
            weight > {THRESHOLD}
    """,
    "start_date": "prediction_date",
    "end_date": "prediction_date",
    "source_tables": {"source_table": "abc.feed_latest"},
    "feature_table_name": "abc.myid",
}
and the runner
from bquest.runner import BQConfigFileRunner, BQConfigRunner

runner = BQConfigFileRunner(
    BQConfigRunner(bq_client, bq_executor_func),
    "config/bq_config",
)

result_df = runner.run_config(
    "20190301",
    "20190308",
    [table_definition],
    "abc/config.py",
    templating_vars={"THRESHOLD": "30"},
)
to assert the result table
assert result_df.shape == (1, 2)
assert result_df.iloc[0]["foo"] == "my"
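The example above does not show how `df` is built or why the assertions hold. The following is a minimal sketch that constructs the DataFrame from the table and emulates the query locally in pandas, without touching BigQuery. It assumes `prediction_date` is stored as a string (so that `PARSE_DATE` would accept it); the name `expected_df` is illustrative and not part of bquest.

```python
import pandas as pd

# Source rows from the example table. Storing prediction_date as a string
# is an assumption made so that PARSE_DATE('%Y%m%d', ...) would accept it.
df = pd.DataFrame(
    {
        "foo": ["bar", "my"],
        "weight": [23, 42],
        "prediction_date": ["20190301", "20190301"],
    }
)

THRESHOLD = 30  # same value as templating_vars={"THRESHOLD": "30"}

# Local stand-in for the SQL: WHERE weight > THRESHOLD, then SELECT two columns.
expected_df = df.loc[df["weight"] > THRESHOLD, ["foo", "prediction_date"]].copy()
expected_df["prediction_date"] = pd.to_datetime(
    expected_df["prediction_date"], format="%Y%m%d"
).dt.date
expected_df = expected_df.reset_index(drop=True)

# Only the row with weight 42 passes the filter, leaving one row and two columns.
assert expected_df.shape == (1, 2)
assert expected_df.iloc[0]["foo"] == "my"
```

Computing the expected result independently like this gives a reference to compare against the `result_df` returned by the runner.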
Testing
For the actual testing, bquest relies on an accessible BigQuery project, which can be configured with the gcloud client. The corresponding GOOGLE_PROJECT_ID is extracted from this project and used with pandas-gbq to write temporary tables to the bquest dataset, which has to be pre-configured on that project before testing.
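How the project ID is looked up is not spelled out here. One possible approach, sketched under assumptions, is to read it from an environment variable and fall back to Application Default Credentials via `google.auth.default()`. The helper `resolve_project_id` and the choice of the `GOOGLE_CLOUD_PROJECT` variable are illustrative and not necessarily how bquest does it.

```python
import os
from typing import Mapping, Optional


def resolve_project_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Google Cloud project ID to test against.

    Hypothetical helper, not part of bquest.
    """
    env = os.environ if env is None else env

    # Prefer an explicitly configured project (assumed variable name).
    project_id = env.get("GOOGLE_CLOUD_PROJECT")
    if project_id:
        return project_id

    # Otherwise fall back to Application Default Credentials.
    import google.auth  # requires the google-auth package

    _, project_id = google.auth.default()
    if not project_id:
        raise RuntimeError("No default Google Cloud project is configured.")
    return project_id


# Example with an explicit mapping, so no credentials are needed here.
print(resolve_project_id({"GOOGLE_CLOUD_PROJECT": "my-test-project"}))
```

Passing the environment as a mapping keeps the helper easy to exercise in unit tests without real credentials.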
For GitHub CI, we have configured an identity provider in our testing project that allows only core members of this repository to access the testing project's resources.
Important Links
Full documentation: https://ottogroup.github.io/bquest/