BigQuery ML Utils
BigQuery ML (also known as BQML) lets you create and run machine learning models in BigQuery using standard SQL queries. The BigQuery ML Utils library is an integrated suite of machine learning tools for building and using BigQuery ML models.
Installation
Install this library in a virtualenv using pip. virtualenv is a tool to create isolated Python environments. The basic problem it addresses is one of dependencies and versions, and indirectly permissions.
With virtualenv, it's possible to install this library without needing system install permissions, and without clashing with the installed system dependencies.
Mac/Linux
pip install virtualenv
virtualenv <your-env>
source <your-env>/bin/activate
<your-env>/bin/pip install bigquery-ml-utils
Windows
pip install virtualenv
virtualenv <your-env>
<your-env>\Scripts\activate
<your-env>\Scripts\pip.exe install bigquery-ml-utils
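To confirm that the install landed in the active environment, a small check along these lines may help. It is a minimal sketch that uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is made up for this illustration and is not part of the library.

```python
from importlib import metadata


def installed_version(distribution="bigquery-ml-utils"):
    """Return the installed version string, or None if it is not installed.

    `installed_version` is a hypothetical helper for this README, not a
    function provided by bigquery-ml-utils.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string inside the virtualenv, or None outside it.
print(installed_version())
```

Running it with the virtualenv activated and again with it deactivated shows whether the package was installed into the isolated environment rather than the system one.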
Overview
Inference
Transform Predictor
The Transform Predictor feeds input data into a BQML model trained with a TRANSFORM clause. It preprocesses the input and postprocesses the output. The first argument is a SavedModel that represents the TRANSFORM clause for feature preprocessing. The second argument is a SavedModel or an XGBoost Booster that represents the model logic.
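The two-argument design described above can be pictured as a preprocessing stage chained into a model stage. The sketch below is purely conceptual: `TwoStagePredictor`, `scale_feature`, and `linear_model` are hypothetical names, plain Python callables stand in for the SavedModel and Booster objects, and none of it is the library's actual API.

```python
from typing import Any, Callable, Dict

Row = Dict[str, Any]


class TwoStagePredictor:
    """Conceptual stand-in: apply the transform first, then the model."""

    def __init__(self, transform: Callable[[Row], Row],
                 model: Callable[[Row], Any]):
        self._transform = transform  # plays the role of the TRANSFORM SavedModel
        self._model = model          # plays the role of the model logic

    def predict(self, row: Row) -> Any:
        features = self._transform(row)  # feature preprocessing
        return self._model(features)     # model logic on transformed features


def scale_feature(row: Row) -> Row:
    # Hypothetical preprocessing step: rescale a raw column.
    return {"x_scaled": row["x"] / 10}


def linear_model(features: Row) -> float:
    # Hypothetical model logic: a fixed linear function.
    return features["x_scaled"] * 2 + 1


predictor = TwoStagePredictor(scale_feature, linear_model)
print(predictor.predict({"x": 20}))
```

Keeping the transform as a separate argument means callers pass raw input, and the same preprocessing used at training time is applied at prediction time.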
XGBoost Predictor
The XGBoost Predictor feeds input data into a BQML XGBoost model. It preprocesses the input and postprocesses the output. The first argument is an XGBoost Booster that represents the model logic. The remaining arguments are model assets.
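As an illustration of why a predictor might need assets in addition to the Booster, the sketch below wraps a scoring function with an encoding step and a decoding step. Everything in it is an assumption made for the example: the asset names (`category_index`, `class_labels`), the helper `make_predictor`, and the stub `fake_booster_scores` that stands in for a real Booster. The source does not specify which assets the library actually expects.

```python
from typing import Callable, Dict, List, Sequence


def make_predictor(score_fn: Callable[[List[List[int]]], List[List[float]]],
                   category_index: Dict[str, int],
                   class_labels: List[str]):
    """Hypothetical wrapper: encode inputs, score them, decode outputs."""

    def predict(rows: Sequence[Sequence[str]]) -> List[str]:
        # Preprocessing: map categorical strings to integer codes (-1 if unseen).
        encoded = [[category_index.get(value, -1) for value in row]
                   for row in rows]
        # Model logic: one list of per-class scores for each row.
        scores = score_fn(encoded)
        # Postprocessing: turn the highest-scoring class index into its label.
        return [class_labels[max(range(len(s)), key=s.__getitem__)]
                for s in scores]

    return predict


def fake_booster_scores(encoded: List[List[int]]) -> List[List[float]]:
    # Stub standing in for a real Booster: class 0 wins when the code is 0.
    return [[1.0, 0.0] if row[0] == 0 else [0.0, 1.0] for row in encoded]


predict = make_predictor(fake_booster_scores,
                         category_index={"red": 0, "blue": 1},
                         class_labels=["no", "yes"])
print(predict([["red"], ["blue"]]))
```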
TensorFlow Ops
BQML TensorFlow Custom Ops provide SQL functions (date, datetime, time, and timestamp functions) that are not available in TensorFlow. Their implementation and behavior align with BigQuery. This is part of an effort to bridge the gap between the SQL community and the TensorFlow community.
The following example returns the same result as TIMESTAMP_ADD(timestamp_expression, INTERVAL int64_expression date_part):
>>> import tensorflow as tf
>>> # Import path assumed from the package layout; adjust it if your version differs.
>>> from bigquery_ml_utils.tensorflow_ops import timestamp_ops
>>> timestamp = tf.constant(['2008-12-25 15:30:00+00', '2023-11-11 14:30:00+00'], dtype=tf.string)
>>> interval = tf.constant([200, 300], dtype=tf.int64)
>>> result = timestamp_ops.timestamp_add(timestamp, interval, 'MINUTE')
>>> print(result)
tf.Tensor([b'2008-12-25 18:50:00.0 +0000' b'2023-11-11 19:30:00.0 +0000'], shape=(2,), dtype=string)
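To cross-check the arithmetic in the example without TensorFlow, the same addition can be reproduced with the standard library. This is a rough reference sketch, not the library's implementation: `timestamp_add_reference` is a hypothetical helper, it assumes input strings shaped like the ones above (date, time, and a UTC offset), and it covers only the fixed-length date parts that TIMESTAMP_ADD accepts.

```python
import re
from datetime import datetime, timedelta, timezone

# Date parts accepted by TIMESTAMP_ADD, mapped to timedelta keyword names.
_UNITS = {
    "MICROSECOND": "microseconds",
    "MILLISECOND": "milliseconds",
    "SECOND": "seconds",
    "MINUTE": "minutes",
    "HOUR": "hours",
    "DAY": "days",
}


def timestamp_add_reference(ts: str, amount: int, date_part: str) -> datetime:
    """Hypothetical pure-Python reference for a single TIMESTAMP_ADD call."""
    # strptime's %z needs hours and minutes, so expand "+00" to "+0000".
    if re.search(r"[+-]\d\d$", ts):
        ts += "00"
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S%z")
    delta = timedelta(**{_UNITS[date_part]: amount})
    # Normalize the result to UTC for easy comparison.
    return (parsed + delta).astimezone(timezone.utc)


print(timestamp_add_reference("2008-12-25 15:30:00+00", 200, "MINUTE"))
print(timestamp_add_reference("2023-11-11 14:30:00+00", 300, "MINUTE"))
```

The helper returns `datetime` objects rather than strings, so comparing against the op's output means comparing the date and time fields, not the exact string format.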