Skip to main content

Framework for testing SQL queries

Project description

sql-test-kit

This is a framework for testing SQL queries. It works by directly running the queries against the targeted engine, thus being robust to any change in the corresponding SQL dialect. Moreover, it is currently focused on interpolating test data directly inside the SQL queries, making the test much quicker than if it was creating actual tables.

Installation

This package is available on Pypi, so you can use your favorite dependency managment tool to install it. For example :

  • with pip :
pip install sql-test-kit
  • with poetry :
poetry add sql-test-kit

Usage example

sql-test-kit is currently available for most SQL engines (Postgres, Redshift, Snowflakes...), and is particularly useful for those where no framework exists for locally testing SQL queries.

Nevertheless, a specific implementation has been written for BigQuery, in order to facilitate Table object initialization, as well as null values interpolation in tests.

Here is a simple example of instantiating a Table object and a SQL query for BigQuery :

from sql_test_kit import BigqueryTable, Column


sales_amount_col = "SALES_AMOUNT"
sales_date_col = "SALES_DATE"
sales_table = BigqueryTable(
    project="project",
    dataset="dataset",
    table="table",
    columns=[
        Column(sales_amount_col, "FLOAT64"),
        Column(sales_date_col, "STRING"),
    ],
)
current_year_sales_by_day_query = f"""
    SELECT {sales_date_col}, SUM({sales_amount_col}) AS {sales_amount_col}
    FROM {sales_table}
    WHERE {sales_date_col} >= "2023-01-01"
    GROUP BY {sales_date_col}
    """

You can then test it this way :

import pandas as pd
from google.cloud.bigquery import Client

from sql_test_kit import QueryInterpolator


def test_current_year_sales_by_day_query():
    # Given
    sales_data = pd.DataFrame(
        {
            "SALES_ID": [1, 2, 3, 4],
            sales_date_col: ["2022-12-31", "2023-01-01", "2023-01-01", "2023-01-02"],
            sales_amount_col: [10, 20, 30, 40],
        }
    )

    # When
    interpolated_query = QueryInterpolator() \
        .add_input_table(sales_table, sales_data) \
        .interpolate_query(current_year_sales_by_day_query)
    current_year_sales_by_day_data = Client().query(interpolated_query).to_dataframe()

    # Then
    expected_current_year_sales_by_day_data = pd.DataFrame(
        {
            sales_date_col: ["2023-01-01", "2023-01-02"],
            sales_amount_col: [50, 40],
        }
    )

    pd.testing.assert_frame_equal(
        current_year_sales_by_day_data,
        expected_current_year_sales_by_day_data,
        check_dtype=False,
    )

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sql_test_kit-0.4.0.tar.gz (8.4 kB view hashes)

Uploaded source

Built Distribution

sql_test_kit-0.4.0-py3-none-any.whl (9.7 kB view hashes)

Uploaded py3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page