Skip to main content

Analytics Query Analyzer

Project description

analytics-query-analyzer

Analyze analytics SQL to extract referenced tables/columns and time bounds.

Install

Base install:

pip install analytics-query-analyzer

BigQuery extras (only needed for build_schema):

pip install analytics-query-analyzer[bigquery]

Support

BigQuery is currently supported and tested.

Schema format

Schemas follow sqlglot conventions, with nested fields represented as nested dicts.

Usage

analyze

Extract table/column references from a query.

from analytics_query_analyzer import analyze
from sqlglot import dialects

schema = {
    "production": {
        "shop": {
            "orders": {
                "id": "int64",
                "ordered_at": "datetime",
                "user_id": "int64",
                "payment": {
                    "amount": "int64",
                    "method": "string",
                },
            },
        },
    },
}

sql = """
select
    id,
    user_id,
    payment.amount
from
    shop.orders
"""

references = analyze(dialects.BigQuery, sql, schema, "production")
print(references)
# {"production.shop.orders": {"id", "user_id", "payment.amount"}}

analyze_timespan

Extract time bounds from filters.

from analytics_query_analyzer import analyze_timespan
from sqlglot import dialects

schema = {
    "production": {
        "shop": {
            "orders": {
                "id": "int64",
                "ordered_at": "datetime",
                "user_id": "int64",
            },
        },
    },
}

sql = """
select
    *
from
    shop.orders
where
    ordered_at >= "2025-01-01"
    and ordered_at < "2026-01-01"
"""

timespans = analyze_timespan(dialects.BigQuery, sql, schema, "production")
print(timespans)
# {
#   "production.shop.orders.ordered_at": {
#     "lower": "2025-01-01",
#     "upper": "2026-01-01"
#   }
# }

To make current_date() deterministic, pass a provider:

timespans = analyze_timespan(
    dialects.BigQuery,
    "select * from shop.orders where ordered_at >= current_date()",
    schema,
    "production",
    current_date_provider=lambda: "2026-01-01",
)

build_schema

Fetch a schema dictionary from BigQuery.

from analytics_query_analyzer import build_schema
from sqlglot import dialects

schema = build_schema(dialects.BigQuery, "my_project", "my_schema", "my_table")
print(schema)
  • Authentication uses Application Default Credentials (ADC).
  • Only BigQuery is supported for build_schema.
  • When table is omitted, it scans all tables in the dataset.
  • When both dataset and table are omitted, it scans all datasets in the project.
  • The returned schema can be passed directly to analyze and analyze_timespan.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

analytics_query_analyzer-0.2.0.tar.gz (14.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

analytics_query_analyzer-0.2.0-py3-none-any.whl (10.7 kB view details)

Uploaded Python 3

File details

Details for the file analytics_query_analyzer-0.2.0.tar.gz.

File metadata

  • Download URL: analytics_query_analyzer-0.2.0.tar.gz
  • Upload date:
  • Size: 14.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for analytics_query_analyzer-0.2.0.tar.gz
Algorithm Hash digest
SHA256 273e0d356014e7db472995aca79b199f733d736f20ddfdb2c51837b17bbeb719
MD5 d9df68d14d03de4c48f7a5a905985ad7
BLAKE2b-256 392383941b75c6b754a3963fdfbf2a134b33ce91680cf860f7fe1c473e0b2969

See more details on using hashes here.

Provenance

The following attestation bundles were made for analytics_query_analyzer-0.2.0.tar.gz:

Publisher: publish-pypi.yml on logicoffee/aqa

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file analytics_query_analyzer-0.2.0-py3-none-any.whl.

File metadata

File hashes

Hashes for analytics_query_analyzer-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 75da89b38e747747ffe630af63095593c475681f1b2b16c233f45d50c95ba06b
MD5 27aee255667d3e0bbd1ca75e63f8582c
BLAKE2b-256 54facfc715f3e9a5eaf40e660f70ac02cc318713ae0b22de63090b60ad76f89a

See more details on using hashes here.

Provenance

The following attestation bundles were made for analytics_query_analyzer-0.2.0-py3-none-any.whl:

Publisher: publish-pypi.yml on logicoffee/aqa

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page