Integrate with DuckDB's JSON serialization of expressions and values

Project description

query-farm-duckdb-json-serialization

This Python module provides a Pydantic parser for DuckDB expressions that have been serialized to JSON by the Airport DuckDB extension.

Apache Arrow Flight servers use these expressions to perform predicate pushdown: the server filters rows itself before sending data to the client.

Purpose

The module's primary function is to:

  • Parse DuckDB expressions serialized as JSON.
  • Optionally convert the parsed expressions back into SQL.
  • Allow server-side row filtering using DuckDB, before returning data via Arrow Flight.

Note: The JSON format used by Airport differs from the built-in DuckDB JSON serialization. Specifically, binary values are encoded using Base64 in Airport for UTF-8 compatibility.
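To illustrate why Base64 is needed: arbitrary bytes are not necessarily valid UTF-8, so they cannot be placed directly into a JSON string. The following is a minimal sketch of the round trip using only the Python standard library. The `is_null` and `value` keys mirror the example later in this document, but the exact layout Airport uses for binary values is an assumption here; only the Base64 step is the point.

```python
import base64
import json

# These bytes are not valid UTF-8, so they cannot be embedded in JSON as-is.
raw = b"\xff\x00\xfeDuck"

# Hypothetical payload shape (an assumption, not Airport's documented format).
payload = json.dumps(
    {"is_null": False, "value": base64.b64encode(raw).decode("ascii")}
)

# The receiving side reverses the encoding to recover the original bytes.
decoded = base64.b64decode(json.loads(payload)["value"])
assert decoded == raw
```

Because Base64 output uses only ASCII characters, the resulting JSON document is always valid UTF-8 regardless of the bytes being carried.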


Installation

pip install query-farm-duckdb-json-serialization

API Usage

from query_farm_duckdb_json_serialization.expression import Expression

column_names_by_index = ['first_name', 'last_name']
# If multiple expressions are passed, they are
# joined with a logical AND.
#
# The DuckDB data types of the columns bound by the expressions
# are also returned.
sql, bound_types = Expression.convert_to_sql(
    source=expressions,
    bound_column_names=column_names_by_index
)
  • expressions: JSON-serialized list of DuckDB expression trees.
  • bound_column_names: Column names indexed as expected by DuckDB.
  • sql: Reconstructed SQL WHERE clause.
  • bound_types: List of DuckDB data types for the bound columns.

Input

The structure of DuckDB's serialized expressions may change between versions. Below is a working example.

CREATE TABLE test_type_int64 (v int64);
INSERT INTO test_type_int64 values (1234567890123456789);

-- This statement will generate the following JSON serialization.
SELECT v FROM test_type_int64 WHERE v = 1234567890123456789;
[
  {
    "expression_class": "BOUND_COMPARISON",
    "type": "COMPARE_EQUAL",
    "alias": "",
    "query_location": 18446744073709551615,
    "left": {
      "expression_class": "BOUND_COLUMN_REF",
      "type": "BOUND_COLUMN_REF",
      "alias": "v",
      "query_location": 18446744073709551615,
      "return_type": {
        "id": "BIGINT",
        "type_info": null
      },
      "binding": {
        "table_index": 0,
        "column_index": 0
      },
      "depth": 0
    },
    "right": {
      "expression_class": "BOUND_CONSTANT",
      "type": "VALUE_CONSTANT",
      "alias": "",
      "query_location": 18446744073709551615,
      "value": {
        "type": {
          "id": "BIGINT",
          "type_info": null
        },
        "is_null": false,
        "value": 1234567890123456789
      }
    }
  }
]
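
To make the structure above more concrete, here is a rough sketch of how such a tree could be walked to rebuild a WHERE clause by hand, using only the standard library. It is not this module's implementation: `to_sql` and `where_clause` are hypothetical helpers, only the three expression classes from the example are handled, and the quoting rules are a simplifying assumption. In practice `Expression.convert_to_sql` should be used instead.

```python
import json

# Only the comparison type that appears in the example is mapped here.
COMPARISON_OPERATORS = {"COMPARE_EQUAL": "="}


def to_sql(node: dict, column_names: list[str]) -> str:
    """Render one serialized expression node as a SQL fragment (sketch)."""
    expression_class = node["expression_class"]
    if expression_class == "BOUND_COMPARISON":
        operator = COMPARISON_OPERATORS[node["type"]]
        left = to_sql(node["left"], column_names)
        right = to_sql(node["right"], column_names)
        return f"{left} {operator} {right}"
    if expression_class == "BOUND_COLUMN_REF":
        # Resolve the bound column index to a name and quote it.
        name = column_names[node["binding"]["column_index"]]
        return '"' + name.replace('"', '""') + '"'
    if expression_class == "BOUND_CONSTANT":
        value = node["value"]
        if value["is_null"]:
            return "NULL"
        if value["type"]["id"] == "BIGINT":
            return str(int(value["value"]))
        raise NotImplementedError(f"unsupported type: {value['type']['id']}")
    raise NotImplementedError(f"unsupported class: {expression_class}")


def where_clause(serialized: str, column_names: list[str]) -> str:
    """Join every top-level expression with AND, as described above."""
    expressions = json.loads(serialized)
    return " AND ".join(f"({to_sql(e, column_names)})" for e in expressions)


# A trimmed copy of the example above (aliases and locations omitted).
EXAMPLE = """
[{"expression_class": "BOUND_COMPARISON", "type": "COMPARE_EQUAL",
  "left": {"expression_class": "BOUND_COLUMN_REF",
           "binding": {"table_index": 0, "column_index": 0}},
  "right": {"expression_class": "BOUND_CONSTANT",
            "value": {"type": {"id": "BIGINT", "type_info": null},
                      "is_null": false, "value": 1234567890123456789}}}]
"""

print(where_clause(EXAMPLE, ["v"]))  # ("v" = 1234567890123456789)
```

Python's `json` module parses integers with arbitrary precision, so the 64-bit constant survives the round trip exactly.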

Author

This Python module was created by Query.Farm.

License

MIT License. See LICENSE for details.

Download files

Source Distribution

query_farm_duckdb_json_serialization-0.1.4.tar.gz (14.6 kB view details)

Uploaded Source

Built Distribution

File details

Details for the file query_farm_duckdb_json_serialization-0.1.4.tar.gz.

File hashes

Hashes for query_farm_duckdb_json_serialization-0.1.4.tar.gz
Algorithm Hash digest
SHA256 fb7750764ee8cd2d29d380130a1264bd6b3e7729e6a516b395e88ffc2bcca3ea
MD5 e7ea798edb10445a7ff42e26be8404c7
BLAKE2b-256 93352b84db561313b2e29468753488a3bd792afc311e3210dbecd2571266bd46

File details

Details for the file query_farm_duckdb_json_serialization-0.1.4-py3-none-any.whl.

File hashes

Hashes for query_farm_duckdb_json_serialization-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 7d9e27e8000ab5a437b52819fa2e8628a9a578537bcb295af7ab77d32234b244
MD5 bddf587b0e4f569525dff29fc82beeec
BLAKE2b-256 d99ff8386f8f155cc15cc24f573e307710e620872e29c4bd446aaec76db19bb8
