Skip to main content

YDB dialect plugin for sqlglot

Project description

ydb-sqlglot-plugin

YDB dialect plugin for sqlglot — transpiles SQL from any dialect into YDB/YQL.

Installation

pip install ydb-sqlglot-plugin

Usage

After installing the package, the ydb dialect is available in sqlglot automatically — no extra imports needed:

import sqlglot

# Transpile from any dialect
result = sqlglot.transpile("SELECT * FROM users WHERE id = 1", read="mysql", write="ydb")[0]
# → SELECT * FROM `users` WHERE id = 1

# Or parse first, then generate
query = "SELECT * FROM orders WHERE user_id = 1"
parsed = sqlglot.parse_one(query, dialect="postgres")
yql = parsed.sql(dialect="ydb")

What the plugin does

Table names

Database-qualified names are rewritten to the YDB path format and wrapped in backticks:

-- input
SELECT * FROM analytics.events

-- output
SELECT * FROM `analytics/events`

CTEs → YDB variables

-- input
WITH active AS (SELECT * FROM users WHERE status = 'active')
SELECT * FROM active

-- output
$active = (SELECT * FROM `users` WHERE status = 'active');

SELECT * FROM $active AS active

Subquery decorrelation

Correlated subqueries (which YQL does not support) are rewritten as JOINs:

-- input
SELECT id, (SELECT MAX(amount) FROM orders WHERE orders.user_id = users.id) AS max_order
FROM users

-- output
SELECT users.id AS id, _u_0._u_2 AS max_order
FROM `users`
LEFT JOIN (
    SELECT MAX(amount) AS _u_2, user_id AS _u_1
    FROM `orders`
    WHERE TRUE
    GROUP BY user_id AS _u_1
) AS _u_0 ON users.id = _u_0._u_1

The same rewriting applies to EXISTS, IN (subquery), and ANY/ALL subqueries.


Function reference

Functions below are recognized by sqlglot as standard SQL expressions and translated to their YQL equivalents. Dialect-specific functions that sqlglot does not parse into typed AST nodes are passed through unchanged — see Limitations.

Date / time

Input YQL output
DATE_TRUNC('day', x) DATE(x)
DATE_TRUNC('week', x) DateTime::MakeDate(DateTime::StartOfWeek(x))
DATE_TRUNC('month', x) DateTime::MakeDate(DateTime::StartOfMonth(x))
DATE_TRUNC('quarter', x) DateTime::MakeDate(DateTime::StartOfQuarter(x))
DATE_TRUNC('year', x) DateTime::MakeDate(DateTime::StartOfYear(x))
EXTRACT(WEEK FROM x) DateTime::GetWeekOfYear(x)
EXTRACT(MONTH FROM x) DateTime::GetMonth(x)
EXTRACT(YEAR FROM x) DateTime::GetYear(x)
CURRENT_TIMESTAMP CurrentUtcTimestamp()
STR_TO_DATE(str, fmt) / TO_DATE(str, fmt) DateTime::MakeTimestamp(DateTime::Parse(fmt)(str))
DATE_ADD(x, INTERVAL n MONTH) DateTime::MakeDate(DateTime::ShiftMonths(x, n))
DATE_ADD(x, INTERVAL n YEAR) DateTime::MakeDate(DateTime::ShiftYears(x, n))
DATE_ADD(x, INTERVAL n DAY) x + DateTime::IntervalFromDays(n)
DATE_ADD(x, INTERVAL n HOUR) x + DateTime::IntervalFromHours(n)
DATE_ADD(x, INTERVAL n MINUTE) x + DateTime::IntervalFromMinutes(n)
DATE_ADD(x, INTERVAL n SECOND) x + DateTime::IntervalFromSeconds(n)
DATE_SUB(x, INTERVAL n ...) same as DATE_ADD with
INTERVAL n DAY (literal) DateTime::IntervalFromDays(n)
INTERVAL n HOUR (literal) DateTime::IntervalFromHours(n)
INTERVAL n MINUTE (literal) DateTime::IntervalFromMinutes(n)
INTERVAL n SECOND (literal) DateTime::IntervalFromSeconds(n)
dateDiff('minute', a, b) (CAST(b AS Int64) - CAST(a AS Int64)) / 60000000
dateDiff('hour', a, b) (CAST(b AS Int64) - CAST(a AS Int64)) / 3600000000
dateDiff('day', a, b) (CAST(b AS Int64) - CAST(a AS Int64)) / 86400000000
dateDiff('week', a, b) (CAST(b AS Int64) - CAST(a AS Int64)) / 604800000000

Note on dateDiff: YDB stores Timestamp as microseconds since epoch. The formula above gives exact integer units assuming both arguments are Timestamp. Results for Date-typed columns will differ.

Strings

Input YQL output
CONCAT(a, b, ...) a || b || ...
UPPER(x) Unicode::ToUpper(x)
LOWER(x) Unicode::ToLower(x)
LENGTH(x) / CHAR_LENGTH(x) Unicode::GetLength(x)
POSITION(sub IN x) / STRPOS(x, sub) Find(x, sub)
STRING_TO_ARRAY(x, delim) String::SplitToList(x, delim)
ARRAY_TO_STRING(arr, delim) String::JoinFromList(arr, delim)

Arrays / collections

Input YQL output
ARRAY(v1, v2, ...) AsList(v1, v2, ...)
ARRAY_LENGTH(x) / ARRAY_SIZE(x) ListLength(x)
ARRAY_FILTER(arr, x -> cond) ListFilter(arr, ($x) -> {RETURN cond})
ARRAY_ANY(arr, x -> cond) ListHasItems(ListFilter(arr, ($x) -> {RETURN cond}))
ARRAY_AGG(x) AGGREGATE_LIST(x)
UNNEST(x) FLATTEN BY x

Conditional / math

Input YQL output
NULLIF(x, y) IF(x = y, NULL, x)
ROUND(x, n) Math::Round(x, -n)
COUNT() (zero-argument form) COUNT(*)

JSON

Input YQL output
jsonb_col @> value (PostgreSQL) Yson::Contains(jsonb_col, value)

Type mapping

SQL type YQL type
TINYINT INT8
SMALLINT INT16
INT / INTEGER INT32
BIGINT INT64
FLOAT Float
DOUBLE / DOUBLE PRECISION Double
DECIMAL(p, s) Decimal(p, s)
BOOLEAN / BIT Uint8
TIMESTAMP Timestamp
VARCHAR / NVARCHAR / CHAR Utf8
TEXT / TINYTEXT / MEDIUMTEXT / LONGTEXT Utf8
BLOB / TINYBLOB / MEDIUMBLOB / LONGBLOB / VARBINARY String

Limitations

Correlated subqueries in DML

Correlated subqueries inside UPDATE or INSERT statements cannot be automatically decorrelated — YDB does not support them natively, and rewriting requires knowledge of the table's primary key. Rewrite manually using a $variable:

-- not supported (will raise an error)
UPDATE t SET col = (SELECT val FROM other WHERE other.id = t.id)

-- workaround
$vals = (SELECT id, val FROM other);
UPDATE t SET col = (SELECT val FROM $vals WHERE id = t.id)

Correlated subqueries inside SELECT are handled automatically via JOIN rewriting.

dateDiff with month granularity

dateDiff('month', a, b) has no exact equivalent in YDB because months have variable length. Use DateTime::ShiftMonths for date arithmetic instead.


Development

git clone https://github.com/ydb-platform/ydb-sqlglot-plugin.git
cd ydb-sqlglot-plugin
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
python -m pytest tests/

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ydb_sqlglot_plugin-0.1.1.tar.gz (33.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ydb_sqlglot_plugin-0.1.1-py3-none-any.whl (31.5 kB view details)

Uploaded Python 3

File details

Details for the file ydb_sqlglot_plugin-0.1.1.tar.gz.

File metadata

  • Download URL: ydb_sqlglot_plugin-0.1.1.tar.gz
  • Upload date:
  • Size: 33.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ydb_sqlglot_plugin-0.1.1.tar.gz
Algorithm Hash digest
SHA256 672f5460bd72966e26ab281ab7c9b6b686b50d08225441fb0a2cbf369a7e6204
MD5 e2754e36f5dfda6e18fa8c99ab9b4372
BLAKE2b-256 389fb98a2e9af79fbd79ad72dd10b3f63e5608a7dc6c789606801a31cd1dcca9

See more details on using hashes here.

Provenance

The following attestation bundles were made for ydb_sqlglot_plugin-0.1.1.tar.gz:

Publisher: python-publish.yml on ydb-platform/ydb-sqlglot-plugin

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ydb_sqlglot_plugin-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for ydb_sqlglot_plugin-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 81da5967ea7fac2844b8f2547f5c56ed7e268938409e5a4271ea4fd095801e57
MD5 80dff8c5c7a96a75beb5e0d68f1fcb14
BLAKE2b-256 a4ca1c16390bbfb717f708004637b656922621a38049314bbe07e4804202e5d1

See more details on using hashes here.

Provenance

The following attestation bundles were made for ydb_sqlglot_plugin-0.1.1-py3-none-any.whl:

Publisher: python-publish.yml on ydb-platform/ydb-sqlglot-plugin

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page