Skip to main content

MaxCompute dialect plugin for SQLGlot

Project description

sqlglot-maxcompute

A SQLGlot dialect plugin for Alibaba Cloud MaxCompute (formerly ODPS).

Registers the maxcompute dialect via Python entry points so that SQLGlot can parse and generate MaxCompute SQL.

Installation

pip install sqlglot-maxcompute

Requires Python ≥ 3.9 and SQLGlot ≥ 30.1.

Usage

import sqlglot

# Parse MaxCompute SQL
ast = sqlglot.parse_one("SELECT DATEADD(dt, 1, 'DAY')", read="maxcompute")

# Transpile from another dialect to MaxCompute
sqlglot.transpile("SELECT DATE_ADD(dt, 1)", read="spark", write="maxcompute")
# ["SELECT DATEADD(dt, 1, 'DAY')"]

# Transpile from MaxCompute to another dialect
sqlglot.transpile("SELECT DATETRUNC(dt, 'MONTH')", read="maxcompute", write="spark")
# ["SELECT TRUNC(dt, 'MONTH')"]

# TO_DATE return type depends on args:
#   without format → DATE    (exp.TsOrDsToDate)
#   with format    → DATETIME (exp.StrToTime)
sqlglot.transpile("TO_DATE('20240101', 'yyyymmdd')", read="maxcompute", write="spark")
# ["TO_TIMESTAMP('20240101', 'yyyymmdd')"]

# Round-trip MaxCompute DDL
sqlglot.transpile(
    "CREATE TABLE t (id BIGINT) LIFECYCLE 30",
    read="maxcompute",
    write="maxcompute",
)
# ["CREATE TABLE t (id BIGINT) LIFECYCLE 30"]

What's supported

Parser — MaxCompute → canonical AST

Category Functions
Date arithmetic DATEADD, DATE_SUB, DATEDIFF, ADD_MONTHS, MONTHS_BETWEEN
Date extraction DATEPART, DATETRUNC, TRUNC_TIME, DAY, MONTH, YEAR, HOUR, MINUTE, SECOND, QUARTER, DAYOFMONTH, DAYOFWEEK, DAYOFYEAR, WEEKDAY, WEEKOFYEAR
Date conversion TO_DATE, DATE_FORMAT, TO_CHAR, FROM_UNIXTIME, FROM_UTC_TIMESTAMP, TO_MILLIS, ISDATE
Current date/time GETDATE, NOW, CURRENT_TIMESTAMP, CURRENT_TIMEZONE
Last/next day LAST_DAY, LASTDAY, NEXT_DAY
String TOLOWER, TOUPPER, REGEXP_COUNT, SPLIT_PART, SUBSTR
Aggregate WM_CONCAT, COUNT_IF, ARG_MAX, ARG_MIN, MAX_BY, MIN_BY, ANY_VALUE, APPROX_DISTINCT, STDDEV_SAMP, COVAR_POP, COVAR_SAMP, CORR, MEDIAN, PERCENTILE_APPROX, BITWISE_AND_AGG, BITWISE_OR_AGG, BITWISE_XOR_AGG
Array ALL_MATCH, ANY_MATCH, ARRAY_SORT, ARRAY_DISTINCT, ARRAY_EXCEPT, ARRAY_JOIN, ARRAY_MAX, ARRAY_MIN, ARRAYS_OVERLAP, ARRAYS_ZIP, ARRAY_INTERSECT, ARRAY_POSITION, ARRAY_REMOVE, ARRAY_CONTAINS, SLICE
Map MAP_CONCAT, MAP_FROM_ENTRIES
JSON / misc FROM_JSON, GET_JSON_OBJECT, JSON_TUPLE, GET_USER_ID, REGEXP_SUBSTR, TO_MILLIS, ISDATE

Functions not listed are handled via Hive inheritance and work without explicit mapping (e.g. SPLIT, REGEXP_EXTRACT, COLLECT_LIST, PERCENTILE, all math/trig functions, window functions).

Generator — canonical AST → MaxCompute SQL

Explicit transforms on top of Hive:

Expression MaxCompute output Note
DATEADD / DATE_SUB DATEADD(dt, ±n, 'UNIT') Correct negation for DATE_SUB
DATEDIFF DATEDIFF(dt1, dt2[, unit])
DATETRUNC DATETRUNC(dt, 'unit') Week units: 'week(monday)' etc.
DATEPART DATEPART(dt, 'UNIT')
TO_DATE(str, fmt) TO_DATE(str, fmt) Maps to exp.StrToTime (DATETIME)
TO_DATE(str) TO_DATE(str) Maps to exp.TsOrDsToDate (DATE)
CurrentTimestamp GETDATE() Covers GETDATE, NOW, CURRENT_TIMESTAMP
CurrentDatetime NOW() For BigQuery-origin CURRENT_DATETIME
SPACE(n) SPACE(n) Hive emits REPEAT(' ', n)
VAR_POP(x) VAR_POP(x) Hive emits VARIANCE_POP
VAR_SAMP(x) VAR_SAMP(x) Hive emits VARIANCE
INSTR(str, sub) INSTR(str, sub) Hive emits LOCATE(sub, str)
SUBSTR(str, pos, len) SUBSTR(...) Hive emits SUBSTRING
Type: VARCHAR/CHAR/TEXT STRING
Type: DATETIME DATETIME

DDL

  • LIFECYCLE n — table retention in days
  • RANGE CLUSTERED BY (cols) [SORTED BY (cols)] INTO n BUCKETS
  • AUTO PARTITIONED BY (TRUNC_TIME(col, 'unit') [AS alias])
  • TBLPROPERTIES ('key'='value') — coexists correctly with LIFECYCLE
  • COMMENT on columns and tables

Development

uv sync            # install dependencies
uv run pytest      # run all tests

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlglot_maxcompute-0.4.0.tar.gz (32.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sqlglot_maxcompute-0.4.0-py3-none-any.whl (11.0 kB view details)

Uploaded Python 3

File details

Details for the file sqlglot_maxcompute-0.4.0.tar.gz.

File metadata

  • Download URL: sqlglot_maxcompute-0.4.0.tar.gz
  • Upload date:
  • Size: 32.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.20

File hashes

Hashes for sqlglot_maxcompute-0.4.0.tar.gz
Algorithm Hash digest
SHA256 64223cfc06dde5156e65d06fd4f1f812c5d630782ea1d4ff5111490242f36f7e
MD5 2e9a00bc48ce4722b32469c4208f3fd0
BLAKE2b-256 80083df75a727564d6784945a74e35fb4435fa90f15b7b20c68238e838ecbd26

See more details on using hashes here.

File details

Details for the file sqlglot_maxcompute-0.4.0-py3-none-any.whl.

File metadata

File hashes

Hashes for sqlglot_maxcompute-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2b07ecbe9251ee919b9ca9b2fb0306c7e338e87593b445727f241677db8077f0
MD5 bcc5d69c3fea92d74231f44f54c9585f
BLAKE2b-256 5111dc85aabe3db4cd7230d310d3c1ca76f71c1b082429e66e01d09a51cf06a7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page