Skip to main content

An easily customizable SQL parser and transpiler

Project description

SQLGlot logo

SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between 19 different dialects like DuckDB, Presto, Spark, Snowflake, and BigQuery. It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.

It is a very comprehensive generic SQL parser with a robust test suite. It is also quite performant, while being written purely in Python.

You can easily customize the parser, analyze queries, traverse expression trees, and programmatically build SQL.

Syntax errors are highlighted and dialect incompatibilities can warn or raise depending on configurations. However, it should be noted that SQL validation is not SQLGlot’s goal, so some syntax errors may go unnoticed.

Learn more about the SQLGlot API in the documentation.

Contributions are very welcome in SQLGlot; read the contribution guide to get started!

Since the community has not had time to merge, temporarily package doris to use.

Install

From PyPI:

pip3 install sqlglot-doris

Examples

Formatting and Transpiling

Easily translate from one dialect to another. For example, date/time functions vary between dialects and can be hard to deal with:

import sqlglot
sqlglot.transpile("SELECT TIME_TO_STR('2020-01-01', '%Y-%m-%d')", read="hive", write="doris")[0]
SELECT DATE_FORMAT('2020-01-01', '%Y-%m-%d')

SQLGlot can even translate custom time formats:

import sqlglot
sqlglot.transpile("SELECT UNIX_TO_STR(x, y)", read="hive", write="doris")[0]
"SELECT FROM_UNIXTIME(x, y)"

As another example, let's suppose that we want to read in a SQL query that contains a CTE and a cast to REAL, and then transpile it to Spark, which uses backticks for identifiers and FLOAT instead of REAL:

import sqlglot

sql = """WITH baz AS (SELECT a, c FROM foo WHERE a = 1) SELECT f.a, b.b, baz.c, CAST("b"."a" AS REAL) d FROM foo f JOIN bar b ON f.a = b.a LEFT JOIN baz ON f.a = baz.a"""
print(sqlglot.transpile(sql, write="spark", identify=True, pretty=True)[0])
WITH `baz` AS (
  SELECT
    `a`,
    `c`
  FROM `foo`
  WHERE
    `a` = 1
)
SELECT
  `f`.`a`,
  `b`.`b`,
  `baz`.`c`,
  CAST(`b`.`a` AS FLOAT) AS `d`
FROM `foo` AS `f`
JOIN `bar` AS `b`
  ON `f`.`a` = `b`.`a`
LEFT JOIN `baz`
  ON `f`.`a` = `baz`.`a`

Comments are also preserved on a best-effort basis when transpiling SQL code:

sql = """
/* multi
   line
   comment
*/
SELECT
  tbl.cola /* comment 1 */ + tbl.colb /* comment 2 */,
  CAST(x AS INT), # comment 3
  y               -- comment 4
FROM
  bar /* comment 5 */,
  tbl #          comment 6
"""

print(sqlglot.transpile(sql, read='doris', pretty=True)[0])
/* multi
   line
   comment
*/
SELECT
  tbl.cola /* comment 1 */ + tbl.colb /* comment 2 */,
  CAST(x AS INT), /* comment 3 */
  y /* comment 4 */
FROM bar /* comment 5 */, tbl /*          comment 6 */

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlglot-doris-1.0.4.dev3.tar.gz (8.4 MB view details)

Uploaded Source

Built Distribution

sqlglot_doris-1.0.4.dev3-py3-none-any.whl (285.3 kB view details)

Uploaded Python 3

File details

Details for the file sqlglot-doris-1.0.4.dev3.tar.gz.

File metadata

  • Download URL: sqlglot-doris-1.0.4.dev3.tar.gz
  • Upload date:
  • Size: 8.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for sqlglot-doris-1.0.4.dev3.tar.gz
Algorithm Hash digest
SHA256 ea8bff04691e88d7bdf7d6fe275ac07f78229fbbee3373d536417f8a73dbf97a
MD5 311222ed46bc29e90acdee530be2203c
BLAKE2b-256 7ae160d58993e6329eb2154a07d72ec480db5ea7be624b9f4774af735e08203c

See more details on using hashes here.

File details

Details for the file sqlglot_doris-1.0.4.dev3-py3-none-any.whl.

File metadata

File hashes

Hashes for sqlglot_doris-1.0.4.dev3-py3-none-any.whl
Algorithm Hash digest
SHA256 150d97eee7c9e992cbfaf02bfc48205ef707dfb73e86b462a7cde66bf072e96f
MD5 d235825a47aa0364f9c3d54cfeb62deb
BLAKE2b-256 8a7978261d4769c33fee9a7669eafb8f803dac20d7a56b8bd23e55780c230335

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page