Skip to main content

An easily customizable SQL parser and transpiler

Project description

SQLGlot logo

SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between 19 different dialects like DuckDB, Presto, Spark, Snowflake, and BigQuery. It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.

It is a very comprehensive generic SQL parser with a robust test suite. It is also quite performant, while being written purely in Python.

You can easily customize the parser, analyze queries, traverse expression trees, and programmatically build SQL.

Syntax errors are highlighted and dialect incompatibilities can warn or raise depending on configurations. However, it should be noted that SQL validation is not SQLGlot’s goal, so some syntax errors may go unnoticed.

Learn more about the SQLGlot API in the documentation.

Contributions are very welcome in SQLGlot; read the contribution guide to get started!

Since the community has not had time to merge, temporarily package doris to use.

Install

From PyPI:

pip3 install sqlglot-doris

Examples

Formatting and Transpiling

Easily translate from one dialect to another. For example, date/time functions vary between dialects and can be hard to deal with:

import sqlglot
sqlglot.transpile("SELECT TIME_TO_STR('2020-01-01', '%Y-%m-%d')", read="hive", write="doris")[0]
SELECT DATE_FORMAT('2020-01-01', '%Y-%m-%d')

SQLGlot can even translate custom time formats:

import sqlglot
sqlglot.transpile("SELECT UNIX_TO_STR(x, y)", read="hive", write="doris")[0]
"SELECT FROM_UNIXTIME(x, y)"

As another example, let's suppose that we want to read in a SQL query that contains a CTE and a cast to REAL, and then transpile it to Spark, which uses backticks for identifiers and FLOAT instead of REAL:

import sqlglot

sql = """WITH baz AS (SELECT a, c FROM foo WHERE a = 1) SELECT f.a, b.b, baz.c, CAST("b"."a" AS REAL) d FROM foo f JOIN bar b ON f.a = b.a LEFT JOIN baz ON f.a = baz.a"""
print(sqlglot.transpile(sql, write="spark", identify=True, pretty=True)[0])
WITH `baz` AS (
  SELECT
    `a`,
    `c`
  FROM `foo`
  WHERE
    `a` = 1
)
SELECT
  `f`.`a`,
  `b`.`b`,
  `baz`.`c`,
  CAST(`b`.`a` AS FLOAT) AS `d`
FROM `foo` AS `f`
JOIN `bar` AS `b`
  ON `f`.`a` = `b`.`a`
LEFT JOIN `baz`
  ON `f`.`a` = `baz`.`a`

Comments are also preserved on a best-effort basis when transpiling SQL code:

sql = """
/* multi
   line
   comment
*/
SELECT
  tbl.cola /* comment 1 */ + tbl.colb /* comment 2 */,
  CAST(x AS INT), # comment 3
  y               -- comment 4
FROM
  bar /* comment 5 */,
  tbl #          comment 6
"""

print(sqlglot.transpile(sql, read='doris', pretty=True)[0])
/* multi
   line
   comment
*/
SELECT
  tbl.cola /* comment 1 */ + tbl.colb /* comment 2 */,
  CAST(x AS INT), /* comment 3 */
  y /* comment 4 */
FROM bar /* comment 5 */, tbl /*          comment 6 */

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlglot-doris-1.0.5.dev3.tar.gz (8.4 MB view hashes)

Uploaded Source

Built Distribution

sqlglot_doris-1.0.5.dev3-py3-none-any.whl (285.3 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page