Skip to main content

Parsing utilities to validate and split SQL statements for CrateDB.

Project description

CrateDB SQL Parser for Python

License PyPI version PyPI downloads

Tests Test coverage Python versions

Status

This package provides utilities to validate and split SQL statements specifically designed for CrateDB.

It is built upon CrateDB's antlr4 grammar, ensuring accurate parsing tailored to CrateDB's SQL dialect.

It draws inspiration from sqlparse.

Installation.

pip install cratedb-sqlparse

Usage.

Simple example

from cratedb_sqlparse import sqlparse

query = """
    SELECT * FROM SYS.SHARDS;
    INSERT INTO doc.tbl VALUES (1);
"""
statements = sqlparse(query)

print(len(statements))
# 2

select_query = statements[0]

print(select_query.query)
# 'SELECT * FROM SYS.SHARDS'

print(select_query.type)
# 'SELECT'

print(select_query.tree)
# (statement (query (queryNoWith (queryTerm (querySpec SELECT (selectItem *) FROM (relation (aliasedRelation (relationPrimary (table (qname (ident (unquotedIdent SYS)) . (ident (unquotedIdent (nonReserved SHARDS)))))))))))))

Exceptions and errors.

By default exceptions are stored in statement.exception

from cratedb_sqlparse import sqlparse

query = """
SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;

INSERT INTO doc.tbl VALUES (1, 23, 4);
"""
statements = sqlparse(query)
stmt = statements[0]

if stmt.exception:
    print(stmt.exception.error_message)
    # InputMismatchException[line 2:31 mismatched input 'HERE' expecting {<EOF>, ';'}]

    print(stmt.exception.original_query_with_error_marked)
    # SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;
    #                                ^^^^
    # 
    # INSERT INTO doc.tbl VALUES (1, 23, 4);

    print(stmt.exception.offending_token.text)
    # HERE

In some situations, you might want sqlparse to raise an exception.

You can set raise_exception to True

from cratedb_sqlparse import sqlparse

sqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', raise_exception=True)

# cratedb_sqlparse.parser.ParsingException: NoViableAltException[line 1:37 no viable alternative at input 'SELECT COUNT(*) FROM doc.tbl f WHERE .']

Catch the exception:

from cratedb_sqlparse import sqlparse, ParsingException

try:
    t = sqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', raise_exception=True)[0]
except ParsingException:
    print('Catched!')

Note:

It will only raise the first exception it finds, even if you pass in several statements.

Query metadata.

Query metadata can be read with statement.metadata

from cratedb_sqlparse import sqlparse

stmt = sqlparse("SELECT A, B FROM doc.tbl12")

print(stmt.metadata)
# Metadata(tables=[Table(schema='doc', name='tbl12')], parameterized_properties={}, with_properties={})

Query properties.

Properties defined within a WITH statement, statement.metadata.with_properties.

from cratedb_sqlparse import sqlparse

stmt = sqlparse("""
    CREATE TABLE doc.tbl12 (A TEXT) WITH (
      "allocation.max_retries" = 5,
      "blocks.metadata" = false
    );
""")[0]

print(stmt.metadata)
# Metadata(tables=[Table(schema='doc', name='tbl12')], with_properties={'allocation.max_retries': '5', 'blocks.metadata': 'false'})

Table name

print(stmt.metadata.tables)
# [Table(schema='doc', name='tbl12')]

table = stmt.metadata.tables[0]
print(table.schema, table.name, table.fqn, sep='\n')
# doc
# tbl12
# '"doc"."tbl12"'

Parameterized properties.

Parameterized properties are properties without a real defined value, marked with a dollar string, metadata.parameterized_properties

from cratedb_sqlparse import sqlparse

stmt = sqlparse("""
    CREATE TABLE doc.tbl12 (A TEXT) WITH (
    "allocation.max_retries" = 5,
    "blocks.metadata" = $1
);
""")[0]

print(stmt.metadata)
# Metadata(tables=[Table(schema='doc', name='tbl12')], parameterized_properties={'blocks.metadata': '$1'}, with_properties={'allocation.max_retries': '5', 'blocks.metadata': '$1'})

In this case, blocks.metadata will be in with_properties and parameterized_properties as well.

For values to be picked up they need to start with a dollar '$' and be preceded by integers, e.g. '$1', '$123' - '$123abc' would not be valid.

Development

Set up environment

git clone https://github.com/crate/cratedb-sqlparse

cd cratedb-sqlparse/cratedb_sqlparse_py

python3 -m venv .venv

source .venv/bin/activate

pip install --editable='.[develop,generate,release,test]'

Everytime you open a shell again you would need to run source .venv/bin/activate to use poe commands.

Run lint and tests with coverage.

poe check

Run only tests

poe test

Run a specific test.

poe test -k test_sqlparse_collects_exceptions_2

Run linter

poe lint

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cratedb_sqlparse-0.0.10.tar.gz (141.0 kB view hashes)

Uploaded Source

Built Distribution

cratedb_sqlparse-0.0.10-py3-none-any.whl (142.5 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page