A SparkSQL formatter in Python based on https://github.com/zeroturnaround/sql-formatter, with customizations and extra features.

sparksqlformatter

A SparkSQL formatter in Python based on sql-formatter and its fork sql-formatter-plus, with customizations and extra features.

sparksqlformatter
Installation
- Install using pip
- Install from source
Compatibility
Usage
- Use as command-line tool
- Use as Python library
Style configurations

Installation

Install using pip

pip install sparksqlformatter

Install from source

Download source code.
Navigate to the source code directory.
Run python setup.py install or pip install . (the trailing dot refers to the current directory).

Compatibility

Supports Python 2.7 and 3.6+.

Usage

sparksqlformatter can be used as either a command-line tool or a Python library.

Use as command-line tool

usage: sparksqlformatter [-h] [-f FILES [FILES ...]] [-i] [--style STYLE]

Formatter for SparkSQL queries.

optional arguments:
  -h, --help            show this help message and exit
  -f FILES [FILES ...], --files FILES [FILES ...]
                        Paths to files to format.
  -i, --in-place        Format the files in place.
  --style STYLE         Style configurations for SparkSQL. Can be a path to a style config file or a dictionary.

Style

The --style argument specifies the formatting style. Supported attributes are listed under Style configurations.

There are two ways to specify style:

Path to a style config file. E.g.,

$ sparksqlformatter --style="<path_to_config_file>" -f <path_to_file1> <path_to_file2>

The style config file should contain a [sparksqlformatter] section with key-value pairs specifying attributes. E.g.,

[sparksqlformatter]
reservedKeywordUppercase = False
linesBetweenQueries = 2

Dictionary of configurations expressed as key-value pairs. E.g.,

$ sparksqlformatter --style="{'reservedKeywordUppercase': False}" -f <path_to_file1> <path_to_file2>
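The two style forms above carry the same information. The following is a minimal sketch of how each could be turned into a Python dictionary using only the standard library. It is not the package's own parsing code: the helper names parse_style_file_text and parse_style_dict_text are hypothetical, and using ast.literal_eval to interpret values is an assumption.

```python
import ast
import configparser

# Same content as the example config file above.
CONFIG_TEXT = """
[sparksqlformatter]
reservedKeywordUppercase = False
linesBetweenQueries = 2
"""


def parse_style_file_text(text):
    """Hypothetical helper: read the [sparksqlformatter] section into a dict."""
    parser = configparser.ConfigParser()
    # configparser lowercases keys by default; keep camelCase names as written.
    parser.optionxform = str
    parser.read_string(text)
    section = parser["sparksqlformatter"]
    # Assumption: values are Python literals such as False, 2, or ['--'].
    return {key: ast.literal_eval(value) for key, value in section.items()}


def parse_style_dict_text(text):
    """Hypothetical helper: read a dictionary literal passed on the command line."""
    return ast.literal_eval(text)


file_style = parse_style_file_text(CONFIG_TEXT)
dict_style = parse_style_dict_text("{'reservedKeywordUppercase': False}")
print(file_style)
print(dict_style)
```

Either result has the shape of the dictionary accepted by the style parameter described under Use as Python library.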

Use as Python library

Call sparksqlformatter.api.format_query() to format a query string:

>>> from sparksqlformatter import api
>>> query = 'select c1 from t0'
>>> api.format_query(query)
'SELECT\n    c1\nFROM\n    t0'

Call sparksqlformatter.api.format_file() to format a query in a file:

>>> from sparksqlformatter import api
>>> api.format_file('<path_to_file>', inPlace=False)
...

Style

Formatting style can be specified via the style parameter in the api format functions.

Similar to the command-line tool, there are two ways to create configurations when using sparksqlformatter as a Python library:

Path to a style config file

>>> from sparksqlformatter import api
>>> style = '<path_to_config_file>'
>>> query = 'select c1 FROM t0'
>>> api.format_query(query, style)
...

Dictionary

>>> from sparksqlformatter import api
>>> style = {'reservedKeywordUppercase': False}
>>> query = 'select c1 FROM t0'
>>> api.format_query(query, style)
'select\n    c1\nfrom\n    t0'

Style configurations

topLevelKeywords

A list of keywords that should start a query block when formatting. E.g.,

SELECT
    [block]
FROM
    [block]

Defaults to

TOP_LEVEL_KEYWORDS = [
    'ADD', 'AFTER', 'ALTER COLUMN', 'ALTER TABLE', 'CREATE TABLE', 'CROSS JOIN', 'DELETE FROM', 'EXCEPT',
    'FETCH FIRST', 'FROM', 'GROUP BY', 'GO', 'HAVING', 'INNER JOIN', 'INSERT INTO', 'INSERT', 'JOIN',
    'LEFT JOIN', 'LEFT OUTER JOIN', 'LIMIT', 'MODIFY', 'ORDER BY', 'OUTER JOIN', 'PARTITION BY', 'RIGHT JOIN',
    'RIGHT OUTER JOIN', 'SELECT', 'SET CURRENT SCHEMA', 'SET SCHEMA', 'SET', 'UPDATE', 'VALUES', 'WHERE'
]
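To make the idea of a top-level keyword concrete, here is a toy sketch that starts a new block at each keyword and indents whatever follows. It is not the package's algorithm: toy_format is a hypothetical function, it uses a shortened keyword list, and it ignores strings, comments, and parentheses.

```python
import re

# A shortened keyword list, for illustration only.
TOP_LEVEL = ['SELECT', 'FROM', 'WHERE', 'GROUP BY', 'ORDER BY']


def toy_format(query, keywords=TOP_LEVEL, indent='    '):
    """Put each top-level keyword on its own line and indent the text after it."""
    # Try longer keywords first so multi-word keywords are matched whole.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = '|'.join(
        r'\s+'.join(re.escape(word) for word in keyword.split())
        for keyword in ordered
    )
    pattern = re.compile(r'\b(' + alternation + r')\b', re.IGNORECASE)
    lines = []
    # With one capture group, split() alternates: text, keyword, text, ...
    for index, part in enumerate(pattern.split(query)):
        text = ' '.join(part.split())  # collapse runs of whitespace
        if not text:
            continue
        if index % 2 == 1:
            lines.append(text.upper())    # a top-level keyword
        else:
            lines.append(indent + text)   # the block that follows it
    return '\n'.join(lines)


print(toy_format('select c1 from t0'))
```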

topLevelKeywordsNoIndent

A list of top-level keywords that should not be indented when formatting. E.g., UNION in

SELECT
    ...
FROM
    ...
UNION
SELECT
    ...
FROM
    ...

Defaults to

TOP_LEVEL_KEYWORDS_NO_INDENT = ['INTERSECT', 'INTERSECT ALL', 'MINUS', 'UNION', 'UNION ALL']

newlineKeywords

A list of keywords that should start a new line when formatting. E.g., LEFT JOIN in

SELECT
    ...
FROM
    t0
    LEFT JOIN t1 ...
    LEFT JOIN t2 ...

Note that this is less restrictive than topLevelKeywords, since top-level keywords always start a new line. Defaults to

NEWLINE_KEYWORDS = [
    'AND', 'ELSE', 'LATERAL', 'ON', 'OPTIONS', 'OR', 'PARTITIONED BY', 'THEN', 'USING', 'WHEN', 'XOR'
]

stringTypes

A list of character pairs that enclose strings in the query language. Defaults to

['""', "''", '{}']

openParens

A list of strings that behave as opening parentheses in the query language. Defaults to

['(', 'CASE']

closeParens

A list of strings that behave as closing parentheses in the query language. Defaults to

[')', 'END']
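As a rough illustration of what "behaving as parentheses" means, the sketch below tracks nesting depth over a list of tokens, treating CASE like ( and END like ). The function nesting_depths is hypothetical and tokenization is assumed to have happened already; the package's own handling may differ.

```python
def nesting_depths(tokens, open_parens=('(', 'CASE'), close_parens=(')', 'END')):
    """Return (token, depth) pairs, where depth is the nesting level of the token."""
    depth = 0
    result = []
    for token in tokens:
        upper = token.upper()
        # A closing token sits at the level of its matching opener.
        if upper in close_parens:
            depth = max(depth - 1, 0)
        result.append((token, depth))
        # Everything after an opening token is nested one level deeper.
        if upper in open_parens:
            depth += 1
    return result


for token, depth in nesting_depths(['CASE', 'WHEN', 'x', 'THEN', '(', '1', ')', 'END']):
    print('    ' * depth + token)
```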

lineCommentTypes

A list of prefixes that start line comments in the query language. Defaults to

['--']

reservedKeywordUppercase

A boolean indicating whether reserved keywords should be converted to uppercase when formatting. Defaults to True.

linesBetweenQueries

An integer that specifies the number of blank lines to put between (sub-)queries when formatting. E.g., with linesBetweenQueries = 1,

WITH t0 AS (
    ...
),

t1 AS (
    ...
)

SELECT
    ...
FROM
    ...
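The spacing rule can be sketched in a few lines: n blank lines between two blocks means n + 1 newline characters between them. The function join_queries is a hypothetical name used only for illustration, not part of the package.

```python
def join_queries(blocks, lines_between_queries=1):
    """Join already-formatted blocks with the given number of blank lines."""
    # One newline ends the previous block; each further newline adds a blank line.
    separator = '\n' * (lines_between_queries + 1)
    return separator.join(blocks)


print(join_queries(['SELECT 1', 'SELECT 2'], lines_between_queries=2))
```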

specialWordChars

A list of characters that require special handling when formatting. Defaults to [].

indent

A string that specifies one level of indentation. Defaults to four spaces:

'    '

inlineMaxLength

Maximum length of an inline block. Defaults to 120.

splitOnComma

Whether items in top-level GROUP BY and ORDER BY clauses following SELECT ... FROM should be split on every comma, or split only when they exceed inlineMaxLength. Defaults to True.
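The interaction between splitOnComma and inlineMaxLength can be sketched as a single decision. This is an illustration under assumptions, not the package's code: layout_items is a hypothetical function, and whether the indent counts toward the length limit is assumed here.

```python
def layout_items(items, split_on_comma=True, inline_max_length=120, indent='    '):
    """Lay out clause items on one line, or one item per line."""
    inline = indent + ', '.join(items)
    # Keep items inline only when splitting is off and the line fits.
    # Assumption: the indent counts toward the length limit.
    if not split_on_comma and len(inline) <= inline_max_length:
        return inline
    return ',\n'.join(indent + item for item in items)


print(layout_items(['c1', 'c2'], split_on_comma=False))
print(layout_items(['c1', 'c2'], split_on_comma=True))
```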


