A formatter for Python code and SparkSQL queries.
pyspark-sql-formatter
A formatter for PySpark code containing SQL queries. It relies on the Python formatter yapf and the SparkSQL formatter sparksqlformatter, which work independently of each other. Users can configure each formatter separately.
Installation
Install using pip
pip install pysqlformatter
Install from source
- Download source code.
- Navigate to the source code directory.
- Run python setup.py install or pip install .
Compatibility
Supports Python 2.7 and 3.6+.
Usage
pysqlformatter can be used as either a command-line tool or a Python library.
Use as command-line tool
usage: pysqlformatter [-h] [-f FILES [FILES ...]] [-i] [--query-names QUERY_NAMES [QUERY_NAMES ...]] [--python-style PYTHON_STYLE] [--sparksql-style SPARKSQL_CONFIG]
Formatter for Pyspark code and SparkSQL queries.
optional arguments:
-h, --help show this help message and exit
-f FILES [FILES ...], --files FILES [FILES ...]
Paths to files to format.
-i, --in-place Format the files in place.
--python-style PYTHON_STYLE
Style for Python formatting, interface to https://github.com/google/yapf.
--sparksql-style SPARKSQL_CONFIG
Style for SparkSQL formatting, interface to https://github.com/largecats/sparksql-formatter.
--query-names QUERY_NAMES [QUERY_NAMES ...]
String variables with names containing these strings will be formatted as SQL queries. Default to 'query'.
E.g.,
$ pysqlformatter -f <path_to_file> --python-style='pep8' --sparksql-style="{'reservedKeywordUppercase': False}" --query-names query
Or using config files:
$ pysqlformatter -f <path_to_file> --python-style="<path_to_python_style_config_file>" --sparksql-style="<path_to_sparksql_config_file>" --query-names query
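When the tool is driven from another Python program, shell quoting of the inline style dictionary is easy to get wrong. The sketch below builds the argument list for the first command above, using only the flags listed in the help text. build_command is a hypothetical helper, not part of pysqlformatter, and the assumption that the inline style is read as a Python dict literal is inferred from the example, not confirmed by the source.

```python
import ast


def build_command(path, python_style="pep8", sparksql_style=None, query_names=("query",)):
    """Build an argument list for the pysqlformatter CLI (hypothetical helper)."""
    cmd = ["pysqlformatter", "-f", path, "--python-style", python_style]
    if sparksql_style is not None:
        # repr() of a dict yields the same literal form used in the example command.
        cmd.extend(["--sparksql-style", repr(sparksql_style)])
    cmd.append("--query-names")
    cmd.extend(query_names)
    return cmd


cmd = build_command("job.py", sparksql_style={"reservedKeywordUppercase": False})
style_arg = cmd[cmd.index("--sparksql-style") + 1]
print(style_arg)
# The literal round-trips back to the original dict.
print(ast.literal_eval(style_arg))
```

Passing the resulting list to subprocess.run avoids the nested quoting needed on a shell command line.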
Use as Python library
Call pysqlformatter.api.format_script() to format a script passed as a string:
>>> from pysqlformatter import api
>>> script = '''query = 'select * from t0'\nspark.sql(query)'''
>>> api.format_script(script=script, pythonStyle='pep8', sparksqlConfig=sparksqlConfig(), queryNames=['query'])
"query = '''\nSELECT\n *\nFROM\n t0\n'''\nspark.sql(query)\n"
Call pysqlformatter.api.format_file() to format a script stored in a file:
>>> from pysqlformatter import api
>>> api.format_file(filePath=<path_to_file>, pythonStyle='pep8', sparksqlConfig=sparksqlConfig(), queryNames=['query'], inPlace=False)
...
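The queryNames argument selects string variables whose names contain one of the given strings. The sketch below illustrates that rule with the standard ast module. It is not pysqlformatter's implementation: find_query_strings is a hypothetical helper, case-sensitive substring matching is an assumption, and ast.Constant requires Python 3.8 or later.

```python
import ast


def find_query_strings(source, query_names=("query",)):
    """Return (name, value) pairs for string assignments whose target
    name contains any of query_names (illustrative only)."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Only plain string literals are considered here.
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and any(q in target.id for q in query_names):
                found.append((target.id, value.value))
    return found


script = "query = 'select * from t0'\nspark.sql(query)"
print(find_query_strings(script))
```

With queryNames=['query'], a variable named user_query would also match under this rule, while a variable named sql would not.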
Download files
File details
Details for the file pysqlformatter-0.0.1.tar.gz.
File metadata
- Download URL: pysqlformatter-0.0.1.tar.gz
- Upload date:
- Size: 10.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.3.1 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.8.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3de9ad9cd2d51821fc8759866dc355bdab0d6eb60ee558b1826b5421ce0e151e |
| MD5 | 431c29cfc6a143f40c3a5d94ec44b43b |
| BLAKE2b-256 | 87b574c8531befa3f3f352f06e7ca435d5a69439ce47062a8ce42534a96106a8 |
File details
Details for the file pysqlformatter-0.0.1-py2.py3-none-any.whl.
File metadata
- Download URL: pysqlformatter-0.0.1-py2.py3-none-any.whl
- Upload date:
- Size: 13.3 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.3.1 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.8.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a0a38098b46fb53fb890fdddf136a613b850e379ad9eeca438d48d069f1d729d |
| MD5 | c109e14a3caf3ae014813e7597101277 |
| BLAKE2b-256 | fadd29f3333b7532608b88a3dc4c2dc9452f1fd609e3a079a82e8eb96f51f10a |