Python wrapper for Spark logical plan capture extension

spark-logical-plan-capture

Python package for enabling the Spark extension io.github.mt.logicplan.LogicalPlanCaptureExtension, either through spark-submit or from PySpark code.

Install

pip install spark-logical-plan-capture

Usage in PySpark

from pyspark.sql import SparkSession
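from spark_logical_plan_capture import get_spark_conf

# get_spark_conf() supplies the settings used below to load the extension:
# "spark.sql.extensions" and "spark.jars".
conf = get_spark_conf()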

spark = (
    SparkSession.builder
    .config("spark.sql.extensions", conf["spark.sql.extensions"])
    .config("spark.jars", conf["spark.jars"])
    .getOrCreate()
)
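If a session already sets spark.sql.extensions or spark.jars, passing the values from get_spark_conf() directly would overwrite them. Both settings accept comma-separated lists, so one option is to merge the values first. The sketch below uses a hypothetical helper, merge_csv_conf, which is not part of this package, and the Delta Lake extension class stands in for whatever extension is already configured.

```python
def merge_csv_conf(existing, addition):
    """Append `addition` to a comma-separated config value, skipping duplicates."""
    # Split the existing value, dropping empty entries and stray whitespace.
    parts = [p.strip() for p in (existing or "").split(",") if p.strip()]
    if addition not in parts:
        parts.append(addition)
    return ",".join(parts)


# Assumption: an extension that is already configured in the session.
existing_extensions = "io.delta.sql.DeltaSparkSessionExtension"
capture_extension = "io.github.mt.logicplan.LogicalPlanCaptureExtension"

merged = merge_csv_conf(existing_extensions, capture_extension)
print(merged)
```

The merged string can then be passed to .config("spark.sql.extensions", merged), and the same helper applies to spark.jars.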

Usage with spark-submit

spark-submit \
  --conf "spark.sql.extensions=io.github.mt.logicplan.LogicalPlanCaptureExtension" \
  --jars "$(python -c 'from spark_logical_plan_capture import get_jar_path; print(get_jar_path())')" \
  your_job.py
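When the job is launched from a Python script rather than a shell, the same command can be assembled as an argument list. This is a minimal sketch: build_spark_submit_args is a hypothetical helper, and the JAR path is a placeholder for the value that get_jar_path() would return.

```python
import shlex

EXTENSION = "io.github.mt.logicplan.LogicalPlanCaptureExtension"


def build_spark_submit_args(jar_path, script, extension=EXTENSION):
    """Return the spark-submit argument list that enables the extension."""
    return [
        "spark-submit",
        "--conf", f"spark.sql.extensions={extension}",
        "--jars", jar_path,
        script,
    ]


# Placeholder path; in practice pass get_jar_path() from the package.
args = build_spark_submit_args("/path/to/extension.jar", "your_job.py")
print(shlex.join(args))
```

The list can be handed to subprocess.run(args, check=True), which avoids shell quoting issues.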

The current package release is built for Spark 3.5.2, Scala 2.13.8, and JVM 11.0.25.
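Because the extension is built against a specific Spark version, it may help to compare the runtime version before enabling it. The sketch below is a heuristic only: same_minor_line is a hypothetical helper, and treating a matching major.minor pair as compatible is an assumption, not something the package documents.

```python
BUILT_FOR_SPARK = "3.5.2"  # Spark version stated for the current release


def same_minor_line(runtime_version, built_for=BUILT_FOR_SPARK):
    """Return True when the major.minor components of both versions match."""
    return runtime_version.split(".")[:2] == built_for.split(".")[:2]


print(same_minor_line("3.5.0"))
print(same_minor_line("3.4.1"))
```

With a live session, the runtime version is available as spark.version.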

Restore SQL from Project JSON

The package also includes a SQL restorer for logical plan JSON whose root is an org.apache.spark.sql.catalyst.plans.logical.Project node:

from spark_logical_plan_capture import project_json_to_sql

plan_json = """
{
  "class": "org.apache.spark.sql.catalyst.plans.logical.Project",
  "projectList": [
    {
      "class": "org.apache.spark.sql.catalyst.expressions.Alias",
      "name": "a",
      "child": {
        "class": "org.apache.spark.sql.catalyst.expressions.Literal",
        "value": 1,
        "dataType": "integer"
      }
    }
  ],
  "child": {"class": "org.apache.spark.sql.catalyst.plans.logical.OneRowRelation"}
}
"""

sql = project_json_to_sql(plan_json)
# SELECT 1 AS a FROM (SELECT 1)
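To see which nodes a plan contains before restoring it, the JSON can be walked with the standard library alone. This sketch is independent of the package and does not reflect how project_json_to_sql works internally; iter_node_classes is a hypothetical helper that relies only on the "class" keys shown in the example above.

```python
import json


def iter_node_classes(node):
    """Yield the short class name of every plan or expression node, depth first."""
    if isinstance(node, dict):
        if "class" in node:
            # Keep only the final segment of the fully qualified class name.
            yield node["class"].rsplit(".", 1)[-1]
        for value in node.values():
            yield from iter_node_classes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_node_classes(item)


plan_json = """
{
  "class": "org.apache.spark.sql.catalyst.plans.logical.Project",
  "projectList": [
    {
      "class": "org.apache.spark.sql.catalyst.expressions.Alias",
      "name": "a",
      "child": {
        "class": "org.apache.spark.sql.catalyst.expressions.Literal",
        "value": 1,
        "dataType": "integer"
      }
    }
  ],
  "child": {"class": "org.apache.spark.sql.catalyst.plans.logical.OneRowRelation"}
}
"""

classes = list(iter_node_classes(json.loads(plan_json)))
print(classes)
# ['Project', 'Alias', 'Literal', 'OneRowRelation']
```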
