PySpark package for Catalyst logical plan capture with Base64 logging marker SPARK_LOGICAL_PLAN_CAPTURE_V2.

Project description

spark-logical-plan-capture-v2

Repository: https://github.com/feature-oriented-method/catalyst_logic_plan

A PyPI package for PySpark 3.5.x that registers a Spark SQL extension. The extension:

  • intercepts Catalyst logical plans
  • serializes the payload in Java binary form
  • encodes the payload in Base64
  • writes a marker to the log: SPARK_LOGICAL_PLAN_CAPTURE_V2:<base64>
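
The marker format can be illustrated with a minimal Python sketch. It only covers the Base64 and prefix handling; the real payload is produced on the JVM side by Java serialization, so the bytes below are a stand-in. The helper names `build_log_marker` and `extract_payload`, and the logger name `PlanCapture` in the sample line, are hypothetical and not part of this package.

```python
import base64

MARKER = "SPARK_LOGICAL_PLAN_CAPTURE_V2:"


def build_log_marker(payload: bytes) -> str:
    """Base64-encode a binary payload and prepend the marker prefix."""
    return MARKER + base64.b64encode(payload).decode("ascii")


def extract_payload(log_line: str) -> bytes:
    """Find the marker in a log line and return the decoded payload bytes."""
    _, found, rest = log_line.partition(MARKER)
    if not found:
        raise ValueError("marker not found in log line")
    # Keep only the first whitespace-delimited token after the marker,
    # in case the logger appends anything after the Base64 text.
    tokens = rest.split()
    token = tokens[0] if tokens else ""
    return base64.b64decode(token, validate=True)


# Stand-in bytes: a Java serialization stream starts with 0xACED 0x0005,
# followed here by arbitrary demo content (not a real logical plan).
fake_payload = b"\xac\xed\x00\x05demo"
line = "INFO PlanCapture: " + build_log_marker(fake_payload)
assert extract_payload(line) == fake_payload
```

Turning the decoded bytes back into a plan or SQL text requires the JVM, which is presumably why the package's own decode helper takes a SparkSession.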

Installation

pip install spark-logical-plan-capture-v2

Usage

from pyspark.sql import SparkSession
from spark_logical_plan_capture import configure_spark_builder

builder = SparkSession.builder.appName("capture-demo").master("local[*]")
spark = configure_spark_builder(builder).getOrCreate()

spark.sql("select 1 as value").show()
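
For orientation, a Spark SQL extension shipped as a jar is normally enabled through two standard Spark options, `spark.jars` and `spark.sql.extensions`. The sketch below shows that general pattern only; it is an assumption about what a helper like `configure_spark_builder` might do, not its actual implementation. The class name and jar path are hypothetical placeholders.

```python
# Hypothetical placeholders: the real extension class and jar location
# are not stated in this README.
EXTENSION_CLASS = "com.example.PlanCaptureExtensions"
JAR_PATH = "/path/to/plan-capture.jar"


def apply_capture_config(builder, extension_class=EXTENSION_CLASS, jar_path=JAR_PATH):
    """Set the two standard Spark options a jar-based SQL extension needs.

    `builder` is anything with a chainable `.config(key, value)` method,
    such as `SparkSession.builder`.
    """
    return (
        builder
        .config("spark.jars", jar_path)                    # put the jar on the classpath
        .config("spark.sql.extensions", extension_class)   # register the extension class
    )
```

With PySpark installed, this would be called as `apply_capture_config(SparkSession.builder.appName("demo")).getOrCreate()`. Both options must be set before the session is created, which is why the configuration is applied to the builder rather than to an existing session.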

Decode SQL from log line

from spark_logical_plan_capture import decode_captured_sql_from_logline

log_line = "SPARK_LOGICAL_PLAN_CAPTURE_V2:<base64>"
sql = decode_captured_sql_from_logline(spark, log_line)
print(sql)
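
In practice the marker sits somewhere inside a longer log line, among many unrelated lines. A small sketch of how markers could be pulled out of driver log output before decoding follows. It assumes the payload uses the standard Base64 alphabet with optional `=` padding; `iter_marker_lines` and the sample log lines are hypothetical.

```python
import re
from typing import Iterable, Iterator

# Assumes the standard Base64 alphabet (A-Z, a-z, 0-9, +, /) and '=' padding.
MARKER_RE = re.compile(r"SPARK_LOGICAL_PLAN_CAPTURE_V2:[A-Za-z0-9+/]+={0,2}")


def iter_marker_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the bare 'MARKER:<base64>' substring from each line that has one."""
    for line in lines:
        match = MARKER_RE.search(line)
        if match:
            yield match.group(0)


# Made-up log lines for illustration only.
sample_log = [
    "INFO SparkContext: Running Spark",
    "INFO PlanCapture: SPARK_LOGICAL_PLAN_CAPTURE_V2:rO0ABQ==",
    "WARN Something: unrelated",
]
markers = list(iter_marker_lines(sample_log))
assert markers == ["SPARK_LOGICAL_PLAN_CAPTURE_V2:rO0ABQ=="]
```

Each yielded string has the same shape as `log_line` above and could then be passed to `decode_captured_sql_from_logline(spark, marker)`.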

Build and publish

  1. Build the JVM jar:
sbt clean test package
  2. Copy the jar into the Python package data:
python scripts/prepare_python_package.py
  3. Build the wheel and sdist:
python -m build
  4. Publish:
python -m twine upload dist/*
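
The four steps can also be chained in a small driver script that stops at the first failure. This is a sketch, not a script shipped with the repository. The function names are hypothetical, and it assumes `sbt`, `build`, and `twine` are already installed and that it is run from the repository root.

```python
import glob
import subprocess
import sys
from typing import List


def release_commands(python: str = sys.executable) -> List[List[str]]:
    """Return the four release steps in order, as argument lists."""
    return [
        ["sbt", "clean", "test", "package"],
        [python, "scripts/prepare_python_package.py"],
        [python, "-m", "build"],
        [python, "-m", "twine", "upload", "dist/*"],
    ]


def expand_globs(cmd: List[str]) -> List[str]:
    """Expand '*' patterns ourselves, since no shell is involved."""
    expanded: List[str] = []
    for arg in cmd:
        matches = sorted(glob.glob(arg)) if "*" in arg else []
        expanded.extend(matches or [arg])  # keep the literal if nothing matches
    return expanded


def run_release(dry_run: bool = True) -> None:
    """Print each step; execute it only when dry_run is False."""
    for cmd in release_commands():
        # Expand right before use so dist/* sees files built by earlier steps.
        argv = expand_globs(cmd)
        print("$", " ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # raises on a non-zero exit code
```

Calling `run_release()` only prints the commands; `run_release(dry_run=False)` executes them and publishes to PyPI.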

Source Distribution

spark_logical_plan_capture_v2-0.1.3.tar.gz (4.8 kB)

File details

Details for the file spark_logical_plan_capture_v2-0.1.3.tar.gz.

File hashes

Hashes for spark_logical_plan_capture_v2-0.1.3.tar.gz
Algorithm Hash digest
SHA256 79a3395f527bb111e263d999a34388a152997f2c2b555d407f01eba9906643f2
MD5 59908fd67c273fe59503a2c95e955080
BLAKE2b-256 6203d52c04d9ecbd0d2bd96d359db31af29df8377b07d8cfcb1a36cf1945c854

File details

Details for the file spark_logical_plan_capture_v2-0.1.3-py3-none-any.whl.

File hashes

Hashes for spark_logical_plan_capture_v2-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 5b811b82ac71eea3d5bbb3e251580d7bda1edb3aaf612f439b28460a029c59b2
MD5 c22681bb6bb695cb65fd908bb1d7498b
BLAKE2b-256 6ebba81266eb482c5c91d548746d5c8a5d36c875e3603d1293fe21d3d8129218
