IOMETE's PySpark library that contains useful utilities for working with PySpark

Pyspark IOMETE Library

This library provides a set of utility functions to speed up the development of PySpark applications.

Installation

pip install pyspark-iomete

Utility functions

get_spark_logger

This function returns a Spark logger instance.

Spark uses log4j as its logging framework, and messages emitted through Python's standard logging module do not pass through Spark's log4j configuration. This function obtains a logger backed by log4j from the Spark session and returns it.

Usage:

from pyspark_iomete.utils import get_spark_logger
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# spark session and name will be used to create the logger
# both are optional
logger = get_spark_logger(spark=spark, name="my_custom_logger")

# spark session will be retrieved using SparkSession.getActiveSession() and name will be set to the current file name
logger = get_spark_logger()
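
For readers curious how a log4j-backed logger can be reached from Python, the following is a minimal sketch of one common pattern. It is not the pyspark-iomete implementation. The function name `sketch_get_spark_logger` is hypothetical, it relies on PySpark's private `_jvm` gateway attribute, and it assumes the `org.apache.log4j` API is available on the driver's classpath. As a simplification, the default name here is the module name rather than the current file name.

```python
def sketch_get_spark_logger(spark=None, name=None):
    """Hypothetical sketch: return a log4j logger obtained through the Spark JVM gateway."""
    if spark is None:
        # Lazy import so this module can be loaded without PySpark installed.
        from pyspark.sql import SparkSession

        spark = SparkSession.getActiveSession()
        if spark is None:
            raise RuntimeError("No active SparkSession found")

    if name is None:
        # Simplification: fall back to the module name (an assumption of this sketch).
        name = __name__

    # `_jvm` is PySpark's Py4J view of the driver JVM. It is a private attribute,
    # so this access pattern may change between PySpark versions.
    log4j = spark._jvm.org.apache.log4j
    return log4j.LogManager.getLogger(name)
```

The object returned by this sketch is a JVM log4j logger, so calls such as `logger.info("job started")`, `logger.warn(...)`, and `logger.error(...)` are forwarded to log4j on the driver.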

Test utility functions

table_name_with_random_suffix

This function returns a table name with a random suffix. This is useful for testing purposes.

Usage:

from pyspark_iomete.test_utils import table_name_with_random_suffix

table_name = table_name_with_random_suffix("my_table")
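
The exact suffix format produced by the library is not documented here. As an illustration of the idea only, a helper of this kind could be sketched as follows. The function name `sketch_table_name_with_random_suffix`, the underscore separator, and the 8-character hexadecimal suffix are all assumptions of this sketch, not the library's behavior.

```python
import uuid


def sketch_table_name_with_random_suffix(table_name, suffix_length=8):
    """Hypothetical sketch: append a random hexadecimal suffix to a table name."""
    # uuid4().hex is a 32-character random hex string; keep only a short prefix.
    suffix = uuid.uuid4().hex[:suffix_length]
    return f"{table_name}_{suffix}"


# Each call is expected to produce a different name.
first = sketch_table_name_with_random_suffix("my_table")
second = sketch_table_name_with_random_suffix("my_table")
```

Unique names like these let separate test runs create and drop their own tables without colliding with each other.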
