Spalah is a set of PySpark dataframe helpers

Project description

spalah

Spalah is a set of Python helpers for working with PySpark dataframes.

Installation

Use the package manager pip to install spalah.

pip install spalah

Usage

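from pyspark.sql import SparkSession
from spalah.dataframe import slice_dataframe

spark = SparkSession.builder.getOrCreate()

# Example input; the column names "a".."e" are illustrative
df = spark.createDataFrame([(1, 2, 3, 4, 5)], ["a", "b", "c", "d", "e"])

# Slice the dataframe, excluding columns "d" and "e"
slice_dataframe(
    input_dataframe=df,
    columns_to_include=[],
    columns_to_exclude=["d", "e"],
    nullify_only=False
)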
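
The parameters in the call above suggest an include/exclude selection with an optional "nullify instead of drop" mode. The sketch below illustrates that idea on a plain Python dict standing in for one row. It is not spalah's implementation: the helper name slice_row is hypothetical, and the semantics shown (an empty include list means "all columns", exclusions win over inclusions, nullify_only keeps the column but sets its value to None) are assumptions inferred from the parameter names.

```python
from typing import Any, Dict, List


def slice_row(
    row: Dict[str, Any],
    columns_to_include: List[str],
    columns_to_exclude: List[str],
    nullify_only: bool = False,
) -> Dict[str, Any]:
    """Hypothetical pure-Python analogue of column slicing on a single row."""
    # Assumption: an empty include list means "start from every column".
    kept = [
        name
        for name in row
        if (not columns_to_include or name in columns_to_include)
        and name not in columns_to_exclude
    ]
    if nullify_only:
        # Assumption: keep the original set of columns, but blank out
        # the values of the ones that were not selected.
        return {name: (value if name in kept else None) for name, value in row.items()}
    # Otherwise drop the unselected columns entirely.
    return {name: row[name] for name in kept}


row = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
dropped = slice_row(row, [], ["d", "e"])
nullified = slice_row(row, [], ["d", "e"], nullify_only=True)
print(dropped)
print(nullified)
```

Under these assumptions, the first call returns only columns "a", "b" and "c", while the second keeps all five keys and sets "d" and "e" to None.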

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Download files

Source Distribution

spalah-0.1.0.tar.gz (5.2 kB)

Built Distribution

spalah-0.1.0-py2.py3-none-any.whl (5.3 kB, Python 2 and Python 3)
