A library containing various utility functions for playing with PySpark DataFrames
Spark-frame
What is it?
Spark-frame is a library that provides several utility methods and transformation functions for PySpark DataFrames. These methods were initially part of the karadoc project used at Younited, but they don't rely on karadoc, so it makes more sense to keep them as a standalone library.
Several of these methods were my initial inspiration to make the cousin project bigquery-frame, which is why you will find similar methods in transformations and data_diff for both spark_frame and bigquery_frame, except the former runs on PySpark while the latter runs on BigQuery (obviously).
Installation
spark-frame is available on PyPI.
pip install spark-frame
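After running the command above, one way to confirm that the package landed in the current environment is to query its metadata with the standard library. This is only a minimal sketch: the helper name installed_version is hypothetical and is not part of spark-frame.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed.

    Hypothetical helper for illustration; not part of spark-frame.
    """
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# "spark-frame" is the distribution name used in the pip command above.
version = installed_version("spark-frame")
if version is None:
    print("spark-frame is not installed in this environment")
else:
    print(f"spark-frame {version} is installed")
```

The lookup uses the distribution name (spark-frame, with a hyphen), not the import name (spark_frame).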
Release notes
v0.0.3
- New transformation: spark_frame.transformations.convert_all_maps_to_arrays.
- New transformation: spark_frame.transformations.sort_all_arrays.
- New transformation: spark_frame.transformations.harmonize_dataframes.
Hashes for spark_frame-0.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 141072984d3d85871aa6fa0790530b495ec0c4c4b9dd6e0f6caa0e735593fff6
MD5 | 250b67b0ed88a8fe64a90630cfbb4754
BLAKE2b-256 | a46db54bd0cff04089b2243e0f315546980eb7c61752a836d86f79d6a1f1035c
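The SHA256 digest listed for spark_frame-0.0.3-py3-none-any.whl can be compared against a locally downloaded copy of the wheel. The following is a minimal sketch using only the standard library. The helper names sha256_of_file and matches_published_digest are hypothetical, and the file path in the usage comment is an assumption about where the wheel was saved.

```python
import hashlib

# SHA256 digest published for spark_frame-0.0.3-py3-none-any.whl
EXPECTED_SHA256 = "141072984d3d85871aa6fa0790530b495ec0c4c4b9dd6e0f6caa0e735593fff6"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()


# Usage (assumes the wheel was saved in the current directory):
# matches_published_digest("spark_frame-0.0.3-py3-none-any.whl")
```

pip can perform the same comparison at install time when a requirements file pins the digest and the install is run with --require-hashes.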