
dataflat

A library to flatten all those annoying nested keys and columns in dictionaries, pandas DataFrames and Spark (pyspark) DataFrames.

Installation

pip install dataflat

Get started

How to instantiate a Flattener:

First, import the FlattenerOptions, CaseTranslatorOptions and Flattener classes:

from dataflat.flattener_handler import FlattenerOptions, CaseTranslatorOptions, Flattener

The next step is to assign the variables required for the flattening process.

  • reference_name: The key name assigned to each dictionary or DataFrame returned by the transform function.
  • id_key: The identifier used when the data contains nested lists. Each nested list is exploded into a separate dictionary or DataFrame, and the exploded result keeps a reference to its parent dictionary or DataFrame.
  • black_list: A list of keys or columns that will be skipped during the flattening process.
  • replace_dots: Whether the dots (used as the hierarchical separator) are replaced with underscores.
  • from_case, to_case: The current case of the keys or columns (snake, camel...) and the desired output case for each key.
# Default values:
#   from_case = CaseTranslatorOptions.CAMEL_CASE
#   to_case = CaseTranslatorOptions.SNAKE_CASE
#   replace_dots = False
#   chunk size = 500
reference_name = 'data'
id_key = 'id'
black_list = ['keys','or','columns','to','skip']
replace_dots = False
from_case = CaseTranslatorOptions.CAMEL_CASE
to_case = CaseTranslatorOptions.SNAKE_CASE
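
To make these options concrete, here is a minimal pure-Python sketch of what flattening a nested dictionary could look like. It does not use dataflat and is not its implementation: the helper names camel_to_snake and flatten_dict are hypothetical, and the sketch assumes that black-listed keys are dropped from the output, whereas the library may instead leave them unflattened.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase name to snake_case (hypothetical helper)."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def flatten_dict(data, black_list=(), replace_dots=False, parent=""):
    """Flatten nested dictionaries into one level of hierarchical keys.

    Illustrative sketch only, not dataflat's implementation.
    """
    sep = "_" if replace_dots else "."
    flat = {}
    for key, value in data.items():
        if key in black_list:
            continue  # assumption: skipped keys are left out of the result
        name = camel_to_snake(key)
        full_key = f"{parent}{sep}{name}" if parent else name
        if isinstance(value, dict):
            # Recurse, carrying the path built so far as the prefix.
            flat.update(flatten_dict(value, black_list, replace_dots, full_key))
        else:
            flat[full_key] = value
    return flat


record = {
    "id": 1,
    "userInfo": {"firstName": "Ada", "homeAddress": {"zipCode": "N1"}},
    "debug": True,
}
flat = flatten_dict(record, black_list=["debug"])
flat_underscored = flatten_dict(record, black_list=["debug"], replace_dots=True)
```

With dots kept, the nested zip code ends up under the key user_info.home_address.zip_code; with replace_dots=True the same value is stored under user_info_home_address_zip_code.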

Next, select the flattener that matches the current type of your data (dict, pandas.DataFrame, spark.DataFrame):

# Pick ONE of the following, according to the type of your data:
custom_flattener = FlattenerOptions.DICTIONARY
# custom_flattener = FlattenerOptions.PANDAS_DF
# custom_flattener = FlattenerOptions.SPARK_DF

Finally, instantiate the flattener class and apply the transform function.

flattener = Flattener().handler(custom_flattener, from_case, to_case, replace_dots)
flattener.transform(data, id_key, black_list, reference_name)
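
The role of id_key and reference_name is easiest to see with a nested list. The sketch below, again plain Python rather than dataflat's own code, splits a record into a parent row and a child table whose rows carry a reference to the parent's id. The function name explode_lists, the naming of the child table and the name of the parent reference column are all assumptions made for illustration; the library's actual output layout may differ.

```python
def explode_lists(record, id_key, reference_name):
    """Split a record into a parent row plus one child table per nested list.

    Illustrative sketch only; the naming conventions are assumptions.
    """
    tables = {reference_name: []}
    parent_row = {}
    for key, value in record.items():
        if isinstance(value, list):
            child_name = f"{reference_name}.{key}"
            parent_ref = f"{reference_name}.{id_key}"
            rows = tables.setdefault(child_name, [])
            for item in value:
                # Wrap scalars so that every child row is a dictionary.
                row = dict(item) if isinstance(item, dict) else {key: item}
                row[parent_ref] = record[id_key]  # link back to the parent
                rows.append(row)
        else:
            parent_row[key] = value
    tables[reference_name].append(parent_row)
    return tables


order = {
    "id": 7,
    "customer": "Ada",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
}
tables = explode_lists(order, id_key="id", reference_name="data")
```

In this sketch tables["data"] holds the parent row, and tables["data.items"] holds one row per list element, each tagged with the parent's id.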
