With df-and-order, your interactions with dataframes become clean and predictable.
🗄️ df-and-order
Yeah, it's just like Law & Order, but Dataframe & Order!
pip install df_and_order
- Tired of absolute file paths in shared notebooks in your repository?
- Can't remember how your dataframes were generated?
- Want reproducible data transformations?
- Like declarative config-based solutions?
Good news for you!
Imagine a world where reading a dataframe takes just a few lines:
reader = MagicDfReader()
df = reader.read(df_id='user_activity_may_2020')
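To illustrate how reading by identifier can replace absolute file paths, here is a minimal sketch. It is not df-and-order's implementation: the function `resolve_df_path`, the `DATA_ROOT` constant, and the `<root>/<df_id>/<df_id>.<format>` layout are all assumptions made for this example.

```python
from pathlib import Path

# Hypothetical root folder that holds all datasets, relative to the repository.
DATA_ROOT = Path("data")


def resolve_df_path(root: Path, df_id: str, fmt: str = "csv") -> Path:
    """Map a dataframe identifier to a file path under a shared root.

    Assumed layout (for illustration only): <root>/<df_id>/<df_id>.<fmt>
    """
    return root / df_id / f"{df_id}.{fmt}"


path = resolve_df_path(DATA_ROOT, "user_activity_may_2020")
print(path)
```

Because the path is derived from the identifier and a single root, a notebook only needs the identifier, and the root can differ on each machine.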
Maybe you are interested in some transformed version of that dataframe? No problem!
reader = MagicDfReader()
# ready to fit a model on!
model_input_df = reader.read(df_id='user_activity_may_2020', transform_id='model_input')
This is possible thanks to a config file that looks like this:
df_id: user_activity_may_2020  # here's the dataframe identifier
initial_df_format: csv
metadata:  # some useful information about the dataset
  author: Data Man
  data_collection_date: 2020-05-01
transformed_df_format: csv
transforms:
  model_input:  # here's the transform identifier
    in_memory:  # transformations run in memory on every call; permanent transforms are supported as well
    - module_path: df_and_order.steps.DropColsTransformStep  # file with the transformation's code
      params:  # init params for the transformation
        cols:
        - redundant_col
    - module_path: df_and_order.steps.DatesTransformStep  # another transformation
      params:
        cols:
        - date_col
Just by looking at the config, you can tell how the transformed dataframe was created.
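The `module_path` and `params` entries suggest that each step is located by its dotted path and constructed with the listed init params. The sketch below shows one way such a lookup could work using only the standard library. The helpers `load_class` and `build_steps` are hypothetical names, not part of df-and-order, and stdlib classes stand in for real transform steps so the example runs without the library installed.

```python
import importlib
from typing import Any, Dict, List


def load_class(module_path: str) -> type:
    """Resolve a dotted path like 'package.module.ClassName' to the class object."""
    module_name, _, class_name = module_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def build_steps(step_configs: List[Dict[str, Any]]) -> List[Any]:
    """Instantiate every configured step with its init params, in config order."""
    return [
        load_class(cfg["module_path"])(**cfg.get("params", {}))
        for cfg in step_configs
    ]


# Stand-in config: same shape as the 'in_memory' list in the config file,
# but pointing at stdlib classes instead of real transform steps.
step_configs = [
    {"module_path": "datetime.timedelta", "params": {"days": 1}},
    {"module_path": "fractions.Fraction", "params": {"numerator": 1, "denominator": 3}},
]

steps = build_steps(step_configs)
print([type(step).__name__ for step in steps])
```

Keeping the class path and its params in the config is what makes the pipeline readable without opening any code: the order of the list is the order of the transformations.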
Take a look at the more detailed overview to find more exciting stuff.
I also wrote an article describing the benefits. Check it out! There are lemurs and stuff.
I hope the lib helps somebody boost their productivity.
Unit-tested, btw!
Requirements
pandas
python 3.7
Hashes for df_and_order-0.2.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 7e6974a7a006ab0bb94acc086d091f7f81c8e72175c8c521de316bac3543f690
MD5 | 12ee76a6df0d6f2557b89c149c785568
BLAKE2b-256 | d7810bd06537250778e96260b39b3e88bbc777cf5ad7c3e5e215a8482a615663