With df-and-order, your interactions with dataframes become clean and predictable.

Project description

🗄️ df-and-order

Yeah, it's just like Law & Order, but Dataframe & Order!

pip install df_and_order

  • Tired of absolute file paths in shared notebooks in your repository?
  • Can't remember how your dataframes were generated?
  • Want reproducible data transformations?
  • Like declarative config-based solutions?

Good news for you!

Imagine a world where reading a dataframe takes just a few lines:

reader = MagicDfReader()
df = reader.read(df_id='user_activity_may_2020')
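
One way such an identifier-based read could work is sketched below. This is not df-and-order's actual implementation: the class name `SketchDfReader`, the `root_dir` parameter, and the `<root>/<df_id>/<df_id>.csv` directory layout are all assumptions made for illustration. The point is only that notebooks refer to a dataframe by its id, while the path is resolved in one place.

```python
import tempfile
from pathlib import Path

import pandas as pd


class SketchDfReader:
    """Hypothetical reader: maps a dataframe id to a file under one root directory."""

    def __init__(self, root_dir):
        self.root_dir = Path(root_dir)

    def path_for(self, df_id):
        # Assumed layout (not taken from the library): <root>/<df_id>/<df_id>.csv
        return self.root_dir / df_id / f"{df_id}.csv"

    def read(self, df_id):
        path = self.path_for(df_id)
        if not path.exists():
            raise FileNotFoundError(f"No dataframe with id {df_id!r} at {path}")
        return pd.read_csv(path)


# Demo with a throwaway directory standing in for a shared data folder.
with tempfile.TemporaryDirectory() as root:
    df_dir = Path(root) / "user_activity_may_2020"
    df_dir.mkdir()
    pd.DataFrame({"user_id": [1, 2], "clicks": [10, 3]}).to_csv(
        df_dir / "user_activity_may_2020.csv", index=False
    )

    reader = SketchDfReader(root_dir=root)
    df = reader.read(df_id="user_activity_may_2020")

print(df)
```

Because only `root_dir` depends on the machine, the notebook code itself stays free of absolute paths.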

Maybe you are interested in some transformed version of that dataframe? No problem!

reader = MagicDfReader()
# ready to fit a model on!
model_input_df = reader.read(df_id='user_activity_may_2020', transform_id='model_input')

This is possible thanks to a config file that looks like this:

df_id: user_activity_may_2020 # here's the dataframe identifier
initial_df_format: csv
metadata: # some useful information about the dataset
  author: Data Man
  data_collection_date: 2020-05-01
transformed_df_format: csv
transforms:
  model_input: # here's the transform identifier
    in_memory: # these transformations run in memory on every call; permanent transforms are supported as well
    - module_path: df_and_order.steps.DropColsTransformStep # file with the transformation's code
      params: # init params for the transformation
        cols:
        - redundant_col
    - module_path: df_and_order.steps.DatesTransformStep # another transformation
      params:
        cols:
        - date_col

Just by looking at the config, you can tell how the transformed dataframe was created.
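
To make the idea concrete, here is a minimal sketch of how a list of `module_path` / `params` entries could be turned into step objects and applied in order. It is an illustration under stated assumptions, not the library's code: `DropColsStep` and `DatesStep` are stand-ins written for this example, the `transform` method name is assumed, and the fake `sketch_steps` module exists only so the dotted paths can be resolved without installing anything.

```python
import importlib
import sys
import types

import pandas as pd


class DropColsStep:
    """Stand-in step (not the library's class): drops the listed columns."""

    def __init__(self, cols):
        self.cols = cols

    def transform(self, df):
        return df.drop(columns=self.cols)


class DatesStep:
    """Stand-in step (not the library's class): parses the listed columns as dates."""

    def __init__(self, cols):
        self.cols = cols

    def transform(self, df):
        df = df.copy()  # leave the caller's dataframe untouched
        for col in self.cols:
            df[col] = pd.to_datetime(df[col])
        return df


# Register the stand-ins under a fake module so they can be found by dotted path.
sketch_steps = types.ModuleType("sketch_steps")
sketch_steps.DropColsStep = DropColsStep
sketch_steps.DatesStep = DatesStep
sys.modules["sketch_steps"] = sketch_steps


def load_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object."""
    module_name, _, class_name = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def apply_steps(df, step_configs):
    """Instantiate each configured step with its params and apply them in order."""
    for cfg in step_configs:
        step = load_class(cfg["module_path"])(**cfg.get("params", {}))
        df = step.transform(df)
    return df


# Same shape as the `in_memory` list in the config, written as an already-parsed dict.
in_memory_steps = [
    {"module_path": "sketch_steps.DropColsStep", "params": {"cols": ["redundant_col"]}},
    {"module_path": "sketch_steps.DatesStep", "params": {"cols": ["date_col"]}},
]

raw_df = pd.DataFrame(
    {"date_col": ["2020-05-01", "2020-05-02"], "redundant_col": [0, 0], "clicks": [10, 3]}
)
model_input_df = apply_steps(raw_df, in_memory_steps)
print(model_input_df.dtypes)
```

Keeping each step as a small class with declared init params is what lets the config double as a record of how the dataframe was produced.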

Take a look at the more detailed overview to find more exciting stuff.

I also wrote an article describing the benefits. Check it out! There are lemurs and stuff.

I hope the library helps someone boost their productivity.

Unit-tested, btw!

Requirements

pandas
Python 3.7

Source Distribution

df-and-order-0.2.3.tar.gz (41.2 kB)

Uploaded Source

Built Distribution

df_and_order-0.2.3-py3-none-any.whl (16.7 kB)

Uploaded Python 3
