A modular preprocessing package for Pandas dataframes

🥧ReciPies🐍

The ReciPies package is a preprocessing framework that operates on Polars and Pandas dataframes; the user chooses the backend. Its design is inspired by the R package recipes. The package lets the user apply a number of extensible operations for imputation, feature generation/extraction, scaling, and encoding. It operates on modified dataframe objects from the established data science packages Pandas and Polars.

Installation

You can install ReciPies from PyPI using pip:

pip install recipies

Note that the package name on PyPI is recipies.

To make sure you have the latest version, you can install ReciPies from source. From the root of a local copy of the repository, run:

conda env update -f environment.yml
conda activate ReciPies
pip install -e .

Note that the last command installs the package under the name recipies.
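
After either installation route, one quick way to confirm that the distribution is visible to the current interpreter is to query its metadata. The sketch below uses only the standard library's importlib.metadata; the helper name installed_version is illustrative and is not part of ReciPies.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "recipies" is the distribution name used by the pip command above.
print(installed_version("recipies"))
```

If this prints None, the package is most likely installed into a different environment than the one running the interpreter.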

Usage

To define preprocessing operations, you assign roles to the columns of the dataframe. Roles let you create groups of columns that serve a particular function. The package then provides several "steps" that can be applied to a dataset, including:

  • Historical accumulation
  • Resampling the time resolution
  • A number of imputation methods
  • A wrapper for any Scikit-learn preprocessing step

We believe these cover the basic preprocessing needs for prepared datasets. Any missing step can be added by following the step interface.
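
To make the idea of roles and steps concrete, here is a minimal sketch written with plain Pandas and Scikit-learn. It does not use the ReciPies API: the ROLES dictionary, the step functions, the apply_steps helper, and the column names are all hypothetical and only illustrate the concept of grouping columns by role and applying steps to those groups.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical role assignment (role name -> columns); not the ReciPies API.
ROLES = {
    "group": ["stay_id"],
    "sequence": ["time"],
    "predictor": ["hr", "temp"],
    "outcome": ["label"],
}


def step_impute_ffill(df, roles):
    """Forward-fill missing predictor values within each group."""
    cols = roles["predictor"]
    out = df.sort_values(roles["group"] + roles["sequence"]).copy()
    out[cols] = out.groupby(roles["group"])[cols].ffill()
    return out


def step_sklearn(df, roles, transformer):
    """Fit a Scikit-learn transformer on the predictor columns and apply it."""
    cols = roles["predictor"]
    out = df.copy()
    out[cols] = transformer.fit_transform(out[cols])
    return out


def apply_steps(df, roles, steps):
    """Apply each step in order; every step receives the same role mapping."""
    for step in steps:
        df = step(df, roles)
    return df


# Made-up example data with gaps in the predictor columns.
raw = pd.DataFrame(
    {
        "stay_id": [1, 1, 1, 2, 2, 2],
        "time": [0, 1, 2, 0, 1, 2],
        "hr": [80.0, None, 90.0, 70.0, None, None],
        "temp": [36.5, 37.0, None, None, 38.0, 38.5],
        "label": [0, 0, 0, 1, 1, 1],
    }
)

steps = [
    step_impute_ffill,
    lambda df, roles: step_sklearn(df, roles, StandardScaler()),
]
prepared = apply_steps(raw, ROLES, steps)
print(prepared)
```

Because each step looks up its columns through the role mapping rather than by name, the same list of steps can be reused on another dataset by changing only the role assignment. Forward-filling within a group cannot fill a value that is missing at the start of that group, so a further imputation step would be needed for those cases.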

📄Paper

If you use this code in your research, please cite the following publication (a standalone paper is in preparation):

@inproceedings{vandewaterYetAnotherICUBenchmark2024,
  title = {Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML},
  shorttitle = {Yet Another ICU Benchmark},
  booktitle = {The Twelfth International Conference on Learning Representations},
  author = {van de Water, Robin and Schmidt, Hendrik Nils Aurel and Elbers, Paul and Thoral, Patrick and Arnrich, Bert and Rockenschaub, Patrick},
  year = {2024},
  month = oct,
  urldate = {2024-02-19},
  langid = {english},
}

This paper can also be found on arXiv: https://arxiv.org/pdf/2306.05109.pdf
