Cleans raw data, runs baseline models
Project description
pywedge
Cleans the raw dataframe so that it can be fed into ML models. The following data pre-processing steps are carried out:
- segregating numeric & categorical columns
- missing values imputation for numeric & categorical columns
- standardization
- feature importance
- SMOTE
- baseline model
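The first three steps above can be sketched with plain pandas. This is a minimal illustration and not pywedge's actual implementation: the helper name `clean_frame`, the median/mode imputation strategies, and the sample data are all assumptions. Feature importance and SMOTE are left out to keep the sketch short.

```python
import numpy as np
import pandas as pd


def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pass; not pywedge's actual implementation."""
    out = df.copy()

    # Segregate numeric and categorical columns by dtype.
    num_cols = out.select_dtypes(include="number").columns
    cat_cols = out.columns.difference(num_cols)

    # Impute numeric gaps with the column median (an assumed strategy).
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())

    # Impute categorical gaps with the most frequent value (an assumed strategy).
    for col in cat_cols:
        out[col] = out[col].fillna(out[col].mode().iloc[0])

    # Standardize numeric columns to zero mean and unit variance.
    std = out[num_cols].std(ddof=0).replace(0, 1.0)  # avoid division by zero
    out[num_cols] = (out[num_cols] - out[num_cols].mean()) / std
    return out


# Hypothetical sample data with gaps in both column types.
raw = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [50.0, 60.0, np.nan, 80.0],
    "city": ["A", "B", None, "B"],
})
cleaned = clean_frame(raw)
print(cleaned)
```

In practice the imputation and scaling statistics would be computed on the training data only and then reused on the test data, so that both frames are transformed identically.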
Pre_process_data() Inputs:
- train = train dataframe
- test = hold-out test dataframe (without target column)
- c = a redundant column to be removed (such as an ID column). At present only a single column can be removed; a subsequent version will support removing multiple columns.
- y = target column name as a string
- type = Classification / Regression
Returns:
- new_X (cleaned feature columns in dataframe)
- new_y (cleaned target column in dataframe)
- new_test (cleaned hold-out test dataset)
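To make the input/output contract concrete, here is a rough sketch of how `train`, `test`, `c` and `y` could map to the three returned objects. The function `split_inputs` is hypothetical and is not part of pywedge; it only shows the column handling, without any of the cleaning steps, and it returns the target as a pandas Series for brevity.

```python
import pandas as pd


def split_inputs(train, test, c, y):
    """Hypothetical helper mirroring the documented inputs; not pywedge code."""
    # Drop the single redundant column (e.g. an ID) from both frames.
    train = train.drop(columns=[c])
    test = test.drop(columns=[c])

    # Separate the target from the training features.
    new_y = train[y]
    new_X = train.drop(columns=[y])

    # Keep the hold-out frame's columns in the same order as the features.
    new_test = test[new_X.columns]
    return new_X, new_y, new_test


# Hypothetical data: "id" is the redundant column, "label" is the target.
train = pd.DataFrame({"id": [1, 2, 3], "f1": [0.1, 0.2, 0.3], "label": [0, 1, 0]})
test = pd.DataFrame({"id": [4, 5], "f1": [0.4, 0.5]})
new_X, new_y, new_test = split_inputs(train, test, c="id", y="label")
print(new_X.shape, new_y.shape, new_test.shape)
```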
baseline_model()
- For classification - classification_summary()
- For Regression - Regression_summary()
Inputs:
- new_X
- new_y
Returns: Various baseline model metrics
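As an illustration of what a classification baseline summary might report, the sketch below compares a few untuned scikit-learn models by cross-validated accuracy. The choice of models, the accuracy metric, the helper name `baseline_scores`, and the synthetic data are assumptions made for this example; they are not taken from pywedge.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def baseline_scores(X, y, cv=5):
    """Illustrative baseline comparison; the model list is an assumption."""
    models = {
        "dummy (most frequent)": DummyClassifier(strategy="most_frequent"),
        "logistic regression": LogisticRegression(max_iter=1000),
        "decision tree": DecisionTreeClassifier(random_state=0),
    }
    # Mean cross-validated accuracy for each untuned model.
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic stand-in for the cleaned new_X / new_y.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
scores = baseline_scores(X, y)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Including a dummy classifier gives a floor to compare against: a model that does not clearly beat it has learned little from the features.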
THIS IS A BETA VERSION
Download files
Source Distribution
pywedge-0.1.tar.gz (5.5 kB)
Built Distribution
pywedge-0.1-py3-none-any.whl (6.6 kB)