A package for handling various data preprocessing tasks
Project description
Cleaner Panda
Programming For Data Engineering course final project
https://pypi.org/project/cleaner-panda/
Installation
pip install cleaner-panda
Modules
Missing Value Handler
strategy enum {MEAN, MEDIAN, CONSTANT, REMOVE_ROW, REMOVE_COLUMN, FORWARD_BACKWARD}
cont_int = 0, const_str = "none", const_date = 01.01.2024…
replace_missing_values(dataFrame, strategy="strategy.MEAN", column=0)
replace_mean(dataframe, column)
replace_median(dataframe, column)
replace_constant(dataframe, column, constant)
replace_remove_row(dataframe, column)
replace_remove_column(dataframe, column)
replace_forward_backward(dataframe, column)
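The strategies named above can be illustrated without the package. The following is a minimal plain-Python sketch, not the package's implementation: the helper names (`fill_mean`, `fill_median`, `fill_forward_backward`) are hypothetical, missing values are represented as `None`, and it assumes FORWARD_BACKWARD means a forward fill followed by a backward fill for any leading gaps.

```python
from statistics import mean, median


def fill_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def fill_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def fill_forward_backward(values):
    """Forward-fill gaps, then backward-fill gaps left at the start."""
    filled = list(values)
    last_seen = None
    for i, v in enumerate(filled):  # forward pass
        if v is None:
            filled[i] = last_seen
        else:
            last_seen = v
    next_seen = None
    for i in range(len(filled) - 1, -1, -1):  # backward pass
        if filled[i] is None:
            filled[i] = next_seen
        else:
            next_seen = filled[i]
    return filled


ages = [20, None, 30, None, 40]
print(fill_mean(ages))                                # [20, 30, 30, 30, 40]
print(fill_forward_backward([None, 1, None, 3, None]))  # [1, 1, 1, 3, 3]
```

Mean and median filling keep every row but shrink the column's variance. The REMOVE_ROW and REMOVE_COLUMN strategies trade that distortion for lost data.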
Outlier Handler
identify_outliers_iqr(data, threshold=1.5)
handle_outliers_iqr(data, threshold=1.5, replacement=None)
// replacement: value to replace outliers with (e.g., median, mean), or None to remove outliers
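For readers unfamiliar with the IQR rule, here is a minimal sketch of the technique using only the standard library. It is an illustration under stated assumptions, not the package's code: the function names are hypothetical, quartiles use linear interpolation (`method="inclusive"`), and a value counts as an outlier when it lies more than `threshold` times the IQR below Q1 or above Q3.

```python
from statistics import quantiles


def iqr_bounds(values, threshold=1.5):
    """Return the (low, high) fences of the IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - threshold * iqr, q3 + threshold * iqr


def find_outliers(values, threshold=1.5):
    """List the values that fall outside the fences."""
    low, high = iqr_bounds(values, threshold)
    return [v for v in values if v < low or v > high]


def handle_outliers(values, threshold=1.5, replacement=None):
    """Drop outliers when replacement is None, otherwise substitute it."""
    low, high = iqr_bounds(values, threshold)
    if replacement is None:
        return [v for v in values if low <= v <= high]
    return [v if low <= v <= high else replacement for v in values]


readings = [10, 12, 11, 13, 12, 100]
print(find_outliers(readings))                    # [100]
print(handle_outliers(readings))                  # [10, 12, 11, 13, 12]
print(handle_outliers(readings, replacement=12))  # [10, 12, 11, 13, 12, 12]
```

A larger `threshold` widens the fences and flags fewer points. 1.5 is the conventional default.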
Scaler
standardize_data(dataframe)
normalize_data(dataframe)
robust_scale_data(dataframe)
normalize_vectors(dataframe)
log_transform_data(dataframe)
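The arithmetic behind three of these scalers can be sketched in plain Python. This is only an illustration of the usual definitions, not the package's implementation: the function names are hypothetical, standardization is assumed to use the population standard deviation, normalization is assumed to mean min-max scaling to [0, 1], and robust scaling is assumed to center on the median and divide by the IQR.

```python
from statistics import mean, median, pstdev, quantiles


def standardize(values):
    """Z-score: subtract the mean, divide by the population std deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


data = [1, 2, 3, 4, 5]
print(min_max(data))       # [0.0, 0.25, 0.5, 0.75, 1.0]
print(robust_scale(data))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Robust scaling is the usual choice when a column still contains outliers, because the median and IQR barely move when a few extreme values are present.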
Text Cleaner
remove_common_words(dataframe, column)
// Stopwords are words like "the", "is", "and", "in", etc., that occur frequently in a language
convert_to_lowercase(dataframe, column)
remove_punctuation(dataframe, column)
lemmatization(dataframe, column)
expand_contractions(dataframe, column)
// (e.g., "can't" to "cannot", "won't" to "will not")
remove_special_characters(dataframe, column, remove=['.'])
remove_numerical(dataframe, column)
filter_words(dataframe, column, remove=["fuck"])
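The order in which text-cleaning steps run matters: contractions have to be expanded before punctuation removal strips their apostrophes. The sketch below shows one possible ordering on a single string, using only the standard library. The `clean_text` helper, the tiny stopword set, and the contraction table are illustrative assumptions, not the package's own lists or API.

```python
import string

# Illustrative subsets only; real stopword and contraction lists are much longer.
STOPWORDS = {"the", "is", "and", "in"}
CONTRACTIONS = {"can't": "cannot", "won't": "will not"}


def clean_text(text):
    """Lowercase, expand contractions, strip punctuation, drop stopwords."""
    text = text.lower()
    # Expand contractions first, while the apostrophes are still present.
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Delete every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Drop stopwords and collapse whitespace.
    words = [w for w in text.split() if w not in STOPWORDS]
    return " ".join(words)


print(clean_text("The cat can't sit in the box, and won't try!"))
# cat cannot sit box will not try
```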
Data Type Converter
Categorical Encoder
label_encoding(dataframe, column)
one_hot_encoding(dataframe, column)
ordinal_encoding(dataframe, column)
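The three encodings differ in what they preserve. A minimal plain-Python sketch, with hypothetical helper names, makes the difference visible. It assumes label encoding assigns integers in sorted category order, and that ordinal encoding needs an explicit ranking (the `order` argument here is an assumption and is not part of the signatures listed above).

```python
def label_encode(values):
    """Map each category to an integer, in sorted category order."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def one_hot_encode(values):
    """Return one dict per value with a 0/1 flag for every category."""
    categories = sorted(set(values))
    return [{cat: int(v == cat) for cat in categories} for v in values]


def ordinal_encode(values, order):
    """Map each category to its position in a caller-supplied ranking."""
    rank = {cat: i for i, cat in enumerate(order)}
    return [rank[v] for v in values]


sizes = ["small", "large", "medium", "small"]
print(label_encode(sizes))                                  # [2, 0, 1, 2]
print(ordinal_encode(sizes, ["small", "medium", "large"]))  # [0, 2, 1, 0]
print(one_hot_encode(sizes)[0])  # {'large': 0, 'medium': 0, 'small': 1}
```

Label codes carry no meaning beyond identity. Ordinal codes encode a ranking, and one-hot columns avoid implying any order at the cost of one column per category.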
Date Time Handler
convert_date_to_strings(dataframe, column)
extract_components(dataframe, column)
reformat_date(dataframe, column)
calculate_datetime_differences()
convert_datetime_to_different_timezones()
shift_time()
handle_irregular_time_intervals()
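Several of these operations map directly onto Python's `datetime` module. The sketch below walks through them on a single value and is not the package's implementation. It assumes input dates use the day.month.year form seen in the constant example (01.01.2024), and it uses a fixed UTC+3 offset as a stand-in for a named timezone.

```python
from datetime import datetime, timedelta, timezone

raw = "01.01.2024"  # assumed day.month.year format

# Parse the string into a datetime object.
parsed = datetime.strptime(raw, "%d.%m.%Y")

# Extract components.
components = {"year": parsed.year, "month": parsed.month, "day": parsed.day}

# Reformat as an ISO 8601 date string.
iso = parsed.strftime("%Y-%m-%d")

# Shift the timestamp by one week and compute the difference.
shifted = parsed + timedelta(days=7)
difference = shifted - parsed

# Attach UTC, then convert to a fixed UTC+3 offset (illustrative zone).
as_utc = parsed.replace(tzinfo=timezone.utc)
as_plus3 = as_utc.astimezone(timezone(timedelta(hours=3)))

print(components)            # {'year': 2024, 'month': 1, 'day': 1}
print(iso)                   # 2024-01-01
print(difference.days)       # 7
print(as_plus3.isoformat())  # 2024-01-01T03:00:00+03:00
```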
Download files
Source Distribution
cleaner_panda-0.1.9.tar.gz (23.3 kB)
File details
Details for the file cleaner_panda-0.1.9.tar.gz.
File metadata
- Download URL: cleaner_panda-0.1.9.tar.gz
- Upload date:
- Size: 23.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.9.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e91c983d28911ce8848a55b83d6da29d0d562112c74de0be84f6e92f351fe831 |
| MD5 | 9b0653182a824de04f4e55746d4ca184 |
| BLAKE2b-256 | e304f8d5055c822c4bc906e7a9128ba74babd471870b14b3bae34730d38d6b7f |