A package for handling various data preprocessing tasks
Project description
Cleaner Panda
Programming For Data Engineering course final project
https://pypi.org/project/cleaner-panda/
Installation
pip install cleaner-panda
Modules
Missing Value Handler
strategy enum {MEAN, MEDIAN, CONSTANT, REMOVE_ROW, REMOVE_COLUMN, FORWARD_BACKWARD}
const_int = 0, const_str = "none", const_date = 01.01.2024…
replace_missing_values(dataFrame, strategy="strategy.MEAN", column=0)
replace_mean(dataframe, column)
replace_median(dataframe, column)
replace_constant(dataframe, column, constant)
replace_remove_row(dataframe, column)
replace_remove_column(dataframe, column)
replace_forward_backward(dataframe, column)
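The strategies listed above can be sketched with plain pandas. This is a minimal illustration and not the package's own implementation: the `Strategy` class, the `fill_missing` helper, and the sample data are hypothetical names introduced here, and the exact behaviour of each strategy in cleaner-panda (for example, whether REMOVE_COLUMN drops the column unconditionally) is an assumption.

```python
from enum import Enum

import pandas as pd


class Strategy(Enum):
    # Hypothetical stand-in for the package's strategy enum
    MEAN = "mean"
    MEDIAN = "median"
    CONSTANT = "constant"
    REMOVE_ROW = "remove_row"
    REMOVE_COLUMN = "remove_column"
    FORWARD_BACKWARD = "forward_backward"


def fill_missing(df, column, strategy=Strategy.MEAN, constant=0):
    """Return a copy of df with missing values in `column` handled."""
    out = df.copy()
    if strategy is Strategy.MEAN:
        out[column] = out[column].fillna(out[column].mean())
    elif strategy is Strategy.MEDIAN:
        out[column] = out[column].fillna(out[column].median())
    elif strategy is Strategy.CONSTANT:
        out[column] = out[column].fillna(constant)
    elif strategy is Strategy.REMOVE_ROW:
        # Drop only the rows where this column is missing
        out = out.dropna(subset=[column])
    elif strategy is Strategy.REMOVE_COLUMN:
        # Assumption: the whole column is dropped
        out = out.drop(columns=[column])
    elif strategy is Strategy.FORWARD_BACKWARD:
        # Forward fill first, then backward fill any leading gaps
        out[column] = out[column].ffill().bfill()
    return out


df = pd.DataFrame({"age": [20.0, None, 40.0], "city": ["Oslo", "Rome", None]})
filled = fill_missing(df, "age", Strategy.MEAN)
```

MEAN and MEDIAN only make sense for numeric columns; for text or date columns a constant or forward/backward fill is the usual choice.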
Outlier Handler
identify_outliers_iqr(data, threshold=1.5)
handle_outliers_iqr(data, threshold=1.5, replacement=None)
// replacement: value used to replace outliers (e.g., the median or mean); None removes the outliers instead
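As a rough illustration of the IQR rule behind these two functions, the sketch below flags values that fall more than `threshold` times the interquartile range outside the first and third quartiles. The helper names `iqr_outlier_mask` and `handle_outliers` are hypothetical and operate on a single pandas Series; the package's functions may accept other inputs or behave differently.

```python
import pandas as pd


def iqr_outlier_mask(series, threshold=1.5):
    """Return a boolean mask that is True where a value is an IQR outlier."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - threshold * iqr
    upper = q3 + threshold * iqr
    return (series < lower) | (series > upper)


def handle_outliers(series, threshold=1.5, replacement=None):
    """Remove outliers when replacement is None, otherwise replace them."""
    mask = iqr_outlier_mask(series, threshold)
    if replacement is None:
        return series[~mask]
    return series.mask(mask, replacement)


values = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 100.0])
without_outliers = handle_outliers(values)
with_median = handle_outliers(values, replacement=values.median())
```

A larger threshold flags fewer points; 1.5 is the conventional default, matching the signature above.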
Scaler
standardize_data(dataframe)
normalize_data(dataframe)
robust_scale_data(dataframe)
normalize_vectors(dataframe)
log_transform_data(dataframe)
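The scaling operations above correspond to well-known formulas, sketched here with pandas and NumPy. The function names are hypothetical, and the mapping is an assumption: `normalize_data` is taken to mean min-max scaling, `normalize_vectors` to mean scaling each row to unit L2 length, and the log transform to use log(1 + x). The package may define them differently.

```python
import numpy as np
import pandas as pd


def standardize(df):
    # Zero mean and unit (population) standard deviation per column
    return (df - df.mean()) / df.std(ddof=0)


def min_max_normalize(df):
    # Rescale each column to the range [0, 1]
    return (df - df.min()) / (df.max() - df.min())


def robust_scale(df):
    # Centre on the median and divide by the interquartile range
    iqr = df.quantile(0.75) - df.quantile(0.25)
    return (df - df.median()) / iqr


def normalize_rows(df):
    # Scale each row so that its L2 norm equals 1
    norms = np.sqrt((df ** 2).sum(axis=1))
    return df.div(norms, axis=0)


def log_transform(df):
    # log(1 + x), which stays defined at x = 0
    return np.log1p(df)


df = pd.DataFrame({"a": [3.0, 0.0, 6.0], "b": [4.0, 5.0, 8.0]})
scaled = min_max_normalize(df)
```

None of these sketches guard against constant columns or all-zero rows, where the denominator is zero.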
Text Cleaner
remove_common_words(dataframe, column)
// Stopwords are words like "the", "is", "and", "in", etc., that occur frequently in a language
convert_to_lowercase(dataframe, column)
remove_punctuation(dataframe, column)
lemmatization(dataframe, column)
expand_contractions(dataframe, column)
// (e.g., "can't" to "cannot", "won't" to "will not")
remove_special_characters(dataframe, column, remove=['.'])
remove_numerical(dataframe, column)
filter_words(dataframe, column, remove=["fuck"])
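Several of the text-cleaning steps above can be chained on a DataFrame column, as in the sketch below. It uses only the standard library and pandas; the `STOP_WORDS` set and `CONTRACTIONS` mapping are tiny illustrative samples, and `clean_text` is a hypothetical helper. Lemmatization is left out because it needs an external NLP library.

```python
import re
import string

import pandas as pd

STOP_WORDS = {"the", "is", "and", "in"}  # tiny illustrative list
CONTRACTIONS = {"can't": "cannot", "won't": "will not"}  # illustrative only


def expand_contractions(text):
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    return text


def clean_text(text):
    text = text.lower()                      # convert to lowercase
    text = expand_contractions(text)         # "can't" -> "cannot"
    text = re.sub(r"\d+", "", text)          # remove numerical characters
    text = text.translate(str.maketrans("", "", string.punctuation))
    words = [w for w in text.split() if w not in STOP_WORDS]
    return " ".join(words)                   # stop words removed


df = pd.DataFrame(
    {"review": ["The cat can't sit in 2 boxes!", "Rain is wet, and cold."]}
)
df["clean"] = df["review"].apply(clean_text)
```

Order matters here: contractions are expanded before punctuation is stripped, otherwise the apostrophe would be lost and "can't" would become "cant".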
Data Type Converter
Categorical Encoder
label_encoding(dataframe, column)
one_hot_encoding(dataframe, column)
ordinal_encoding(dataframe, column)
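The three encodings differ in what the resulting numbers mean, which the pandas sketch below illustrates. It is not the package's implementation; the column name, the sample data, and the category order are assumptions chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({"size": ["small", "large", "medium", "small"]})

# Label encoding: each distinct value gets an arbitrary integer code
# (pandas assigns codes in sorted order of the categories).
df["size_label"] = df["size"].astype("category").cat.codes

# Ordinal encoding: codes follow an explicit, meaningful order.
order = ["small", "medium", "large"]
df["size_ordinal"] = pd.Categorical(
    df["size"], categories=order, ordered=True
).codes

# One-hot encoding: one indicator column per distinct value.
one_hot = pd.get_dummies(df["size"], prefix="size")
```

Label codes carry no ranking, so ordinal encoding is usually preferred when the categories have a natural order, and one-hot encoding when they do not.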
Date Time Handler
convert_date_to_strings(dataframe, column)
extract_components(dataframe, column)
reformat_date(dataframe, column)
calculate_datetime_differences()
convert_datetime_to_different_timezones()
shift_time()
handle_irregular_time_intervals()
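Since the signatures above are mostly unspecified, the sketch below shows how the same kinds of operations look in plain pandas. Column names, the date format, the time zones, and the daily resampling frequency are all assumptions made for the example, not the package's behaviour.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "start": ["2024-01-01 08:00", "2024-01-03 12:30"],
        "end": ["2024-01-02 08:00", "2024-01-03 18:30"],
    }
)
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])

# Extract components
df["year"] = df["start"].dt.year
df["weekday"] = df["start"].dt.day_name()

# Reformat dates as strings
df["start_str"] = df["start"].dt.strftime("%d.%m.%Y")

# Differences between two datetime columns, in hours
df["duration_hours"] = (df["end"] - df["start"]).dt.total_seconds() / 3600

# Time zone conversion: treat naive times as UTC, then convert
start_utc = df["start"].dt.tz_localize("UTC")
df["start_berlin"] = start_utc.dt.tz_convert("Europe/Berlin")

# Shift times by a fixed offset
df["start_plus_1d"] = df["start"] + pd.Timedelta(days=1)

# Irregular intervals: resample to a daily grid and interpolate the gaps
readings = pd.Series(
    [1.0, 4.0], index=pd.to_datetime(["2024-01-01", "2024-01-04"])
)
regular = readings.resample("D").mean().interpolate()
```

Linear interpolation is only one way to fill a regularised series; forward filling or leaving the gaps as missing may suit other data better.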
Download files
Source Distribution
cleaner_panda-0.1.9.tar.gz (23.3 kB)
File details
Details for the file cleaner_panda-0.1.9.tar.gz.
File metadata
- Download URL: cleaner_panda-0.1.9.tar.gz
- Upload date:
- Size: 23.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | e91c983d28911ce8848a55b83d6da29d0d562112c74de0be84f6e92f351fe831
MD5 | 9b0653182a824de04f4e55746d4ca184
BLAKE2b-256 | e304f8d5055c822c4bc906e7a9128ba74babd471870b14b3bae34730d38d6b7f