A data preprocessing library
Data Preprocessing Library - Aryan Sakhala
This library provides a set of functions for preprocessing data in pandas DataFrames.
Installation
You can install this package using pip:
pip install pyProcessAutom
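The name passed to pip (pyProcessAutom) differs from the package name imported in the usage example (auto_preprocess). A small check like the following can confirm that the import name resolves after installation. This is only a sketch: it assumes the import name shown in the usage example is correct, and is_importable is a hypothetical helper, not part of the library.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when no top-level module of that name is installed.
    return importlib.util.find_spec(module_name) is not None


# Assumption: "auto_preprocess" is the import name, as in the usage example.
print(is_importable("auto_preprocess"))
```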
Usage
To use this library, import the DataPreprocessor class from the auto_preprocess.data_preprocess module and instantiate it with a pandas DataFrame. You can then call the methods of the DataPreprocessor instance to preprocess the data; the processed DataFrame is available through its df attribute.
Here's an example of how to use this library:
import pandas as pd
from auto_preprocess.data_preprocess import DataPreprocessor
# Load data into a pandas DataFrame
df = pd.read_csv("my_data.csv")
# Preprocess the data using the DataPreprocessor class
preprocessor = DataPreprocessor(df)
preprocessor.remove_outliers()
preprocessor.scale(scaler_type='standard')
preprocessor.label_encode()
preprocessor.impute(method='mean')
preprocessor.drop()
preprocessed_df = preprocessor.df
# Use the preprocessed data as needed
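To make the NaN-threshold rules listed under Functions concrete (drop columns with more than 30% NaN values, impute columns with less than 10%), here is a minimal sketch in plain pandas. It does not use or reproduce the library's own code: the helper names drop_sparse_columns and impute_light_columns, their parameters, and the demo data are all hypothetical, and the library may handle edge cases differently.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, threshold: float = 0.30) -> pd.DataFrame:
    """Return a copy without columns whose NaN fraction exceeds `threshold`."""
    nan_fraction = df.isna().mean()  # per-column share of missing values
    return df.loc[:, nan_fraction <= threshold].copy()


def impute_light_columns(
    df: pd.DataFrame, threshold: float = 0.10, method: str = "mean"
) -> pd.DataFrame:
    """Fill NaNs in columns whose NaN fraction is above 0 and below `threshold`."""
    if method not in {"mean", "median", "mode"}:
        raise ValueError(f"unsupported method: {method!r}")
    out = df.copy()
    nan_fraction = out.isna().mean()
    for col in out.columns:
        if not 0 < nan_fraction[col] < threshold:
            continue  # nothing to fill, or too many gaps to impute
        if method == "mode":
            fill_value = out[col].mode().iloc[0]
        elif pd.api.types.is_numeric_dtype(out[col]):
            fill_value = out[col].mean() if method == "mean" else out[col].median()
        else:
            continue  # mean and median are undefined for non-numeric columns
        out[col] = out[col].fillna(fill_value)
    return out


# Hypothetical demo data: 20 rows with 5%, 50%, and 20% missing values.
demo = pd.DataFrame({
    "a": [1.0] * 19 + [np.nan],       # 5% NaN: imputed
    "b": [np.nan] * 10 + [2.0] * 10,  # 50% NaN: dropped
    "c": [3.0] * 16 + [np.nan] * 4,   # 20% NaN: kept, left untouched
})
cleaned = impute_light_columns(drop_sparse_columns(demo))
print(cleaned.isna().sum())
```

Because the two thresholds do not overlap, this sketch gives the same result whether dropping or imputing runs first. Columns with between 10% and 30% NaN values are left as they are.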
Functions
This library provides the following functions for preprocessing data:
- remove_outliers(): Removes outliers from all numeric columns in the DataFrame.
- scale(scaler_type): Scales all numeric columns in the DataFrame using either a standard scaler or a min-max scaler.
- label_encode(): Encodes all columns with binary categories using label encoding.
- impute(method): Fills the NaN values in every column where less than 10% of the values are NaN, using the mean, median, or mode, as specified by the method argument.
- drop(): Drops all columns with more than 30% NaN values from the DataFrame.
License
This project is licensed under the MIT License - see the LICENSE.txt file for details.
Download files
File details
Details for the file pyProcessAutom-1.4.5.tar.gz.
File metadata
- Download URL: pyProcessAutom-1.4.5.tar.gz
- Upload date:
- Size: 2.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | be76d938634180a9024aa3076f31c30f2144384054d5b7bd369e3b2cb2bdeb8c |
| MD5 | bfc954e07dc62ad6eaff097849025b02 |
| BLAKE2b-256 | 06fcff34766d4d1a0143c0a7e493ad8b972965aa1f5eb9212b176ed0922b4631 |
File details
Details for the file pyProcessAutom-1.4.5-py3-none-any.whl.
File metadata
- Download URL: pyProcessAutom-1.4.5-py3-none-any.whl
- Upload date:
- Size: 2.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 146febe64ff309a73754fd0707e4bca766fbb5d048be40a5485da72c4e49c7da |
| MD5 | bd72abceb8be0ef88ad8d758f38ba500 |
| BLAKE2b-256 | 0fdbdf55787bf26ef41546e00c208923a6d7f237b62df13739fa60b27a60db07 |