A data preprocessing library

Data Preprocessing Library - Aryan Sakhala

This library provides a set of functions for preprocessing data in pandas DataFrames.

Installation

You can install this package using pip:

pip install pyProcessAutom
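
The name passed to pip (pyProcessAutom) differs from the package name imported in the usage example (auto_preprocess). As a minimal sketch, assuming that import path is accurate, the standard library can confirm that the package is visible to the current interpreter after installation. The helper name is_importable is hypothetical and not part of this library.

```python
from importlib.util import find_spec


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module or package can be located."""
    # find_spec returns None when a top-level name cannot be found.
    return find_spec(module_name) is not None


# "auto_preprocess" is the import name used in the usage example;
# whether it resolves depends on the installed package layout.
print(is_importable("auto_preprocess"))
```

If this prints False after a successful pip install, the import path in your installed version may differ from the one shown in the example.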

Usage

To use this library, import the DataPreprocessor class from the auto_preprocess.data_preprocess module and instantiate it with a pandas DataFrame. You can then call the methods of the DataPreprocessor instance to preprocess the data; the result is available through its df attribute.

Here's an example of how to use this library:

import pandas as pd
from auto_preprocess.data_preprocess import DataPreprocessor

# Load data into a pandas DataFrame
df = pd.read_csv("my_data.csv")

# Preprocess the data using the DataPreprocessor class
preprocessor = DataPreprocessor(df)
preprocessor.remove_outliers()
preprocessor.scale(scaler_type='standard')
preprocessor.label_encode()
preprocessor.impute(method='mean')
preprocessor.drop()
preprocessed_df = preprocessor.df

# Use the preprocessed data as needed
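
For readers who want to see what these steps amount to, here is a rough sketch in plain pandas. It is not the library's implementation. It follows the documented behaviour (drop columns with more than 30% NaN values, fill columns with less than 10% NaN values, scale numeric columns, label-encode binary columns), and everything else is an assumption: the function names, the "minmax" option name, the choice to skip non-numeric columns for mean and median, and the alphabetical 0/1 assignment. Outlier removal is left out because the rule the library uses is not stated.

```python
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, max_nan_frac: float = 0.30) -> pd.DataFrame:
    """Drop columns whose fraction of NaN values exceeds max_nan_frac."""
    nan_frac = df.isna().mean()
    return df.loc[:, nan_frac <= max_nan_frac]


def impute_low_nan(df: pd.DataFrame, method: str = "mean",
                   max_nan_frac: float = 0.10) -> pd.DataFrame:
    """Fill NaN values in columns with some, but less than max_nan_frac, NaN."""
    if method not in ("mean", "median", "mode"):
        raise ValueError(f"unknown method: {method!r}")
    out = df.copy()
    nan_frac = out.isna().mean()
    for col in out.columns:
        if not 0 < nan_frac[col] < max_nan_frac:
            continue
        if method == "mode":
            fill = out[col].mode().iloc[0]
        elif pd.api.types.is_numeric_dtype(out[col]):
            fill = out[col].mean() if method == "mean" else out[col].median()
        else:
            continue  # assumption: mean/median are skipped for non-numeric columns
        out[col] = out[col].fillna(fill)
    return out


def scale_numeric(df: pd.DataFrame, scaler_type: str = "standard") -> pd.DataFrame:
    """Scale numeric columns; constant columns become NaN (division by zero)."""
    out = df.copy()
    num = out.select_dtypes(include="number").columns
    if scaler_type == "standard":
        # Zero mean, unit variance (population standard deviation).
        out[num] = (out[num] - out[num].mean()) / out[num].std(ddof=0)
    elif scaler_type == "minmax":  # hypothetical option name
        out[num] = (out[num] - out[num].min()) / (out[num].max() - out[num].min())
    else:
        raise ValueError(f"unknown scaler_type: {scaler_type!r}")
    return out


def encode_binary(df: pd.DataFrame) -> pd.DataFrame:
    """Map non-numeric columns with exactly two distinct values to 0 and 1."""
    out = df.copy()
    for col in out.select_dtypes(exclude="number").columns:
        values = sorted(out[col].dropna().unique())
        if len(values) == 2:
            out[col] = out[col].map({values[0]: 0, values[1]: 1})
    return out
```

In this sketch the sparse columns are dropped and the remaining gaps are filled before scaling, so that the column statistics used for scaling are computed on the data that is actually kept.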

Functions

This library provides the following functions for preprocessing data:

  • remove_outliers(): Removes outliers from all numeric columns in the DataFrame.
  • scale(scaler_type): Scales all numeric columns in the DataFrame using either a standard scaler or a min-max scaler.
  • label_encode(): Encodes all columns with binary categories using label encoding.
  • impute(method): Fills the NaN values in all columns that have less than 10% NaN values, using the mean, median, or mode, as specified by the user.
  • drop(): Drops all columns with more than 30% NaN values from the DataFrame.

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.

Download files

Source Distribution

pyProcessAutom-1.4.6.tar.gz (2.8 kB)

Uploaded Source

Built Distribution

pyProcessAutom-1.4.6-py3-none-any.whl (3.0 kB)

Uploaded Python 3

File details

Details for the file pyProcessAutom-1.4.6.tar.gz.

File metadata

  • Download URL: pyProcessAutom-1.4.6.tar.gz
  • Size: 2.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.4

File hashes

Hashes for pyProcessAutom-1.4.6.tar.gz
Algorithm     Hash digest
SHA256        df7bd29d7146237e4c716b426b253c13ea2c25b06852b403bf30d74e765246f8
MD5           e6f7f8913dd9fb059781a08754736b53
BLAKE2b-256   f39e04d64abd3d23ab300470835a62bac5c2fdd7f8b98080c0ab7446a3272874

File details

Details for the file pyProcessAutom-1.4.6-py3-none-any.whl.

File metadata

  • Download URL: pyProcessAutom-1.4.6-py3-none-any.whl
  • Size: 3.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.4

File hashes

Hashes for pyProcessAutom-1.4.6-py3-none-any.whl
Algorithm     Hash digest
SHA256        15c28baa9968bd4010ddcf08e1412d5ce17cc824103a1477b6779a11653d7751
MD5           83529aea4d71fe565e8057fda8c3e6a2
BLAKE2b-256   09b1d7f59cb7c0e370b931c7a5be928aa779c1230fa576f7df5dfef4e6ba2216
