A Python toolkit for preprocessing datasets.

PrepDataKit

PrepDataKit is a Python package for preprocessing datasets. It provides functions for reading data from several file formats, summarizing datasets, handling missing values, and encoding categorical data.

Installation

You can install PrepDataKit using pip:

pip install prepdatakit
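
To confirm that the install succeeded, one option is to ask Python's standard library for the installed version. This is a minimal sketch: the helper name `installed_version` is made up for this example and is not part of PrepDataKit. It relies only on `importlib.metadata`, available in Python 3.8 and later.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("prepdatakit"))
```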

Usage

Here's an example of how to use PrepDataKit:

import prepdatakit

# Read a CSV file
data = prepdatakit.read_file('data.csv')

# Get summary statistics
summary = prepdatakit.get_summary(data)
print(summary)

# Handle missing values
clean_data = prepdatakit.handle_missing_values(data, strategy='remove')

# Encode categorical data
encoded_data = prepdatakit.one_hot_encode(clean_data, columns=['category'])
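
To make the three preprocessing steps concrete, here is a dependency-free sketch of what "summarize", "remove missing values", and "one-hot encode" mean on a tiny table. It is not PrepDataKit's implementation and does not call its API. The sample data, the helper names (`read_rows`, `count_missing`, `drop_incomplete`, `one_hot`), and the choice to treat an empty cell as missing are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for 'data.csv'.
SAMPLE_CSV = """id,category,value
1,a,10
2,b,
3,a,30
4,,40
5,b,50
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def count_missing(rows):
    """Count empty cells per column (assumption: empty string means missing)."""
    counts = {}
    for row in rows:
        for column, cell in row.items():
            counts[column] = counts.get(column, 0) + (cell == "")
    return counts


def drop_incomplete(rows):
    """Keep only rows in which every cell is non-empty."""
    return [row for row in rows if all(cell != "" for cell in row.values())]


def one_hot(rows, column):
    """Replace `column` with one 0/1 indicator column per distinct value."""
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels:
            new_row[f"{column}_{level}"] = int(row[column] == level)
        encoded.append(new_row)
    return encoded


rows = read_rows(SAMPLE_CSV)
missing = count_missing(rows)        # per-column count of empty cells
clean = drop_incomplete(rows)        # rows 2 and 4 are dropped
encoded = one_hot(clean, "category")
print(missing)
print(encoded)
```

Dropping incomplete rows is the simplest strategy but discards data; whether PrepDataKit offers alternatives such as imputation is not stated here.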

Download files

Source Distribution

prepdatakit-1.3.2.tar.gz (1.6 kB)

Built Distribution

prepdatakit-1.3.2-py3-none-any.whl (1.6 kB, Python 3)
