A Deep Learning Data Analysis Package

DataPrep and Visualization Toolkit

This Python package streamlines two tasks: preparing datasets for machine learning workflows and visualizing time-series data. It provides functions for splitting datasets, scaling features, and plotting feature trends, so that data is ready for modeling with less manual work. This is version 0.3.0.1 of the package; more features are planned for future updates.

Key Features

Exponential Weighted Mean Smoothing:

Smooths input features using an exponential weighted mean (EWM) to help reduce noise in the data before training.
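As a rough illustration of what EWM smoothing does, the sketch below calls pandas' `DataFrame.ewm(...).mean()` on a small made-up frame. The column name, the data, and `span=3` are assumptions chosen for the example; the smoothing parameters this package actually uses may differ.

```python
import pandas as pd

# Hypothetical noisy readings; the column name is illustrative only.
raw = pd.DataFrame({"feature_1": [1.0, 5.0, 2.0, 6.0, 3.0]})

# Exponentially weighted mean: recent samples weigh more than older ones.
# span=3 gives alpha = 2 / (span + 1) = 0.5; this value is an assumption,
# not a documented default of the package.
smoothed = raw.ewm(span=3, adjust=False).mean()

print(smoothed)
```

A larger `span` averages over more history and yields a smoother, slower-reacting curve.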

Train-Test Split with Optional Validation Split:

The data_prep() function handles the splitting of data into training, testing, and (optionally) validation sets, with a variety of user-defined parameters for customization.
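To make the idea of a train/test split with an optional validation set concrete, here is a minimal sketch built on scikit-learn's `train_test_split`. It is not the internals of data_prep(); the arrays, the ratios, and the two-step approach are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 samples, 2 features, balanced binary labels.
X = np.arange(40, dtype=float).reshape(20, 2)
y = np.array([0, 1] * 10)

# Step 1: hold out 30% of the rows as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Step 2 (optional): carve a validation set out of the remaining training rows.
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0, stratify=y_train
)

print(len(X_train), len(X_val), len(X_test))
```

Passing `stratify` keeps the class proportions roughly equal across the splits, which matters for small or imbalanced datasets.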

Scaling Options:

Choose between two widely used scaling methods, MinMaxScaler and StandardScaler, to normalize your data before it is passed to a machine learning model.
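The difference between the two scalers can be seen on a small example. The sketch below uses scikit-learn's `MinMaxScaler` and `StandardScaler` directly; the sample arrays are made up, and fitting on the training split only is a common convention rather than something this package is confirmed to do.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical training and test features.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[4.0, 40.0]])

# Fit each scaler on the training data only, then reuse it for the test data
# so that no information from the test set leaks into the scaling.
min_max = MinMaxScaler().fit(X_train)
standard = StandardScaler().fit(X_train)

X_train_mm = min_max.transform(X_train)    # each column mapped to [0, 1]
X_test_mm = min_max.transform(X_test)      # may fall outside [0, 1]
X_train_std = standard.transform(X_train)  # each column: mean 0, unit variance

print(X_train_mm)
print(X_test_mm)
print(X_train_std)
```

MinMaxScaler preserves the shape of the distribution within a fixed range, while StandardScaler centers each feature and is less sensitive to the exact minimum and maximum.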

Support for Oversampling (SMOTE):

The package offers optional oversampling using the SMOTE technique to handle imbalanced datasets effectively.
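SMOTE balances a dataset by creating synthetic minority-class samples, each one interpolated between a minority sample and one of its nearest minority neighbours. The NumPy sketch below is a simplified illustration of that idea only; it is not the imbalanced-learn implementation and not this package's code, and the function name `smote_like_oversample` and its parameters are hypothetical.

```python
import numpy as np

def smote_like_oversample(X_minority, n_new, k=2, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Hypothetical helper for illustration; requires k < len(X_minority).
    """
    rng = np.random.default_rng(seed)
    n = len(X_minority)
    # Pairwise Euclidean distances within the minority class.
    diffs = X_minority[:, None, :] - X_minority[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    synthetic = np.empty((n_new, X_minority.shape[1]))
    for i in range(n_new):
        base = rng.integers(n)                         # random minority sample
        neighbour = neighbours[base, rng.integers(k)]  # one of its k neighbours
        gap = rng.random()                             # position on the segment
        synthetic[i] = X_minority[base] + gap * (X_minority[neighbour] - X_minority[base])
    return synthetic

# Hypothetical minority-class samples at the corners of the unit square.
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_like_oversample(X_minority, n_new=6)
print(X_new)
```

Oversampling of this kind is normally applied to the training split only, so that the test set still reflects the original class balance.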

Dataset Visualization:

The dataset_visualize() function allows you to easily visualize time-series data for selected features, providing insights into trends and patterns in the dataset.
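For a sense of the kind of plot involved, the sketch below draws one time-series panel per selected feature using pandas and matplotlib. It does not reproduce dataset_visualize(); the frame, the dates, and the panel layout are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical time-indexed data with two features.
frame = pd.DataFrame(
    {"feature_1": [1.0, 2.0, 1.5, 3.0, 2.5],
     "feature_2": [10.0, 9.0, 11.0, 8.0, 12.0]},
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)
feature_list = ["feature_1", "feature_2"]

# One panel per feature, sharing the time axis so trends line up.
fig, axes = plt.subplots(nrows=len(feature_list), sharex=True)
for ax, name in zip(axes, feature_list):
    ax.plot(frame.index, frame[name])
    ax.set_ylabel(name)
axes[-1].set_xlabel("time")
```

In an interactive session, `plt.show()` displays the figure; `fig.savefig(...)` writes it to a file instead.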

Installation

You can install the package using pip:

pip install dl-data-analysis

Data Preparation

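import pandas as pd
from your_package_name import data_prep  # placeholder: use the package's actual import name

# my_data is assumed to be a pandas DataFrame of features and labels the
# matching targets; both must be loaded before this call.

# Example usage
X_train, X_test, y_train, y_test = data_prep(
    x_dataframe=my_data,
    y_data=labels,
    test_ratio=0.3,          # 30% of the rows go to the test set
    validation=True,         # request a validation split; the returned values may differ in this case
    scaler_type="min_max",   # MinMaxScaler
    oversample=True          # apply SMOTE oversampling
)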

Visualization

from your_package_name import dataset_visualize  # placeholder: use the package's actual import name

# Example visualization
dataset_visualize(
    pd_dataframe=my_data, 
    feature_list=['feature_1', 'feature_2'], 
    Name='Sensor', 
    list=[1, 2, 3]
)

Planned Updates

This is an early release of the package. We plan to introduce additional features in the future, including:

  • More scaling and normalization techniques.
  • Advanced data preprocessing capabilities.
  • Enhanced visualization functions.
  • Support for more types of datasets and tasks.

Stay tuned for more!

Contributing

Contributions are welcome! If you have any ideas or would like to contribute to the project, please open an issue or submit a pull request.

License

This project is licensed under the MIT License.
