
A library of helper functions for data science preprocessing and exploratory data analysis.

Project description

jan883-eda

A collection of utility functions for data analysis, preprocessing, model evaluation, and clustering in Python. Designed to streamline the workflow of data scientists and machine learning practitioners.

Installation

Install the package via pip:

pip install jan883-eda

Usage

Below are examples demonstrating how to use some of the key functions in the package. These examples assume you have a DataFrame (your_dataframe) or feature matrix (X) and target vector (y) ready.
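If you do not have data at hand, the following sketch builds a small stand-in for your_dataframe, X, and y. The column names (age, income, city, churned) and the random values are purely hypothetical and are not part of the package.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
rng = np.random.default_rng(seed=0)
your_dataframe = pd.DataFrame(
    {
        "age": rng.integers(18, 70, size=100),
        "income": rng.normal(50_000, 12_000, size=100).round(2),
        "city": rng.choice(["Berlin", "Paris", "Rome"], size=100),
        "churned": rng.integers(0, 2, size=100),
    }
)

# Numeric feature matrix and target vector derived from the DataFrame.
X = your_dataframe[["age", "income"]]
y = your_dataframe["churned"]

print(X.shape, y.shape)
```

The categorical city column is left out of X here because most scikit-learn estimators expect numeric input; it would need to be encoded first.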

Exploratory Data Analysis (EDA)

  • Inspect DataFrame:
from jan883_eda import inspect_df

inspect_df(your_dataframe)

This displays the head, shape, description, NaN values, and duplicates of the DataFrame.
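For orientation, the same kinds of facts can be gathered with plain pandas. This is a rough sketch, not the package's implementation: inspect_sketch is a hypothetical helper, and the actual output format of inspect_df may differ.

```python
import pandas as pd


def inspect_sketch(df: pd.DataFrame) -> dict:
    """Collect head, shape, description, NaN counts and duplicate count.

    Hypothetical helper for illustration; not part of jan883-eda.
    """
    return {
        "head": df.head(),
        "shape": df.shape,
        "description": df.describe(include="all"),
        "nan_counts": df.isna().sum(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Small example frame with one missing value and one duplicated row.
df = pd.DataFrame({"a": [1, 2, 2, None], "b": ["x", "y", "y", "z"]})
report = inspect_sketch(df)

print(report["shape"])
print(report["nan_counts"])
print(report["duplicate_rows"])
```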

  • Column Summary:
from jan883_eda import column_summary

summary = column_summary(your_dataframe)
print(summary)

Data Preprocessing

  • Update Column Names:
from jan883_eda import update_column_names

updated_df = update_column_names(your_dataframe)

  • Label Encoding:
from jan883_eda import label_encode_column

encoded_df = label_encode_column(your_dataframe, 'column_name')
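As background, label encoding replaces each distinct category with an integer code. The sketch below shows the general idea with plain pandas; it assumes alphabetical ordering of categories and is not necessarily how label_encode_column works internally. The city column is a hypothetical example.

```python
import pandas as pd

# Hypothetical example column.
df = pd.DataFrame({"city": ["Rome", "Berlin", "Paris", "Berlin"]})

# Convert to a categorical dtype; categories are sorted alphabetically.
categories = df["city"].astype("category")

# Replace the strings with their integer codes.
encoded = df.assign(city=categories.cat.codes)

# Keep the code-to-label mapping so the encoding can be reversed later.
mapping = dict(enumerate(categories.cat.categories))

print(encoded["city"].tolist())
print(mapping)
```

Integer codes imply an ordering that the categories may not have, so for nominal features with linear models a one-hot encoding is often the safer choice.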

Model Evaluation

  • Evaluate Classification Model:
from jan883_eda import evaluate_classification_model
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
evaluate_classification_model(model, X, y)

  • Test Multiple Regression Models:
from jan883_eda import best_regression_models

results = best_regression_models(X, y)
print(results)
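To make the idea of testing several regression models concrete, here is a minimal sketch using scikit-learn directly. The choice of models, the synthetic data, and the R² metric are all assumptions made for illustration; the models and metrics that best_regression_models actually uses are not documented here.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression data standing in for your own X and y.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Candidate models chosen for illustration only.
candidates = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated R^2 for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R^2 = {score:.3f}")
```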

Functions Overview

The package provides a variety of functions grouped by their purpose:

  • EDA Functions: inspect_df, column_summary, univariate_analysis, and more.
  • Data Preprocessing: update_column_names, label_encode_column, one_hot_encode_column, scale_X_train_X_test, and more.
  • Model Evaluation: evaluate_classification_model, evaluate_regression_model, best_classification_models, best_regression_models, and more.
  • Clustering Analysis: plot_elbow_method, plot_intercluster_distance, plot_silhouette_visualizer, and more.

For a complete list of functions and their detailed documentation, refer to the docstrings within the source code or the official documentation.
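As an example of what the clustering helpers address, the elbow method fits k-means for a range of cluster counts and looks for the point where the within-cluster sum of squares stops dropping sharply. The sketch below computes those values with scikit-learn on synthetic data; it is a general illustration and makes no claim about how plot_elbow_method is implemented.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with four well-separated groups (illustrative only).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Fit k-means for k = 1..8 and record the inertia
# (within-cluster sum of squared distances) for each k.
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting inertia against k and picking the bend in the curve gives a candidate number of clusters.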

Requirements

The following dependencies are required to use the package:

  • Python >= 3.6
  • pandas >= 1.0.0
  • numpy >= 1.18.0
  • matplotlib >= 3.0.0
  • seaborn >= 0.10.0
  • scikit-learn >= 0.22.0
  • yellowbrick >= 1.0.0
  • imbalanced-learn >= 0.7.0 (imported as imblearn)

pip installs these packages automatically alongside jan883-eda, provided they are declared as dependencies in the package's setup.py or pyproject.toml. Python itself must already be installed.
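To check which of these dependencies are present in the current environment, a short standard-library script can help. This sketch assumes Python 3.8 or later (for importlib.metadata) and uses distribution names as published on PyPI, so imblearn appears as imbalanced-learn; installed_versions is a hypothetical helper, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI.
REQUIRED = [
    "pandas",
    "numpy",
    "matplotlib",
    "seaborn",
    "scikit-learn",
    "yellowbrick",
    "imbalanced-learn",
]


def installed_versions(packages):
    """Return {name: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'not installed'}")
```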

License

This package is distributed under the MIT License. See the LICENSE file for more information.

Contact

For questions, bug reports, or contributions, please visit the GitHub repository or contact the author at email@example.com.




