A library of helper functions for data science preprocessing and exploratory data analysis.

Project description

jan883-eda

A collection of utility functions for data analysis, preprocessing, model evaluation, and clustering in Python. Designed to streamline the workflow of data scientists and machine learning practitioners.

Installation

Install the package via pip:

pip install jan883-eda

Usage

Below are examples demonstrating how to use some of the key functions in the package. These examples assume you have a DataFrame (your_dataframe) or feature matrix (X) and target vector (y) ready.
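If you do not have data at hand, the inputs assumed above can be mocked up quickly. The following is a minimal sketch using pandas only; the column names and values are hypothetical and exist purely so the later examples have something to run against.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
your_dataframe = pd.DataFrame(
    {
        "age": [25, 32, 47, 51, 38, 29],
        "income": [40000.0, 52000.0, 81000.0, 90000.0, 61000.0, 45000.0],
        "city": ["Berlin", "Paris", "Berlin", "Rome", "Paris", "Rome"],
        "purchased": [0, 0, 1, 1, 1, 0],
    }
)

# Feature matrix: every column except the target.
X = your_dataframe.drop(columns=["purchased"])

# Target vector: the column to predict.
y = your_dataframe["purchased"]
```

Note that X still contains a text column here, so it would need encoding before being passed to most scikit-learn estimators.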

Exploratory Data Analysis (EDA)

  • Inspect DataFrame:
from jan883_eda import inspect_df

inspect_df(your_dataframe)

This displays the head, shape, description, NaN values, and duplicates of the DataFrame.

  • Column Summary:
from jan883_eda import column_summary

summary = column_summary(your_dataframe)
print(summary)
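To make the checks reported by inspect_df concrete (head, shape, description, NaN values, duplicates), here is a rough equivalent written with plain pandas. This is a sketch only: quick_inspect is a hypothetical helper and is not how jan883_eda implements inspect_df, whose exact output format is not documented here.

```python
import pandas as pd


def quick_inspect(df: pd.DataFrame) -> dict:
    """Hypothetical helper (not part of jan883_eda) gathering basic checks."""
    return {
        "head": df.head(),                          # first five rows
        "shape": df.shape,                          # (rows, columns)
        "description": df.describe(include="all"),  # summary statistics
        "nan_counts": df.isna().sum(),              # missing values per column
        "duplicate_rows": int(df.duplicated().sum()),  # fully repeated rows
    }


# Small illustrative frame with one missing value and one repeated row.
sample = pd.DataFrame({"a": [1, 2, 2, None], "b": ["x", "y", "y", "z"]})
report = quick_inspect(sample)
print(report["shape"], report["duplicate_rows"])
```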

Data Preprocessing

  • Update Column Names:
from jan883_eda import update_column_names

updated_df = update_column_names(your_dataframe)

  • Label Encoding:
from jan883_eda import label_encode_column

encoded_df = label_encode_column(your_dataframe, 'column_name')
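For readers unfamiliar with label encoding, the idea can be sketched with scikit-learn's LabelEncoder, which maps each distinct category to an integer. Whether label_encode_column uses LabelEncoder internally, and whether it returns a copy or modifies the DataFrame in place, is an assumption; label_encode_sketch below is a hypothetical stand-in.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def label_encode_sketch(df: pd.DataFrame, column: str):
    """Hypothetical stand-in: return an encoded copy and the fitted encoder."""
    encoded = df.copy()  # leave the caller's DataFrame untouched
    encoder = LabelEncoder()
    encoded[column] = encoder.fit_transform(encoded[column])
    return encoded, encoder


colors = pd.DataFrame({"color": ["red", "green", "blue", "green"]})
encoded_colors, color_encoder = label_encode_sketch(colors, "color")

# LabelEncoder assigns integers in sorted order of the class labels.
print(list(color_encoder.classes_))
print(list(encoded_colors["color"]))
```

Keeping the fitted encoder around makes it possible to map the integers back with inverse_transform later.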

Model Evaluation

  • Evaluate Classification Model:
from jan883_eda import evaluate_classification_model
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
evaluate_classification_model(model, X, y)

  • Test Multiple Regression Models:
from jan883_eda import best_regression_models

results = best_regression_models(X, y)
print(results)
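The general idea of testing several regression models on the same data can be sketched with scikit-learn's cross_val_score. The candidate models, the R² metric, and the shape of the returned table are all assumptions made for illustration; compare_regressors is a hypothetical helper, and best_regression_models may choose differently.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor


def compare_regressors(X, y, cv=5):
    """Hypothetical helper: rank a few regressors by cross-validated R^2."""
    candidates = {
        "LinearRegression": LinearRegression(),
        "Ridge": Ridge(alpha=1.0),
        "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    }
    rows = []
    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
        rows.append(
            {"model": name, "mean_r2": scores.mean(), "std_r2": scores.std()}
        )
    table = pd.DataFrame(rows).sort_values("mean_r2", ascending=False)
    return table.reset_index(drop=True)


# Synthetic, nearly linear data so the example is self-contained.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

results_demo = compare_regressors(X_demo, y_demo)
print(results_demo)
```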

Functions Overview

The package provides a variety of functions grouped by their purpose:

  • EDA Functions: inspect_df, column_summary, univariate_analysis, and more.
  • Data Preprocessing: update_column_names, label_encode_column, one_hot_encode_column, scale_X_train_X_test, and more.
  • Model Evaluation: evaluate_classification_model, evaluate_regression_model, best_classification_models, best_regression_models, and more.
  • Clustering Analysis: plot_elbow_method, plot_intercluster_distance, plot_silhouette_visualizer, and more.

For a complete list of functions and their detailed documentation, refer to the docstrings within the source code or the official documentation.
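As an illustration of what the clustering helpers address, the elbow method can be sketched with scikit-learn alone: fit KMeans for a range of k and look for the point where inertia stops dropping sharply. This is not the package's plot_elbow_method, whose signature and plotting backend are not shown here; inertia_by_k is a hypothetical helper and the data is synthetic.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def inertia_by_k(X, k_values):
    """Hypothetical helper: within-cluster sum of squares for each k."""
    inertias = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=0)
        model.fit(X)
        inertias[k] = float(model.inertia_)
    return inertias


# Synthetic data with three well-separated groups.
X_blobs, _ = make_blobs(
    n_samples=300, centers=3, cluster_std=0.6, random_state=0
)
elbow = inertia_by_k(X_blobs, range(1, 7))

for k, inertia in elbow.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting these values against k would typically show the curve flattening around the true number of groups.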

Requirements

The following dependencies are required to use the package:

  • Python >= 3.6
  • pandas >= 1.0.0
  • numpy >= 1.18.0
  • matplotlib >= 3.0.0
  • seaborn >= 0.10.0
  • scikit-learn >= 0.22.0
  • yellowbrick >= 1.0.0
  • imbalanced-learn >= 0.7.0 (imported as imblearn)

pip installs these automatically, provided the package declares them in its setup.py or pyproject.toml file.
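To see which of these dependencies are already present in an environment, the standard library's importlib.metadata can report installed versions. The sketch below assumes Python 3.8 or newer (where importlib.metadata is available) and assumes the distribution names shown, in particular imbalanced-learn for the imblearn import; installed_versions is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI (assumed; "imbalanced-learn"
# is the distribution that provides the "imblearn" import).
REQUIRED = [
    "pandas",
    "numpy",
    "matplotlib",
    "seaborn",
    "scikit-learn",
    "yellowbrick",
    "imbalanced-learn",
]


def installed_versions(distributions):
    """Hypothetical helper: map each name to its version, or None if absent."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'not installed'}")
```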

License

This package is distributed under the MIT License. See the LICENSE file for more information.

Contact

For questions, bug reports, or contributions, please visit the GitHub repository or contact the author at email@example.com.



Source Distribution

jan883_eda-0.1.8.tar.gz (54.9 kB view details)

Uploaded Source

Built Distribution


jan883_eda-0.1.8-py3-none-any.whl (23.9 kB view details)

Uploaded Python 3

File details

Details for the file jan883_eda-0.1.8.tar.gz.

File metadata

  • Download URL: jan883_eda-0.1.8.tar.gz
  • Upload date:
  • Size: 54.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.13

File hashes

Hashes for jan883_eda-0.1.8.tar.gz
Algorithm    Hash digest
SHA256       7f22a7fdbd1cac53a94b4689c96c48654b7f10ec5c3c2f90ce5d71e2e5b8d142
MD5          d5efe80c9f378cfe234405cde0b8284c
BLAKE2b-256  636a1af5c5cc0b0f1d17794217f026464514df1f8cd15e3e188ff41d5a2925e6
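
A downloaded archive can be checked against the SHA256 digest listed above using Python's hashlib. This is a minimal sketch; the helper names are hypothetical, and it assumes the file was saved locally under its published name.

```python
import hashlib

# SHA256 digest listed for jan883_eda-0.1.8.tar.gz.
EXPECTED_SHA256 = (
    "7f22a7fdbd1cac53a94b4689c96c48654b7f10ec5c3c2f90ce5d71e2e5b8d142"
)


def sha256_of_file(path, chunk_size=65536):
    """Hypothetical helper: hash a file in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("jan883_eda-0.1.8.tar.gz") should return True for an intact download saved in the current directory. pip can perform a similar check itself when a requirements file pins hashes and --require-hashes is used.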


File details

Details for the file jan883_eda-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: jan883_eda-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 23.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.13

File hashes

Hashes for jan883_eda-0.1.8-py3-none-any.whl
Algorithm    Hash digest
SHA256       3c0b5cd8a5c0ea2e193a218166c52f207ea6ed4b506ab180edcb172ede3e4958
MD5          bb40a7617acd0983a7a02ecf260d9f13
BLAKE2b-256  5936f55608bd112d590f4a388d327215d853135538d2231543eda94f70183a56
