A library of helper functions for data science preprocessing and exploratory data analysis.

Project description

jan883-eda

A collection of utility functions for data analysis, preprocessing, model evaluation, and clustering in Python. Designed to streamline the workflow of data scientists and machine learning practitioners.

Installation

Install the package via pip:

pip install jan883-eda

Usage

Below are examples demonstrating how to use some of the key functions in the package. These examples assume you have a DataFrame (your_dataframe) or feature matrix (X) and target vector (y) ready.
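
If you do not have data at hand, a small stand-in can be built with pandas alone. This is only a sketch: the column names (age, income, purchased) and the values are hypothetical and are not part of the package.

```python
import pandas as pd

# Hypothetical toy data, used only to make the examples below runnable.
your_dataframe = pd.DataFrame(
    {
        "age": [25, 32, 47, 51, 38, 29],
        "income": [38000, 52000, 61000, 58000, 45000, 41000],
        "purchased": [0, 1, 1, 1, 0, 0],
    }
)

# Feature matrix: every column except the target.
X = your_dataframe.drop(columns="purchased")
# Target vector: the column to predict.
y = your_dataframe["purchased"]

print(X.shape, y.shape)
```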

Exploratory Data Analysis (EDA)

  • Inspect DataFrame:
# Hyphens are not valid in Python module names; the import name is assumed to be jan883_eda.
from jan883_eda import inspect_df

inspect_df(your_dataframe)

This displays the head, shape, description, NaN values, and duplicates of the DataFrame.
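
To make those checks concrete, here is a rough equivalent written with plain pandas. It is not the package's implementation; the function name inspect_df_sketch and the returned dictionary are assumptions made for illustration.

```python
import pandas as pd


def inspect_df_sketch(df: pd.DataFrame) -> dict:
    """Print a quick overview of df and return the key counts."""
    print(df.head())
    print(df.describe())
    overview = {
        "shape": df.shape,
        "nan_per_column": df.isna().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }
    print(overview)
    return overview


# Toy frame with one missing value and one duplicated row.
demo = pd.DataFrame({"a": [1, 2, 2, None], "b": ["x", "y", "y", "z"]})
overview = inspect_df_sketch(demo)
```

Note that duplicated() counts only the repeats, so a row that appears twice contributes one duplicate.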

  • Column Summary:
from jan883_eda import column_summary

summary = column_summary(your_dataframe)
print(summary)

Data Preprocessing

  • Update Column Names:
from jan883_eda import update_column_names

updated_df = update_column_names(your_dataframe)

  • Label Encoding:
from jan883_eda import label_encode_column

encoded_df = label_encode_column(your_dataframe, 'column_name')
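
Label encoding replaces each distinct category with an integer. The sketch below shows the idea with pandas category codes; it is an illustration under that assumption, not the package's own code, and label_encode_sketch is a hypothetical name.

```python
import pandas as pd


def label_encode_sketch(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a copy of df with `column` replaced by integer category codes."""
    encoded = df.copy()
    # Categories are sorted, so codes follow the alphabetical order of the labels.
    encoded[column] = encoded[column].astype("category").cat.codes
    return encoded


demo = pd.DataFrame({"color": ["red", "green", "blue", "green"]})
encoded_demo = label_encode_sketch(demo, "color")
print(encoded_demo["color"].tolist())  # [2, 1, 0, 1]
```

With this approach missing values are encoded as -1, so it is worth handling NaNs before encoding.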

Model Evaluation

  • Evaluate Classification Model:
from jan883_eda import evaluate_classification_model
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
evaluate_classification_model(model, X, y)
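
For orientation, a classification evaluation of this kind can be sketched with scikit-learn directly. The split ratio, metrics, and synthetic data below are assumptions; the package function may report different metrics or use a different validation scheme.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Synthetic data stands in for your own X and y.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold out a test set so the score reflects unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.3f}")
print(classification_report(y_test, y_pred))
```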

  • Test Multiple Regression Models:
from jan883_eda import best_regression_models

results = best_regression_models(X, y)
print(results)
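
Comparing several regressors usually comes down to scoring each one with the same cross-validation folds and ranking the results. The following is a minimal sketch of that pattern with scikit-learn; the choice of models, the R² metric, and the result columns are assumptions, not the package's documented behaviour.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic data stands in for your own X and y.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

candidates = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
}

rows = []
for name, model in candidates.items():
    # 5-fold cross-validated R^2 for each candidate.
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    rows.append({"model": name, "mean_r2": scores.mean(), "std_r2": scores.std()})

# Best model first.
results = pd.DataFrame(rows).sort_values("mean_r2", ascending=False).reset_index(drop=True)
print(results)
```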

Functions Overview

The package provides a variety of functions grouped by their purpose:

  • EDA Functions: inspect_df, column_summary, univariate_analysis, and more.
  • Data Preprocessing: update_column_names, label_encode_column, one_hot_encode_column, scale_X_train_X_test, and more.
  • Model Evaluation: evaluate_classification_model, evaluate_regression_model, best_classification_models, best_regression_models, and more.
  • Clustering Analysis: plot_elbow_method, plot_intercluster_distance, plot_silhouette_visualizer, and more.

For a complete list of functions and their detailed documentation, refer to the docstrings within the source code or the official documentation.
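
As background for the clustering helpers, the elbow method fits k-means for a range of cluster counts and looks for the point where the inertia (within-cluster sum of squares) stops dropping sharply. The sketch below computes those values with scikit-learn on synthetic blobs; it illustrates the technique only and makes no claim about how plot_elbow_method is implemented.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with four well-separated clusters.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=0)

ks = list(range(1, 9))
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

# The "elbow" is the k after which the inertia decreases only slowly.
for k, inertia in zip(ks, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting ks against inertias (for example with matplotlib) gives the usual elbow chart.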

Requirements

The following dependencies are required to use the package:

  • Python >= 3.6
  • pandas >= 1.0.0
  • numpy >= 1.18.0
  • matplotlib >= 3.0.0
  • seaborn >= 0.10.0
  • scikit-learn >= 0.22.0
  • yellowbrick >= 1.0.0
  • imbalanced-learn (imported as imblearn) >= 0.7.0

pip installs these automatically with the package, provided they are declared in its setup.py or pyproject.toml file.
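
To see which of these dependencies are present in the current environment, the standard library's importlib.metadata (Python 3.8+) can be queried by distribution name. This is a small sketch; the helper name installed_versions is hypothetical. Note that the distribution names for scikit-learn and imbalanced-learn differ from their import names (sklearn, imblearn).

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI.
REQUIRED = [
    "pandas",
    "numpy",
    "matplotlib",
    "seaborn",
    "scikit-learn",
    "yellowbrick",
    "imbalanced-learn",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'not installed'}")
```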

License

This package is distributed under the MIT License. See the LICENSE file for more information.

Contact

For questions, bug reports, or contributions, please visit the GitHub repository or contact the author at email@example.com.


Download files

Source Distribution

jan883_eda-0.1.7.tar.gz (54.9 kB)

Uploaded Source

Built Distribution

jan883_eda-0.1.7-py3-none-any.whl (23.9 kB)

Uploaded Python 3

File details

Details for the file jan883_eda-0.1.7.tar.gz.

File metadata

  • Download URL: jan883_eda-0.1.7.tar.gz
  • Upload date:
  • Size: 54.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.13

File hashes

Hashes for jan883_eda-0.1.7.tar.gz
Algorithm Hash digest
SHA256 78b9e45bc15ae43b050bae33d2e2fece285941cc8c7ca11f31d500042b0e0c14
MD5 020444266e42634927a32c5429aa2d4a
BLAKE2b-256 9551faaeb0e7fd201f35f6c7adc651b6985ea050ffcf75dda8b8ce878bd8a6ee
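
A downloaded archive can be checked against the SHA256 digest listed above using Python's hashlib. This is a minimal sketch: the helper name sha256_of is hypothetical, and the file is assumed to sit in the current working directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published for the source distribution.
EXPECTED = "78b9e45bc15ae43b050bae33d2e2fece285941cc8c7ca11f31d500042b0e0c14"
archive = Path("jan883_eda-0.1.7.tar.gz")  # assumed download location

if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```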

File details

Details for the file jan883_eda-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: jan883_eda-0.1.7-py3-none-any.whl
  • Upload date:
  • Size: 23.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.13

File hashes

Hashes for jan883_eda-0.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 4ce125f2213af5db148bc3e3f9bfb54b160e31dab08fcb3324f33cf3c5247733
MD5 90ea703a072f353522e05f0b19654150
BLAKE2b-256 194e97af5d4463827d14a6c025f5345bd133013ebd9f2b71d5d407149c966805
