Library that provides helper functions for data science preprocessing and exploratory data analysis.
Project description
jan883-eda
A collection of utility functions for data analysis, preprocessing, model evaluation, and clustering in Python. Designed to streamline the workflow of data scientists and machine learning practitioners.
Installation
Install the package via pip:
pip install jan883-eda
Usage
Below are examples demonstrating how to use some of the key functions in the package. These examples assume you have a DataFrame (your_dataframe) or feature matrix (X) and target vector (y) ready.
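If you do not have data at hand, the inputs assumed above can be prepared roughly as follows. This is a minimal sketch using pandas only; the sample values and the column names (age, income, city, purchased) are hypothetical and not part of the package.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
your_dataframe = pd.DataFrame(
    {
        "age": [25, 32, 47, 51, 38, 29],
        "income": [32000.0, 45000.0, 81000.0, 90000.0, 60000.0, 39000.0],
        "city": ["Berlin", "Paris", "Berlin", "Rome", "Paris", "Rome"],
        "purchased": [0, 0, 1, 1, 1, 0],
    }
)

# Feature matrix: every column except the target.
X = your_dataframe.drop(columns=["purchased"])
# Target vector: the column to predict.
y = your_dataframe["purchased"]

print(X.shape, y.shape)
```

Categorical columns such as city usually need encoding before X is passed to a scikit-learn model.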
Exploratory Data Analysis (EDA)
- Inspect DataFrame:
from jan883_eda import inspect_df
inspect_df(your_dataframe)
This displays the head, shape, description, NaN values, and duplicates of the DataFrame.
- Column Summary:
from jan883_eda import column_summary
summary = column_summary(your_dataframe)
print(summary)
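For orientation, the checks that inspect_df is described as displaying (head, shape, description, NaN values, duplicates) can be approximated with plain pandas. The helper below, quick_inspect, is a hypothetical stand-in written for this illustration; it is not the package's implementation, and the real output format may differ.

```python
import pandas as pd


def quick_inspect(df: pd.DataFrame) -> dict:
    """Collect basic inspection results with plain pandas (hypothetical helper)."""
    return {
        "head": df.head(),                           # first five rows
        "shape": df.shape,                           # (rows, columns)
        "description": df.describe(include="all"),   # summary statistics
        "nan_counts": df.isna().sum(),               # missing values per column
        "duplicate_rows": int(df.duplicated().sum()),  # fully repeated rows
    }


df = pd.DataFrame({"a": [1, 2, 2, None], "b": ["x", "y", "y", "z"]})
report = quick_inspect(df)
print(report["shape"], report["duplicate_rows"])
```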
Data Preprocessing
- Update Column Names:
from jan883_eda import update_column_names
updated_df = update_column_names(your_dataframe)
- Label Encoding:
from jan883_eda import label_encode_column
encoded_df = label_encode_column(your_dataframe, 'column_name')
Model Evaluation
- Evaluate Classification Model:
from jan883_eda import evaluate_classification_model
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
evaluate_classification_model(model, X, y)
- Test Multiple Regression Models:
from jan883_eda import best_regression_models
results = best_regression_models(X, y)
print(results)
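To make the idea behind comparing several regression models concrete, here is a minimal sketch using scikit-learn's cross-validation directly. The helper compare_regressors, the choice of three candidate models, and the synthetic data are all assumptions made for illustration; best_regression_models may test different models and report different metrics.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score


def compare_regressors(X, y, cv=5):
    """Return {model name: mean cross-validated R^2}, best first (hypothetical helper)."""
    candidates = {
        "LinearRegression": LinearRegression(),
        "Ridge": Ridge(alpha=1.0),
        "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
    }
    scores = {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="r2").mean())
        for name, model in candidates.items()
    }
    # Sort by score so the strongest model comes first.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


# Synthetic data: a linear signal plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=120)

results = compare_regressors(X, y)
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```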
Functions Overview
The package provides a variety of functions grouped by their purpose:
- EDA Functions: inspect_df, column_summary, univariate_analysis, and more.
- Data Preprocessing: update_column_names, label_encode_column, one_hot_encode_column, scale_X_train_X_test, and more.
- Model Evaluation: evaluate_classification_model, evaluate_regression_model, best_classification_models, best_regression_models, and more.
- Clustering Analysis: plot_elbow_method, plot_intercluster_distance, plot_silhouette_visualizer, and more.
For a complete list of functions and their detailed documentation, refer to the docstrings within the source code or the official documentation.
Requirements
The following dependencies are required to use the package:
- Python >= 3.6
- pandas >= 1.0.0
- numpy >= 1.18.0
- matplotlib >= 3.0.0
- seaborn >= 0.10.0
- scikit-learn >= 0.22.0
- yellowbrick >= 1.0.0
- imbalanced-learn >= 0.7.0 (imported as imblearn)
pip installs these automatically with the package, provided the package declares them in its setup.py or pyproject.toml file.
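To see which of these dependencies are present in the current environment, a small check along the following lines may help. It is a sketch using only the standard library (importlib.metadata, available from Python 3.8); the distribution names in REQUIRED are assumptions based on the list above, with imbalanced-learn being the distribution that provides the imblearn module.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names for the dependencies listed above.
REQUIRED = [
    "pandas",
    "numpy",
    "matplotlib",
    "seaborn",
    "scikit-learn",
    "yellowbrick",
    "imbalanced-learn",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'not installed'}")
```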
License
This package is distributed under the MIT License. See the LICENSE file for more information.
Contact
For questions, bug reports, or contributions, please visit the GitHub repository or contact the author at email@example.com.
File details
Details for the file jan883_eda-0.2.0.tar.gz.
File metadata
- Download URL: jan883_eda-0.2.0.tar.gz
- Upload date:
- Size: 57.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 366857fbc9d3c16d15ced282a154d6c92d691701edc2f45a1cfa978f2ca0e863 |
| MD5 | 4294bea394d14241e24d0a94a85ec629 |
| BLAKE2b-256 | f1e20bad807cf48f71a3d41999da12cbba028d3612042293d02aef6499504fe2 |
File details
Details for the file jan883_eda-0.2.0-py3-none-any.whl.
File metadata
- Download URL: jan883_eda-0.2.0-py3-none-any.whl
- Upload date:
- Size: 24.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 826ed687e9518c8da91c3e17c61eb8d106cab5559298f36a131b2af18ee0fd34 |
| MD5 | d18f817a67772a0ef5510c33ae0b29af |
| BLAKE2b-256 | 2db86444246b60385d0002bcbe20119ddf5f59ef8bc9df4c1c2e713e4a6ef511 |
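A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard hashlib module; the helper names and the assumption that the source distribution sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest listed above for jan883_eda-0.2.0.tar.gz.
EXPECTED_SDIST_SHA256 = "366857fbc9d3c16d15ced282a154d6c92d691701edc2f45a1cfa978f2ca0e863"

sdist = Path("jan883_eda-0.2.0.tar.gz")  # assumed download location
if sdist.exists():
    ok = matches_published_digest(sdist, EXPECTED_SDIST_SHA256)
    print("digest OK" if ok else "digest MISMATCH")
else:
    print(f"{sdist} not found; download it first")
```

pip can perform the same check during installation when a requirements file pins the hash and `--require-hashes` is used.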