Comprehensive Exploratory Data Analysis Pipeline
Project description
Great progress on setting up your package structure! Let's continue with the next steps to properly publish your EDAPipeline package.
Next Steps for Publishing Your EDAPipeline Package
### 1. Complete Your README.md
First, let's improve your README.md with proper documentation:
# EDAPipeline
A comprehensive Exploratory Data Analysis (EDA) toolkit that streamlines the process of analyzing datasets through visualization and statistical methods.
## Features
- Automated data type detection (numerical, categorical, datetime)
- Comprehensive univariate analysis for all data types
- Correlation analysis with heatmaps
- Bivariate analysis between different feature types
- Datetime feature decomposition and analysis
- Outlier detection using multiple methods
- Customizable visualization options
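
The package's internals are not shown here, so the following is only a rough sketch of what automated data type detection can look like with plain pandas. The helper name `detect_column_types` and the sample columns are made up for illustration and are not part of the EDAPipeline API.

```python
import pandas as pd
from pandas.api import types as ptypes


def detect_column_types(df: pd.DataFrame) -> dict[str, list[str]]:
    """Group column names into numerical, categorical, and datetime buckets.

    Hypothetical helper for illustration; not EDAPipeline's own logic.
    """
    groups = {"numerical": [], "categorical": [], "datetime": []}
    for name in df.columns:
        col = df[name]
        if ptypes.is_datetime64_any_dtype(col):
            groups["datetime"].append(name)
        elif ptypes.is_numeric_dtype(col) and not ptypes.is_bool_dtype(col):
            # Booleans count as numeric in pandas, so they are excluded here.
            groups["numerical"].append(name)
        else:
            # Strings, booleans, and pandas categoricals all land here.
            groups["categorical"].append(name)
    return groups


# Made-up sample data covering each bucket.
df = pd.DataFrame(
    {
        "age": [34, 51, 27],
        "city": ["Oslo", "Lima", "Pune"],
        "signup": pd.to_datetime(["2024-01-05", "2024-02-11", "2024-03-20"]),
        "active": [True, False, True],
    }
)
column_types = detect_column_types(df)
print(column_types)
```

Treating booleans as categorical is a design choice in this sketch; a real pipeline might also treat low-cardinality integers as categorical.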
## Installation
```bash
pip install edapipeline
```

## Quick Start

```python
from edapipeline import EDAPipeline
import pandas as pd

# Load your dataset
df = pd.read_csv('your_data.csv')

# Initialize the pipeline
eda = EDAPipeline(df, target_col='your_target_column')

# Run the complete analysis
eda.run_complete_analysis()

# Or run specific analyses
eda.data_overview()
eda.analyze_numerical_features()
eda.correlation_analysis()
```
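
To try the Quick Start without a CSV on hand, a small synthetic DataFrame can stand in for `your_data.csv`. This is a sketch only: the column names, the `churned` target, and the value ranges are invented, and the `EDAPipeline` calls are left as comments because they need the package installed.

```python
import numpy as np
import pandas as pd

# Seeded generator so the demo data is reproducible.
rng = np.random.default_rng(seed=0)
n_rows = 200

# One column per feature type the pipeline is described as handling.
demo_df = pd.DataFrame(
    {
        "age": rng.integers(18, 80, size=n_rows),                   # numerical
        "income": rng.normal(50_000, 12_000, size=n_rows).round(2),  # numerical
        "plan": rng.choice(["basic", "plus", "pro"], size=n_rows),   # categorical
        "signup_date": pd.date_range("2024-01-01", periods=n_rows, freq="D"),
        "churned": rng.integers(0, 2, size=n_rows),                  # binary target
    }
)

# With the package installed, this frame can replace the CSV load:
# from edapipeline import EDAPipeline
# eda = EDAPipeline(demo_df, target_col="churned")
# eda.run_complete_analysis()
```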
## Dependencies
- numpy
- pandas
- matplotlib
- seaborn
- scipy

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.

## License
This project is licensed under the MIT License - see the LICENSE file for details.
### 2. Create a Basic __init__.py File
Edit your `src/edapipeline/__init__.py` file:
```python
"""EDAPipeline - Comprehensive EDA toolkit for data analysis."""
from .core import EDAPipeline
from .__version__ import __version__

# Names exported by `from edapipeline import *`
__all__ = ['EDAPipeline', '__version__']
```
File details
Details for the file edapipeline-0.1.0.tar.gz.
File metadata
- Download URL: edapipeline-0.1.0.tar.gz
- Upload date:
- Size: 11.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `bf0803fa4082afb908cd8599763d5ed4da2d72c3d22ed443ffdb9e237c5ea016` |
| MD5 | `01ba45294dc94af88e69e619f26d0ac4` |
| BLAKE2b-256 | `d236720e6873a6164eb28e5ea22c124251424bcb1cddf64755fac1638e0bfa42` |
File details
Details for the file edapipeline-0.1.0-py3-none-any.whl.
File metadata
- Download URL: edapipeline-0.1.0-py3-none-any.whl
- Upload date:
- Size: 11.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `e280664f2bd4baa89205f00ad44a54b165d7f8b6eb9eb0fa91e1e5c8ad92e450` |
| MD5 | `6202a8ca3b64c96f659d3e452c4b97c4` |
| BLAKE2b-256 | `472c4482acf4af0370ef484234004b729eb21fce961cadfcbfeb67a80f391b4d` |
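
The published digests can be used to check a downloaded file before installing it. The sketch below uses only the standard library. The helper names `sha256_of` and `matches_published_digest` are illustrative, and the wheel is assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 against a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path; the digest is the SHA256 listed for the wheel.
    wheel = Path("edapipeline-0.1.0-py3-none-any.whl")
    published = "e280664f2bd4baa89205f00ad44a54b165d7f8b6eb9eb0fa91e1e5c8ad92e450"
    if wheel.exists():
        print("digest matches:", matches_published_digest(wheel, published))
```

pip can perform the same check itself when a requirements file pins hashes and is installed with `--require-hashes`.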