Comprehensive Exploratory Data Analysis Pipeline

Project description

Great progress on setting up your package structure! Let's continue with the next steps to properly publish your EDAPipeline package.

Next Steps for Publishing Your EDAPipeline Package

1. Complete Your README.md

First, let's improve your README.md with proper documentation:

# EDAPipeline

A comprehensive Exploratory Data Analysis (EDA) toolkit that streamlines the process of analyzing datasets through visualization and statistical methods.

## Features

- Automated data type detection (numerical, categorical, datetime)
- Comprehensive univariate analysis for all data types
- Correlation analysis with heatmaps
- Bivariate analysis between different feature types
- Datetime feature decomposition and analysis
- Outlier detection using multiple methods
- Customizable visualization options
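
The package's own implementation is not shown in this description. As a rough illustration of what "outlier detection using multiple methods" can mean, the sketch below applies two common rules, the IQR fence and a z-score threshold, using only the standard library. The function names `iqr_outliers` and `zscore_outliers` and their default parameters are assumptions made for this example and are not part of the EDAPipeline API.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # method="inclusive" interpolates linearly between the observed data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return the values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) / std > threshold]


if __name__ == "__main__":
    sample = [10, 12, 11, 13, 12, 11, 95]
    print(iqr_outliers(sample))
    print(zscore_outliers(sample, threshold=2.0))
```

The two rules can disagree on small samples: a single extreme value inflates the standard deviation and can hide itself from a strict z-score threshold, while the IQR fence is less affected. That is one reason an EDA tool might offer more than one method.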

## Installation

```bash
pip install edapipeline
```

## Quick Start

```python
from edapipeline import EDAPipeline
import pandas as pd

# Load your dataset
df = pd.read_csv('your_data.csv')

# Initialize the pipeline
eda = EDAPipeline(df, target_col='your_target_column')

# Run the complete analysis
eda.run_complete_analysis()

# Or run specific analyses
eda.data_overview()
eda.analyze_numerical_features()
eda.correlation_analysis()
```

## Dependencies

- numpy
- pandas
- matplotlib
- seaborn
- scipy

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.


### 2. Create a Basic __init__.py File

Edit your `src/edapipeline/__init__.py` file:

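```python
"""EDAPipeline - Comprehensive EDA toolkit for data analysis."""

# Re-export the main class so users can write `from edapipeline import EDAPipeline`.
from .core import EDAPipeline
# Requires a sibling module src/edapipeline/__version__.py that defines __version__.
from .__version__ import __version__

# Names exposed by `from edapipeline import *`.
__all__ = ['EDAPipeline', '__version__']
```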

Download files

Source Distribution

edapipeline-0.1.0.tar.gz (11.8 kB view details)

Uploaded Source

Built Distribution

edapipeline-0.1.0-py3-none-any.whl (11.1 kB view details)

Uploaded Python 3

File details

Details for the file edapipeline-0.1.0.tar.gz.

File metadata

  • Download URL: edapipeline-0.1.0.tar.gz
  • Upload date:
  • Size: 11.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for edapipeline-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | bf0803fa4082afb908cd8599763d5ed4da2d72c3d22ed443ffdb9e237c5ea016 |
| MD5 | 01ba45294dc94af88e69e619f26d0ac4 |
| BLAKE2b-256 | d236720e6873a6164eb28e5ea22c124251424bcb1cddf64755fac1638e0bfa42 |
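
A downloaded file can be checked against the digests listed here before installing it. The following is a minimal sketch using Python's `hashlib`; the helper names `file_digests` and `matches_sha256` and the chunk size are assumptions made for this example, and the local file path in any real use is whatever location the file was saved to.

```python
import hashlib
from pathlib import Path


def file_digests(path, chunk_size=65536):
    """Compute SHA256 and BLAKE2b-256 hex digests of a file, reading it in chunks."""
    hashers = {
        "sha256": hashlib.sha256(),
        # digest_size=32 bytes gives the 256-bit BLAKE2b variant.
        "blake2b-256": hashlib.blake2b(digest_size=32),
    }
    with Path(path).open("rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return file_digests(path)["sha256"] == expected_hex.strip().lower()
```

For the source distribution, `expected_hex` would be the SHA256 value listed for edapipeline-0.1.0.tar.gz. pip can also enforce digests itself when a requirements file pins them with `--hash` entries.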

File details

Details for the file edapipeline-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: edapipeline-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 11.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for edapipeline-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | e280664f2bd4baa89205f00ad44a54b165d7f8b6eb9eb0fa91e1e5c8ad92e450 |
| MD5 | 6202a8ca3b64c96f659d3e452c4b97c4 |
| BLAKE2b-256 | 472c4482acf4af0370ef484234004b729eb21fce961cadfcbfeb67a80f391b4d |
