lightweight library that provides functionalities for common EDA tasks
Edazer
Edazer is a lightweight package that provides helpers for common exploratory data analysis (EDA) tasks. It helps you quickly understand, summarize, and inspect your datasets with minimal code.
Features
- Quick DataFrame Summaries: Instantly view info, descriptive statistics, nulls, duplicates, and shape using the summarize_df method.
- Unique Value Inspection: Easily display unique values for any or all columns.
- Type-based Column Selection: Find columns by dtype (e.g., numeric, categorical).
- Flexible Subsetting: Use the lookup method to view head, tail, or random samples.
- Custom DataFrame Naming: Track multiple DataFrames with custom names for clarity.
Installation
pip install edazer
Quick Start with Titanic Dataset
import seaborn as sns
from edazer import Edazer
# Load the Titanic dataset from seaborn
titanic = sns.load_dataset('titanic')
# Create an Edazer instance
titanic_eda = Edazer(titanic, name="titanic") # setting name useful when working with multiple dataframes
# Complete DataFrame summary: info | descriptive statistics | nulls | duplicates | uniques | shape
titanic_eda.summarize_df()
# Show unique values for selected columns
titanic_eda.show_unique_values(column_names=['class', 'embarked'], max_unique=5)
# Get columns with float dtype
print(titanic_eda.cols_with_dtype(['float']))
# Combine multiple methods: show unique values for every object-dtype column
titanic_eda.show_unique_values(column_names=titanic_eda.cols_with_dtype(dtypes=["object"]))
# Display the first few rows
print(titanic_eda.lookup("head"))
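To make the pieces of a DataFrame summary concrete, here is a minimal sketch in plain pandas of the kind of information summarize_df is described as reporting (nulls, duplicates, shape, descriptive statistics). It is not Edazer's implementation: the helper name quick_summary and the small inline DataFrame are hypothetical, chosen only for illustration.

```python
import pandas as pd


def quick_summary(df: pd.DataFrame) -> dict:
    """Hypothetical helper: gather common summary facts about a DataFrame."""
    return {
        "shape": df.shape,                             # (rows, columns)
        "null_counts": df.isna().sum().to_dict(),      # missing values per column
        "duplicate_rows": int(df.duplicated().sum()),  # fully repeated rows
        "describe": df.describe(),                     # descriptive statistics
    }


# Small stand-in dataset (assumption: not the real Titanic data)
df = pd.DataFrame(
    {
        "age": [22.0, 38.0, None, 22.0],
        "sex": ["male", "female", "female", "male"],
    }
)

summary = quick_summary(df)
print(summary["shape"])
print(summary["null_counts"])
print(summary["duplicate_rows"])
```

Returning a dictionary instead of printing keeps the sketch easy to check; a real summary method may well print each section directly.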
API Reference
Edazer(df: pd.DataFrame, name: str = None)
- df: The pandas DataFrame to analyze.
- name: Optional name for the DataFrame (useful when working with many DataFrames).
Methods
- summarize_df(): Print a summary (info, describe, nulls, duplicates, shape).
- show_unique_values(column_names=None, max_unique=10): Show unique values for specified columns.
- cols_with_dtype(dtypes): Return columns matching the given dtypes.
- lookup(option="head"): Return a subset of the DataFrame (head, tail, or sample).
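For readers who want to see what dtype-based column selection and head/tail/sample subsetting look like in plain pandas, the sketch below shows one possible approach. The function names columns_with_dtype and lookup_rows, the n parameter, and the fixed random seed are assumptions made for illustration; Edazer's own methods may behave differently.

```python
import pandas as pd


def columns_with_dtype(df: pd.DataFrame, dtypes: list) -> list:
    """Hypothetical helper: names of columns whose dtype matches any of `dtypes`."""
    return df.select_dtypes(include=dtypes).columns.tolist()


def lookup_rows(df: pd.DataFrame, option: str = "head", n: int = 5) -> pd.DataFrame:
    """Hypothetical helper: return the first, last, or a random `n` rows."""
    if option == "head":
        return df.head(n)
    if option == "tail":
        return df.tail(n)
    if option == "sample":
        # Cap n at the number of rows so sampling never fails on small frames
        return df.sample(n=min(n, len(df)), random_state=0)
    raise ValueError(f"unknown option: {option!r}")


# Small stand-in dataset (assumption: not the real Titanic data)
df = pd.DataFrame(
    {
        "age": [22.0, 38.0, 26.0],
        "fare": [7.25, 71.28, 7.92],
        "pclass": [3, 1, 3],
    }
)

float_cols = columns_with_dtype(df, ["float64"])
print(float_cols)
print(lookup_rows(df, "tail", n=1))
```

Raising ValueError on an unrecognized option is a design choice in this sketch, so that a typo such as "hed" fails loudly instead of silently returning nothing.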
Example Output
titanic_eda.show_unique_values(column_names=titanic_eda.cols_with_dtype(dtypes=["object"]))
# Output:
sex: ['male', 'female']
embarked: ['S', 'C', 'Q', nan]
who: ['man', 'woman', 'child']
embark_town: ['Southampton', 'Cherbourg', 'Queenstown', nan]
alive: ['no', 'yes']
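Output of this shape can be approximated in plain pandas as sketched below. This is only an illustration: the helper name unique_values is hypothetical, and the assumption that max_unique skips columns with more distinct values than the limit is a guess at the intended meaning, not documented Edazer behavior.

```python
import pandas as pd


def unique_values(df: pd.DataFrame, column_names=None, max_unique: int = 10) -> dict:
    """Hypothetical helper: map each column to its unique values.

    Assumption: columns with more than `max_unique` distinct values are skipped.
    """
    columns = df.columns if column_names is None else column_names
    result = {}
    for col in columns:
        values = df[col].unique().tolist()  # order of first appearance
        if len(values) <= max_unique:
            result[col] = values
    return result


# Small stand-in dataset (assumption: not the real Titanic data)
df = pd.DataFrame(
    {
        "sex": ["male", "female", "male"],
        "fare": [7.25, 71.28, 8.05],
    }
)

uniques = unique_values(df, max_unique=2)
for col, values in uniques.items():
    print(f"{col}: {values}")
```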
Contributing
Contributions are welcome! Please open issues or pull requests on GitHub.
License
MIT License