Edazer
Edazer is a lightweight package for common exploratory data analysis (EDA) tasks on pandas DataFrames. It helps you quickly understand, summarize, and inspect your datasets with minimal code.
Features
- Quick DataFrame Summaries: Instantly view info, describe, nulls, duplicates, and shape using the summarize_df method.
- Unique Value Inspection: Easily display unique values for any or all columns.
- Type-based Column Selection: Find columns by dtype (e.g., numeric, categorical).
- Flexible Subsetting: Use the lookup method to view head, tail, or random samples.
- Custom DataFrame Naming: Track multiple DataFrames with custom names for clarity.
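To make the summary feature concrete, here is a minimal pandas-only sketch of the pieces such a summary reports. It is not Edazer's implementation: the quick_summary helper and the toy data are hypothetical and exist only for illustration.

```python
import pandas as pd

# Toy data standing in for a real dataset (hypothetical values).
df = pd.DataFrame(
    {
        "age": [22.0, 38.0, None, 22.0],
        "sex": ["male", "female", "female", "male"],
        "fare": [7.25, 71.28, 8.05, 7.25],
    }
)


def quick_summary(frame: pd.DataFrame) -> dict:
    """Collect the pieces a quick EDA summary typically reports (hypothetical helper)."""
    return {
        "shape": frame.shape,
        # Number of missing values per column, as plain ints.
        "nulls": {col: int(n) for col, n in frame.isna().sum().items()},
        # Rows that exactly repeat an earlier row.
        "duplicates": int(frame.duplicated().sum()),
        # Descriptive statistics for the numeric columns.
        "describe": frame.describe(),
    }


summary = quick_summary(df)
print(summary["shape"])
print(summary["nulls"])
print(summary["duplicates"])
```

With Edazer, a single summarize_df() call is meant to replace this kind of hand-written bookkeeping.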
Installation
pip install edazer
Quick Start with Titanic Dataset
import seaborn as sns
from edazer import Edazer
# Load the Titanic dataset from seaborn
titanic = sns.load_dataset('titanic')
# Create an Edazer instance
titanic_eda = Edazer(titanic, name="titanic")  # a name is useful when working with multiple DataFrames
# Complete DataFrame summary: info | descriptive statistics | nulls | duplicates | uniques | shape
titanic_eda.summarize_df()
# Show unique values for selected columns
titanic_eda.show_unique_values(column_names=['class', 'embarked'], max_unique=5)
# Get columns with float dtype
print(titanic_eda.cols_with_dtype(['float']))
# Combine multiple methods: show unique values for every object-dtype column
titanic_eda.show_unique_values(column_names=titanic_eda.cols_with_dtype(dtypes=["object"]))
# Display the first few rows
print(titanic_eda.lookup("head"))
API Reference
Edazer(df: pd.DataFrame, name: str = None)
- df: The pandas DataFrame to analyze.
- name: Optional name for the DataFrame (useful when working with many DataFrames).
Methods
- summarize_df(): Print a summary (info, describe, nulls, duplicates, shape).
- show_unique_values(column_names=None, max_unique=10): Show unique values for specified columns.
- cols_with_dtype(dtypes): Return columns matching the given dtypes.
- lookup(option="head"): Return a subset of the DataFrame (head, tail, or sample).
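For readers who want to see what dtype-based selection and subsetting amount to in plain pandas, the following is a rough sketch. The helpers columns_of_dtype and peek are hypothetical names, not part of Edazer, and the library's actual behaviour (for example how it samples rows) may differ.

```python
import pandas as pd

# Toy data (hypothetical values).
df = pd.DataFrame(
    {
        "age": [22.0, 38.0, 26.0],
        "pclass": [3, 1, 3],
        "sex": ["male", "female", "female"],
    }
)


def columns_of_dtype(frame: pd.DataFrame, dtypes: list) -> list:
    """Return the names of columns whose dtype matches any entry in dtypes."""
    return frame.select_dtypes(include=dtypes).columns.tolist()


def peek(frame: pd.DataFrame, option: str = "head", n: int = 2) -> pd.DataFrame:
    """Return the first rows, the last rows, or a random sample of the frame."""
    if option == "head":
        return frame.head(n)
    if option == "tail":
        return frame.tail(n)
    if option == "sample":
        # Fixed seed so the sketch is reproducible; never ask for more rows than exist.
        return frame.sample(n=min(n, len(frame)), random_state=0)
    raise ValueError(f"unknown option: {option!r}")


print(columns_of_dtype(df, ["float"]))
print(columns_of_dtype(df, ["number"]))
print(peek(df, "tail"))
```

Dispatching on a string option keeps the call site short, at the cost of a runtime error for misspelled options.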
Example Output
titanic_eda.show_unique_values(column_names=titanic_eda.cols_with_dtype(dtypes=["object"]))
# Output:
sex: ['male', 'female']
embarked: ['S', 'C', 'Q', nan]
who: ['man', 'woman', 'child']
embark_town: ['Southampton', 'Cherbourg', 'Queenstown', nan]
alive: ['no', 'yes']
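Output in this shape can be approximated with plain pandas, as in the sketch below. It is an illustration under assumptions, not Edazer's code: the unique_values helper is hypothetical, and it assumes that columns with more than max_unique distinct values are skipped, which the documentation above does not confirm.

```python
import pandas as pd

# Toy data (hypothetical values); None marks a missing entry.
df = pd.DataFrame(
    {
        "embarked": ["S", "C", None, "S", "Q"],
        "alive": ["no", "yes", "yes", "no", "no"],
        "fare": [7.25, 71.28, 8.05, 7.25, 8.46],
    }
)


def unique_values(frame: pd.DataFrame, column_names=None, max_unique: int = 10) -> dict:
    """Map each column to its unique values, in order of first appearance.

    Assumption: columns with more than max_unique distinct values are skipped.
    Missing values count as one distinct value.
    """
    columns = list(frame.columns) if column_names is None else list(column_names)
    result = {}
    for col in columns:
        values = frame[col].unique().tolist()
        if len(values) <= max_unique:
            result[col] = values
    return result


for col, values in unique_values(df, ["embarked", "alive"]).items():
    print(f"{col}: {values}")
```

Because pandas keeps missing entries in unique(), they appear in the listing just as nan does in the output above.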
Contributing
Contributions are welcome! Please open issues or pull requests on GitHub.
License
MIT License
Author
Download files
Source Distribution
edazer-0.1.1.tar.gz (5.5 kB)
Built Distribution
edazer-0.1.1-py3-none-any.whl (5.9 kB)
File details
Details for the file edazer-0.1.1.tar.gz.
File metadata
- Download URL: edazer-0.1.1.tar.gz
- Upload date:
- Size: 5.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a47b9f11f63246305343fbce27069d87a233d960d6bb9c535fd12d7b07aacae9 |
| MD5 | 725d7c5297d16ea77fbad88d05c3e5ea |
| BLAKE2b-256 | 4b170d40960f2bed2bc3453ca1d8e9408689d6d5a2ee4c053359c72fc537cc5a |
File details
Details for the file edazer-0.1.1-py3-none-any.whl.
File metadata
- Download URL: edazer-0.1.1-py3-none-any.whl
- Upload date:
- Size: 5.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 408282dc6a1200ae7795cafbec12d8a6f992dd60a820b890373e16933d555827 |
| MD5 | fdd90ef8e58f2571a2de56a8f2175b5b |
| BLAKE2b-256 | 6a87b24abfeae7087815cd1cc956a8ad530c0aad6e9e77e997e45a26c097498c |