Plottah: Univariate Analysis
Copyright © 2024 by Boston Consulting Group. All rights reserved
SmartBanking Plotting tool
A Python package that generates standardized univariate analysis plots for SmartBanking analysis. The package automatically generates:
- ROC curves
- Distribution plots
- Bin event rate plots
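To make the last two quantities concrete, here is a minimal sketch of how a bin event rate and a ROC AUC can be computed for one numerical feature with plain pandas. It is not plottah's implementation: the helper names `bin_event_rates` and `roc_auc`, the quantile binning, and the toy data are all assumptions made for illustration.

```python
import pandas as pd


def bin_event_rates(feature: pd.Series, target: pd.Series, n_bins: int = 10) -> pd.DataFrame:
    """Event rate (mean of a 0/1 target) within quantile bins of a numerical feature."""
    # duplicates="drop" merges repeated quantile edges, so fewer bins may come back
    bins = pd.qcut(feature, q=n_bins, duplicates="drop")
    grouped = target.groupby(bins, observed=True)
    return pd.DataFrame({"count": grouped.size(), "event_rate": grouped.mean()})


def roc_auc(feature: pd.Series, target: pd.Series) -> float:
    """AUC via the rank-sum (Mann-Whitney U) identity; ties receive average ranks."""
    ranks = feature.rank(method="average")
    n_pos = int((target == 1).sum())
    n_neg = int((target == 0).sum())
    rank_sum_pos = ranks[target == 1].sum()
    return float((rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))


# Toy data (hypothetical): higher scores are more often events
feature = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], name="score")
target = pd.Series([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], name="event")

rates = bin_event_rates(feature, target, n_bins=2)
auc = roc_auc(feature, target)
print(rates)
print(f"AUC = {auc:.2f}")
```

The rank-sum form avoids building the full ROC curve when only the area is needed; a plotted curve would additionally require the true and false positive rates at each threshold.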
Installation
There are two ways to use this package:
1. Install from PyPI (Recommended)
If you just want to use the package as a library:
pip install plottah
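One way to confirm the install worked is to query the package metadata from Python. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and not part of plottah.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "plottah") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if plottah is installed in this environment, otherwise None
print(installed_version())
```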
2. Local Development with Config (For Batch Processing)
If you want to process multiple features using a configuration file:
- Clone the repository:
git clone git@github.com:nielsota/plottah.git
cd plottah
- Install Poetry (if you haven't already):
curl -sSL https://install.python-poetry.org | python3 -
- Install dependencies and set up the development environment:
poetry install
Usage
As a Python Package
from plottah import build_univariate_plot
import pandas as pd

# Create your dataframe
df = pd.DataFrame(...)

# Generate a single plot
plot = build_univariate_plot(
    df=df,
    feature_col="your_feature",
    target="target_column",
    feature_type="numerical",  # or "categorical"
    n_bins=10,  # optional
    distplot_q_min=0.01,  # optional
    distplot_q_max=0.99,  # optional
)
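For trying the call without real data, a small synthetic frame is enough. The sketch below builds one numerical feature and a binary target; the column names `income` and `default` and the helpers `make_demo_frame` and `plot_demo` are hypothetical. The `build_univariate_plot` call reuses only the arguments shown in the usage snippet, and it is wrapped in a function so that plottah is imported only when that function is actually called.

```python
import numpy as np
import pandas as pd


def make_demo_frame(n_rows: int = 500, seed: int = 0) -> pd.DataFrame:
    """Synthetic data: one numerical feature and a binary target correlated with it."""
    rng = np.random.default_rng(seed)
    income = rng.lognormal(mean=10.0, sigma=0.5, size=n_rows)
    # Higher income means lower event probability (purely illustrative)
    p_default = 1.0 / (1.0 + np.exp((np.log(income) - 10.0) * 2.0))
    default = rng.binomial(1, p_default)
    return pd.DataFrame({"income": income, "default": default})


def plot_demo(df: pd.DataFrame):
    """Pass the synthetic frame to plottah using the documented arguments."""
    # Imported lazily so the data helper works even without plottah installed
    from plottah import build_univariate_plot

    return build_univariate_plot(
        df=df,
        feature_col="income",
        target="default",
        feature_type="numerical",
        n_bins=10,
    )


df = make_demo_frame()
print(df.head())
```

Calling `plot_demo(df)` then requires plottah to be installed; what the returned object looks like is not specified here.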
Using Configuration File
- Create a config.yaml file with your settings:
file_path: ./data/your_data.csv
images_output_path: ./data/images
powerpoint_output_path: ./data/powerpoints/output.pptx
features:
  - name: feature_1
    n_bins: 10
  - name: feature_2
    bins: [0, 1, 5, 10, 100]
  - name: feature_3
    type: categorical
target: target_column
# Optional: Custom colors
primary_color: 231, 30, 87
secondary_color: 153, 204, 235
tertiary_color: 254, 189, 64
grey_tint_color: 110, 111, 115
- Run the plotting tool:
poetry run python -m plottah
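The color entries in the config are written as comma-separated triples, which a YAML loader would typically read as plain strings rather than lists. Assuming they are RGB values in the 0 to 255 range, a validation step might look like the sketch below. The helper `parse_rgb` is hypothetical and does not describe how plottah itself reads these keys.

```python
from typing import Tuple


def parse_rgb(value: str) -> Tuple[int, int, int]:
    """Turn a string such as '231, 30, 87' into an (R, G, B) tuple of ints."""
    parts = [int(p.strip()) for p in value.split(",")]
    if len(parts) != 3 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"expected three integers in 0-255, got {value!r}")
    return (parts[0], parts[1], parts[2])


# Color values copied from the example config
colors = {
    "primary_color": "231, 30, 87",
    "secondary_color": "153, 204, 235",
    "tertiary_color": "254, 189, 64",
    "grey_tint_color": "110, 111, 115",
}
parsed = {key: parse_rgb(value) for key, value in colors.items()}
print(parsed)
```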
Configuration Options
For each feature in the config file, you can specify:
- name: Feature column name (required)
- type: "numerical" or "categorical" (default: "numerical")
- n_bins: Number of bins for numerical features
- bins: Custom bin edges for numerical features
- distplot_q_min: Lower quantile for distribution plot trimming
- distplot_q_max: Upper quantile for distribution plot trimming
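To illustrate what these options plausibly control, the sketch below mirrors them in a small dataclass and applies them with pandas. Everything here is an assumption for illustration: the `FeatureSpec` class, the helpers `assign_bins` and `trim_for_distplot`, the use of quantile binning for `n_bins`, and the fallback of 10 bins are not taken from plottah's source.

```python
from dataclasses import dataclass
from typing import List, Optional

import pandas as pd


@dataclass
class FeatureSpec:
    """Hypothetical container mirroring the per-feature config options."""
    name: str
    type: str = "numerical"
    n_bins: Optional[int] = None
    bins: Optional[List[float]] = None
    distplot_q_min: Optional[float] = None
    distplot_q_max: Optional[float] = None


def assign_bins(values: pd.Series, spec: FeatureSpec) -> pd.Series:
    """Use explicit bin edges if given, else quantile bins (assumed behaviour)."""
    if spec.bins is not None:
        return pd.cut(values, bins=spec.bins, include_lowest=True)
    # Fallback of 10 bins is an assumption, not a documented default
    return pd.qcut(values, q=spec.n_bins or 10, duplicates="drop")


def trim_for_distplot(values: pd.Series, spec: FeatureSpec) -> pd.Series:
    """Keep only values between the configured lower and upper quantiles."""
    lo = values.min() if spec.distplot_q_min is None else values.quantile(spec.distplot_q_min)
    hi = values.max() if spec.distplot_q_max is None else values.quantile(spec.distplot_q_max)
    return values[(values >= lo) & (values <= hi)]


# Custom edges taken from the feature_2 entry of the example config
edges_spec = FeatureSpec(name="feature_2", bins=[0, 1, 5, 10, 100])
binned = assign_bins(pd.Series([0, 1, 2, 5, 7, 10, 50, 100]), edges_spec)
print(binned.value_counts(sort=False))

# Quantile trimming removes the extreme outlier before plotting a distribution
trim_spec = FeatureSpec(name="feature_1", distplot_q_min=0.01, distplot_q_max=0.99)
raw = pd.Series(list(range(1, 100)) + [10_000])
trimmed = trim_for_distplot(raw, trim_spec)
print(trimmed.min(), trimmed.max(), len(trimmed))
```

Trimming by quantile rather than by fixed thresholds keeps the distribution plot readable across features with very different scales.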
Examples
See the notebooks/ directory for detailed examples of:
- Basic univariate analysis
- Custom binning strategies
- Distribution plot customization
- ROC curve analysis
Requirements
- Python >=3.8
- Poetry for development setup
Download files
File details
Details for the file plottah-1.3.2.tar.gz.
File metadata
- Download URL: plottah-1.3.2.tar.gz
- Upload date:
- Size: 24.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.10.12 Darwin/23.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0f3b8998ac67060c287ae995e689e887694da09849296215f8f0f5057621d8ee |
| MD5 | f2f30ce87035b0895d3d39e3fffe1cd4 |
| BLAKE2b-256 | 2ce87a2e0bc1d7e4ef7d5b6476dec4e530d28bb4384281b37bd81dbd8207ea3e |
File details
Details for the file plottah-1.3.2-py3-none-any.whl.
File metadata
- Download URL: plottah-1.3.2-py3-none-any.whl
- Upload date:
- Size: 29.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.10.12 Darwin/23.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5c3010222773cdd64fd3c70d56213fc684b2518a9bfa11f869510dead3c915c6 |
| MD5 | 9aebf6b84fb54d41ecae91f67ce74bd0 |
| BLAKE2b-256 | d6c8a0a987c41164df643196b1b9027cc736f5667231afd5b793c786c6602d22 |