Plottah: Univariate Analysis

Copyright © 2024 by Boston Consulting Group. All rights reserved

SmartBanking Plotting tool

A Python package for generating standardized univariate analysis plots for SmartBanking analysis. For each feature, the package automatically generates:

  • ROC curves
  • Distribution plots
  • Bin event rate plots
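
To make the first plot type concrete: a univariate ROC curve treats the raw feature values as a score for the binary target. The snippet below is a minimal standard-library sketch of that idea, not plottah's implementation; the function names `roc_points` and `auc` are hypothetical and exist only for this illustration.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points, one per distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sort by descending score so that lowering the threshold admits rows in order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last row of a group of tied scores.
        is_last_of_group = i == len(ranked) - 1 or ranked[i + 1][0] != score
        if is_last_of_group:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data: four rows of one feature and a binary target.
feature = [0.1, 0.4, 0.35, 0.8]
target = [0, 0, 1, 1]
curve = roc_points(feature, target)
print(curve)
print(auc(curve))
```

An AUC near 0.5 suggests the feature alone carries little ranking information about the target, while values near 0 or 1 suggest a strong (negative or positive) relationship.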

Installation

There are two ways to use this package:

1. Install from PyPI (Recommended)

If you just want to use the package as a library:

pip install plottah

2. Local Development with Config (For Batch Processing)

If you want to process multiple features using a configuration file:

  1. Clone the repository:
git clone git@github.com:nielsota/plottah.git
cd plottah
  2. Install Poetry (if you haven't already):
curl -sSL https://install.python-poetry.org | python3 -
  3. Install dependencies and set up the development environment:
poetry install

Usage

As a Python Package

from plottah import build_univariate_plot
import pandas as pd

# Create your dataframe
df = pd.DataFrame(...)

# Generate a single plot
plot = build_univariate_plot(
    df=df,
    feature_col="your_feature",
    target="target_column",
    feature_type="numerical",  # or "categorical"
    n_bins=10,  # optional
    distplot_q_min=0.01,  # optional
    distplot_q_max=0.99   # optional
)
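
The `distplot_q_min` and `distplot_q_max` arguments above trim the distribution plot to a quantile range. The sketch below shows one plausible reading of that trimming using only the standard library. It assumes linear interpolation between order statistics and inclusive bounds; plottah's actual behavior may differ, and the helper names `quantile` and `trim_to_quantiles` are hypothetical.

```python
from typing import List, Sequence


def quantile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted, non-empty sequence."""
    if not sorted_values:
        raise ValueError("cannot take a quantile of an empty sequence")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def trim_to_quantiles(values: Sequence[float], q_min: float = 0.01, q_max: float = 0.99) -> List[float]:
    """Keep only the values inside the [q_min, q_max] quantile range (assumed inclusive)."""
    ordered = sorted(values)
    low = quantile(ordered, q_min)
    high = quantile(ordered, q_max)
    return [v for v in values if low <= v <= high]


# 1..100 plus one extreme outlier that would otherwise stretch the x-axis.
values = list(range(1, 101)) + [10_000]
trimmed = trim_to_quantiles(values, q_min=0.01, q_max=0.99)
print(len(values), len(trimmed), max(trimmed))
```

Trimming of this kind only affects what is drawn; in this sketch the original `values` list is left untouched.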

Using Configuration File

  1. Create a config.yaml file with your settings:
file_path: ./data/your_data.csv
images_output_path: ./data/images
powerpoint_output_path: ./data/powerpoints/output.pptx

features:
  - name: feature_1
    n_bins: 10
  - name: feature_2
    bins: [0, 1, 5, 10, 100]
  - name: feature_3
    type: categorical

target: target_column

# Optional: Custom colors
primary_color: 231, 30, 87
secondary_color: 153, 204, 235
tertiary_color: 254, 189, 64
grey_tint_color: 110, 111, 115
  2. Run the plotting tool:
poetry run python -m plottah
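
The custom colors in the config are written as comma-separated RGB values, which a YAML loader would typically read as a plain string such as "231, 30, 87". The sketch below shows how such strings could be validated and converted before plotting. It is an illustration only: `parse_rgb` and `to_hex` are hypothetical helpers, not part of plottah, and the dictionary stands in for an already loaded config.

```python
from typing import Dict, Tuple

RGB = Tuple[int, int, int]


def parse_rgb(value: str) -> RGB:
    """Parse a string like '231, 30, 87' into an (r, g, b) tuple of ints in 0..255."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated channels, got {value!r}")
    r, g, b = (int(part) for part in parts)
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel {channel} is outside 0..255 in {value!r}")
    return (r, g, b)


def to_hex(rgb: RGB) -> str:
    """Format an (r, g, b) tuple as a lowercase hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Stand-in for the color entries of a loaded config.yaml (values copied from the example above).
colors: Dict[str, str] = {
    "primary_color": "231, 30, 87",
    "secondary_color": "153, 204, 235",
    "tertiary_color": "254, 189, 64",
    "grey_tint_color": "110, 111, 115",
}

parsed = {name: parse_rgb(text) for name, text in colors.items()}
for name, rgb in parsed.items():
    print(name, rgb, to_hex(rgb))
```

Validating the values up front gives a clear error message for a malformed color instead of a failure deep inside the plotting code.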

Configuration Options

For each feature in the config file, you can specify:

  • name: Feature column name (required)
  • type: "numerical" or "categorical" (default: "numerical")
  • n_bins: Number of bins for numerical features
  • bins: Custom bin edges for numerical features
  • distplot_q_min: Lower quantile for distribution plot trimming
  • distplot_q_max: Upper quantile for distribution plot trimming
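
To illustrate what custom `bins` edges mean for a bin event rate plot, the sketch below assigns each value to a bin and computes the share of rows where the target equals 1. It is not plottah's implementation: the half-open interval convention, the handling of out-of-range values, and the name `bin_event_rates` are all assumptions made for this example.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def bin_event_rates(
    values: Sequence[float], targets: Sequence[int], edges: Sequence[float]
) -> List[Tuple[str, int, float]]:
    """Return (bin label, row count, event rate) per bin defined by sorted edges."""
    edges = list(edges)
    if len(edges) < 2 or edges != sorted(set(edges)):
        raise ValueError("edges must be at least two strictly increasing numbers")

    n_bins = len(edges) - 1
    counts = [0] * n_bins
    events = [0] * n_bins
    for value, target in zip(values, targets):
        if value < edges[0] or value > edges[-1]:
            continue  # outside the configured range; skipped in this sketch
        # Bins are [left, right), except the last one, which also includes its right edge.
        index = min(bisect_right(edges, value) - 1, n_bins - 1)
        counts[index] += 1
        events[index] += target

    rows = []
    for i in range(n_bins):
        closing = "]" if i == n_bins - 1 else ")"
        label = f"[{edges[i]}, {edges[i + 1]}{closing}"
        rate = events[i] / counts[i] if counts[i] else float("nan")
        rows.append((label, counts[i], rate))
    return rows


# Edges copied from the feature_2 example in the config; the data is made up.
edges = [0, 1, 5, 10, 100]
values = [0.5, 0.2, 3, 4, 7, 8, 9, 50, 100, 20]
targets = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
rows = bin_event_rates(values, targets, edges)
for label, count, rate in rows:
    print(label, count, round(rate, 3))
```

Empty bins are reported with a rate of NaN rather than 0 so that "no data" is not mistaken for "no events".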

Examples

See the notebooks/ directory for detailed examples of:

  • Basic univariate analysis
  • Custom binning strategies
  • Distribution plot customization
  • ROC curve analysis

Requirements

  • Python >=3.8
  • Poetry for development setup
