
A glass-box machine learning toolbox for interpretable pipelines

Project description

Glazzbocks

A transparent, interpretable machine learning framework

Glazzbocks (pronounced "glass box") provides a modular, fully auditable pipeline for building, diagnosing, and interpreting machine learning models. Designed to meet real-world regulatory and interpretability demands, particularly in finance, healthcare, and insurance, Glazzbocks helps practitioners go beyond accuracy and deliver insights that are explainable, defensible, and production-ready.


Why "Glass Box" ML?

Modern machine learning offers unprecedented predictive power, but too often at the cost of transparency. In high-stakes or regulated environments, this trade-off is unacceptable.

Glazzbocks is built on the principle that powerful models should also be interpretable. Every component—from preprocessing to diagnostics and interpretation—is designed to remain visible, explainable, and auditable.

Rather than obscuring internal logic behind black-box pipelines, this framework promotes transparent, modular ML development where every decision and output can be inspected, traced, and justified.


Industry Relevance

Many domains face legal, ethical, or operational requirements that demand model explainability. Glazzbocks is particularly suited for:

Finance and Credit Risk

  • Explain loan decisions using coefficients, SHAP values, or partial dependence plots (PDPs)
  • Comply with fair lending regulations (e.g., ECOA, FCRA)
  • Audit model outputs for disparate impact

Healthcare and Life Sciences

  • Support clinical decision-making with interpretable diagnostics
  • Align with FDA guidance on algorithmic risk and bias
  • Validate performance without opaque heuristics

Insurance Underwriting and Claims

  • Reveal why customers are rated differently
  • Justify risk assessments during regulatory reviews
  • Provide human-interpretable justifications to customers

Key Advantages of Glazzbocks

  • Full Interpretability: Native support for feature importances, coefficients, SHAP values, PDPs, and permutation importances
  • Auditable Pipelines: Clear step-by-step ML workflows using modular, scikit-learn-compatible structures
  • Built for Compliance: Enables traceability for data transformations, model decisions, and performance metrics
  • Diagnostic Depth: Includes error distributions, lift charts, cumulative gain, VIF analysis, and more
  • Human-Centric Development: Designed for data scientists, analysts, and auditors who need to understand and explain model behavior—not just optimize accuracy

Components

ML_Pipeline.py

End-to-end automation for classification and regression:

  • Handles preprocessing of numerical and categorical features
  • Supports any scikit-learn compatible model
  • Includes train/test split and pipeline building
  • Performs cross-validation with detailed fold-wise metrics
  • Stores ROC, precision-recall, and threshold analysis (for classifiers)
  • Summarizes cross-validated performance across models
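
The fold-wise cross-validation pattern described above can be sketched in plain Python. This is an illustrative sketch, not the Glazzbocks API: `kfold_indices`, `majority_class`, and `cross_validate` are hypothetical names, and the majority-class baseline stands in for any scikit-learn compatible model.

```python
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i not in set(test)]
        yield train, test
        start += size

def majority_class(labels):
    """Trivial stand-in 'model': always predict the most common training label."""
    return max(set(labels), key=labels.count)

def cross_validate(y, k=5):
    """Return per-fold accuracy of the majority-class baseline."""
    scores = []
    for train, test in kfold_indices(len(y), k):
        pred = majority_class([y[i] for i in train])
        scores.append(sum(y[i] == pred for i in test) / len(test))
    return scores
```

Reporting the per-fold scores rather than only their mean, as MLPipeline does, makes variance across folds visible, which is itself an auditability feature.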

diagnostics.py

Automated performance diagnostics after training:

  • Classification: ROC, Confusion Matrix, F1 vs Threshold, Lift Chart, Gain Chart
  • Regression: Predicted vs Actual, Residual Plot, Error Distribution, Q-Q Plot
  • Auto-detects task type and generates all relevant visuals
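
The lift and gain charts mentioned above summarize how well a classifier's scores rank positives. As a rough sketch of the numbers behind those plots (function names are illustrative, not the Glazzbocks API):

```python
def cumulative_gain(y_true, y_score, fraction):
    """Share of all positives captured in the top `fraction` of rows,
    ranked by predicted score (the y-axis of a gain chart)."""
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    top = order[: max(1, round(fraction * len(order)))]
    return sum(y_true[i] for i in top) / sum(y_true)

def lift(y_true, y_score, fraction):
    """Gain divided by the fraction targeted: how much better the model's
    top `fraction` performs than random selection (lift of 1.0)."""
    return cumulative_gain(y_true, y_score, fraction) / fraction
```

For example, if the top 20% of scored rows contain all of the positives, the gain at 0.2 is 1.0 and the lift is 5.0, while at a fraction of 1.0 the lift is always 1.0.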

modelinterpreter.py

Model interpretation & explainability utilities:

  • Tree-based models: Feature importances
  • Linear models: Coefficients (with plot support)
  • SHAP summary plots (supports pipelines)
  • Partial Dependence Plots (PDP)
  • Permutation Importance
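
Permutation importance, listed above, measures how much a metric degrades when one feature's values are shuffled. A minimal model-agnostic sketch (not the Glazzbocks implementation; the function names and the dict-per-row data layout are assumptions for illustration):

```python
import random

def permutation_importance(model, X, y, feature, score_fn, n_repeats=5, seed=0):
    """Average score drop when `feature` is shuffled across rows:
    a larger drop means the model relies on that feature more."""
    rng = random.Random(seed)
    base = score_fn(model, X, y)
    drops = []
    for _ in range(n_repeats):
        col = [row[feature] for row in X]
        rng.shuffle(col)
        # Rebuild rows with the shuffled column; other features untouched.
        X_perm = [dict(row, **{feature: v}) for row, v in zip(X, col)]
        drops.append(base - score_fn(model, X_perm, y))
    return sum(drops) / n_repeats
```

A feature the model ignores gets an importance of exactly zero, since shuffling it cannot change any prediction.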

dataexplorer.py

Exploratory Data Analysis (EDA) for modeling decisions:

  • Auto-detects task type (regression/classification)
  • Displays shape, dtypes, missing values (via missingno matrix)
  • Visualizes target distribution
  • Correlation heatmap
  • VIF for multicollinearity detection
  • Skewness and normality testing
  • Outlier detection (via z-score)
  • Entropy calculation (classification only)
  • Automatically extracts datetime features (year, month, day, weekday)
  • Provides modeling guidance (e.g., transformation hints, imbalance warning)
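
Two of the checks above, entropy and z-score outlier detection, reduce to short formulas. A standard-library sketch of the computations involved (illustrative only; not the DataExplorer internals):

```python
import math
from collections import Counter
from statistics import mean, pstdev

def target_entropy(labels):
    """Shannon entropy (bits) of a class label distribution.
    0 bits = single class; 1 bit = two perfectly balanced classes."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def zscore_outliers(values, threshold=3.0):
    """Indices of values whose |z-score| exceeds `threshold`.
    Assumes the values are not all identical (non-zero spread)."""
    mu, sigma = mean(values), pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]
```

Low target entropy is one simple way to surface the class-imbalance warnings mentioned above.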

Example Usage

from pipeline import MLPipeline
from diagnostics import ModelDiagnostics
from interpreter import ModelInterpreter
from eda import DataExplorer

Notes

  • All components are sklearn-compatible and designed to integrate seamlessly.
  • All visualizations are built using matplotlib, seaborn, or shap.
  • Optional logging is supported in ModelInterpreter for production use.
  • Pipelines auto-handle transformed features for compatibility with SHAP/PDP.



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

glazzbocks-0.1.6.tar.gz (15.0 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

glazzbocks-0.1.6-py3-none-any.whl (13.7 kB)

Uploaded Python 3

File details

Details for the file glazzbocks-0.1.6.tar.gz.

File metadata

  • Download URL: glazzbocks-0.1.6.tar.gz
  • Upload date:
  • Size: 15.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for glazzbocks-0.1.6.tar.gz
Algorithm Hash digest
SHA256 7c3dffb4218e03bab8d634dd1a51eedd3a434205afe3d1cf97bcf9fcb4faf330
MD5 6eace6ec391eda0ebf62812b77430e13
BLAKE2b-256 84fe49f65f7884bc8735eff51772ba40e207dd08da1da42776601ac34a4cd481

See more details on using hashes here.

File details

Details for the file glazzbocks-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: glazzbocks-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 13.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for glazzbocks-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 ac289e7004584dd77edda777a65edb44729d87906b8829f2962778931ba42a1c
MD5 571808b13a1998b1dfbb4c97f935636f
BLAKE2b-256 52272bb2e0a3a092ffa21608be5e839bded02c67b6f42b182de64f3b3e386a49

See more details on using hashes here.
