Empirical Dependence Plot (EDP): a model-free EDA plot of a target's conditional mean/rate across a feature.
edp-tool — Empirical Dependence Plot (EDP)
A small exploratory-data-analysis tool that shows how a target variable behaves across grouped values of a feature. It is designed with categorical / binary targets in mind (it plots the observed class rate), but it also works for continuous targets (it plots the mean).
EDP vs. PDP — why the name changed
This tool was originally called a Partial Dependence Plot (PDP), but that name is technically inaccurate. A true PDP requires a fitted model and marginalizes its predictions over the other features.
The EDP uses no model and does no marginalization — it plots the observed conditional mean of the target given a single feature, straight from the data. That makes it closely related to the M-plot (marginal plot) from the ALE literature (Apley & Zhu, 2020). Calling it Empirical Dependence Plot reflects exactly what it computes.
The old pdp function still works as a deprecated alias (see below).
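As a rough illustration of what "observed conditional mean of the target given a single feature" means, the sketch below bins one feature and averages the target within each bin using pandas. It is not the code in edp_tool.py: the function name empirical_dependence, the toy data, and the choice of pd.qcut with duplicates="drop" are assumptions made for this example only.

```python
import numpy as np
import pandas as pd


def empirical_dependence(df, feature, target, n_bins=4):
    """Observed mean of `target` within quantile bins of `feature`.

    Hypothetical helper for illustration; no model is fitted and nothing
    is marginalized over the other columns.
    """
    # duplicates="drop" merges repeated edges, so n_bins acts as an upper bound
    bins = pd.qcut(df[feature], q=n_bins, duplicates="drop")
    grouped = df.groupby(bins, observed=True)[target]
    # Per-bin mean (a rate for a 0/1 target) plus the per-bin sample count
    return grouped.agg(rate="mean", count="size")


# Toy data (made up): conversion probability grows with age
rng = np.random.default_rng(0)
age = rng.integers(18, 80, size=1000)
converted = (rng.random(1000) < age / 100).astype(int)
toy = pd.DataFrame({"age": age, "converted": converted})

table = empirical_dependence(toy, "age", "converted")
print(table)
```

Because each row falls into exactly one bin, the count-weighted average of the per-bin rates equals the global rate, which is the kind of reference value a baseline line would show.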
Install
Copy edp_tool.py next to your notebook, or download it from within Colab:
import os
url = 'https://raw.githubusercontent.com/attilalr/edp-tool/main/edp_tool.py'
if not os.path.isfile('edp_tool.py'):
    !wget -q {url}
from edp_tool import edp
Requirements: numpy, pandas, matplotlib (see requirements.txt).
Usage
from edp_tool import edp
# Binary target -> positive-class rate per bin, with a Wilson CI band
edp(df, ['age', 'income'], 'converted')
# Multiclass target -> one line per class
edp(df, ['petal length (cm)'], 'species')
# Continuous target -> mean per bin, with a standard-error band
edp(df, ['age'], 'price')
# Save figures instead of showing them
edp(df, features, 'target', writefolder='figs')
edp() returns a list of dicts (feature, fig, ax, and path when saved),
so you can post-process or embed the figures.
Key parameters
| Parameter | Default | Meaning |
|---|---|---|
| n | 4 | Number of bins for continuous features (upper bound) |
| kind | "auto" | "auto" / "continuous" / "categorical" treatment per feature |
| binning | "quantile" | "quantile" or "uniform" bin edges |
| ci | "auto" | "wilson", "sem", "auto", or None (Wilson for classes, SEM for regression) |
| max_categories | 10 | Numeric columns with ≤ this many distinct values are treated as categorical |
| show_bincount | True | Draw per-bin sample count on a secondary axis |
| show_baseline | True | Draw the global target mean/rate as a reference line |
| ylim_origin | True | Start the y-axis at 0 |
| even_spaced_ticks | False | Place continuous bins at real midpoints |
| writefolder | None | Save PNGs to this folder instead of showing |
Multiclass targets are handled natively — no manual one-hot encoding needed.
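For readers unfamiliar with the two band types named by the ci parameter, here is a minimal sketch of the textbook formulas: the Wilson score interval for a rate and the standard error of the mean for a continuous target. The function names wilson_interval and sem, and the z=1.96 (roughly 95%) default, are assumptions for this example; the tool's own implementation and confidence level may differ.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n == 0:
        # No data in the bin: the rate is unconstrained
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (center - half, center + half)


def sem(values):
    """Standard error of the mean: sample std (ddof=1) / sqrt(n); needs n >= 2."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(var / n)


low, high = wilson_interval(successes=3, n=20)
print(f"rate 3/20 -> Wilson band [{low:.3f}, {high:.3f}]")
print(f"SEM of [1, 2, 3, 4] = {sem([1, 2, 3, 4]):.4f}")
```

Unlike the naive p ± z·sqrt(p(1-p)/n) band, the Wilson interval stays inside [0, 1] and does not collapse to zero width when a small bin has a rate of exactly 0 or 1, which is a common reason to prefer it for per-bin rates.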
What's new in this version
- Renamed to EDP (edp_tool.edp); pdp_tool.pdp kept as a deprecated alias.
- Fixed the dead categorical branch (feature type is now detected correctly).
- Fixed n leaking across features and the maximum value being dropped from the last bin.
- Native multiclass support, Wilson confidence intervals for rates, optional baseline line and uniform binning.
- Figures are returned and properly closed; validation raises real exceptions.
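The "maximum value dropped from the last bin" item describes a class of bug that is easy to reproduce with half-open bins. The sketch below shows one common way it arises with np.digitize and one possible fix; it is an illustration under that assumption, not a reconstruction of the tool's earlier code.

```python
import numpy as np

values = np.array([0.0, 2.5, 5.0, 7.5, 10.0])
edges = np.linspace(values.min(), values.max(), 5)  # 5 edges -> 4 uniform bins

# Half-open bins [lo, hi): the maximum lands at index 4, past the last bin (0..3)
naive = np.digitize(values, edges) - 1

# Clipping folds the maximum back into the last bin, making it closed: [lo, hi]
fixed = np.clip(naive, 0, len(edges) - 2)

print(naive)
print(fixed)
```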
Development
pip install -r requirements.txt pytest
pytest
License
MIT — see LICENSE.