Achieve error-rate parity between protected groups for any predictor
Project description
error-parity
Work presented as an oral at ICLR 2024, titled "Unprocessing Seven Years of Algorithmic Fairness".
Fast postprocessing of any score-based predictor to meet fairness criteria.
The error-parity
package can achieve strict or relaxed fairness constraint fulfillment,
which can be useful to compare ML models at equal fairness levels.
Package documentation available here.
Contents:
Installing
Install package from PyPI:
pip install error-parity
Or, for development, you can clone the repo and install from local sources:
git clone https://github.com/socialfoundations/error-parity.git
pip install ./error-parity
Getting started
See detailed example notebooks under the examples folder and on the package documentation.
from error_parity import RelaxedThresholdOptimizer
# Given any trained model that outputs real-valued scores
fair_clf = RelaxedThresholdOptimizer(
predictor=lambda X: model.predict_proba(X)[:, -1], # for sklearn API
# predictor=model, # use this for a callable model
constraint="equalized_odds", # other constraints are available
tolerance=0.05, # fairness constraint tolerance
)
# Fit the fairness adjustment on some data
# This will find the optimal _fair classifier_
fair_clf.fit(X=X, y=y, group=group)
# Now you can use `fair_clf` as any other classifier
# You have to provide group information to compute fair predictions
y_pred_test = fair_clf(X=X_test, group=group_test)
How it works
Given a callable score-based predictor (i.e., y_pred = predictor(X)
), and some (X, Y, S)
data to fit, RelaxedThresholdOptimizer
will:
- Compute group-specific ROC curves and their convex hulls;
- Compute the $r$-relaxed optimal solution for the chosen fairness criterion (using cvxpy);
- Find the set of group-specific binary classifiers that match the optimal solution found.
- each group-specific classifier is made up of (possibly randomized) group-specific thresholds over the given predictor;
- if a group's ROC point is in the interior of its ROC curve, partial randomization of its predictions may be necessary.
Fairness constraints
You can choose specific fairness constraints via the constraint
key-word argument to
the RelaxedThresholdOptimizer
constructor.
The equation under each constraint details how it is evaluated, where $r$ is the
relaxation (or tolerance) and $\mathcal{S}$ is the set of sensitive groups.
Currently implemented fairness constraints:
- equalized odds (Hardt et al., 2016) [default];
- i.e., equal group-specific TPR and FPR;
- use
constraint="equalized_odds"
; - $\max_{a, b \in \mathcal{S}} \max_{y \in {0, 1}} \left( \mathbb{P}[\hat{Y}=1 | S=a, Y=y] - \mathbb{P}[\hat{Y}=1 | S=b, Y=y] \right) \leq r$
- other relaxations available by changing the
l_p_norm
parameter;
- equal opportunity;
- i.e., equal group-specific TPR;
- use
constraint="true_positive_rate_parity"
; - $\max_{a, b \in \mathcal{S}} \left( \mathbb{P}[\hat{Y}=1 | S=a, Y=1] - \mathbb{P}[\hat{Y}=1 | S=b, Y=1] \right) \leq r$
- predictive equality;
- i.e., equal group-specific FPR;
- use
constraint="false_positive_rate_parity"
; - $\max_{a, b \in \mathcal{S}} \left( \mathbb{P}[\hat{Y}=1 | S=a, Y=0] - \mathbb{P}[\hat{Y}=1 | S=b, Y=0] \right) \leq r$
- demographic parity;
- i.e., equal group-specific predicted prevalence;
- use
constraint="demographic_parity"
; - $\max_{a, b \in \mathcal{S}} \left( \mathbb{P}[\hat{Y}=1 | S=a] - \mathbb{P}[\hat{Y}=1 | S=b] \right) \leq r$
We welcome community contributions for cvxpy implementations of other fairness constraints.
Equalized odds relaxations
When using constraint="equalized_odds"
, different relaxations can be chosen by
altering the l_p_norm
parameter (which dictates how to compute the distance
between group-specific ROC points).
A few useful values:
l_p_norm=np.inf
[default] evaluates equalized-odds as the maximum between group-wise TPR and FPR differences (as shown above);l_p_norm=1
evaluates equalized-odds as the sum of absolute difference in group-wise TPR and FPR;- corresponds to twice the "average absolute odds" metric;
- accordingly, use twice the
tolerance
target to constrain theaverage_abs_odds_difference
;
The actual equalized odds constraint implemented is:
$\max_{a, b \in \mathcal{S}} \left\lVert ROC_a - ROC_b \right\rVert_p \leq r,$ where $ROC_a$ is the ROC point of group $S=a$ and $ROC_b$ is the ROC point of group $S=b$.
Citing
@inproceedings{
cruz2024unprocessing,
title={Unprocessing Seven Years of Algorithmic Fairness},
author={Andr{\'e} Cruz and Moritz Hardt},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=jr03SfWsBS}
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for error_parity-0.3.11-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | efe0d9366bd8afa30584e9c30f78aef21be79b1669080f68ebcce0fdf8851cf6 |
|
MD5 | 8aaa1996bdc530de994f3f5eaa23ca78 |
|
BLAKE2b-256 | bee565b9cec0ca880ef9565e07d0f423b65551838891b0b86b11072b94a36e3c |