Achieve error-rate parity between protected groups for any predictor

error-parity

Work presented as an oral at ICLR 2024, titled "Unprocessing Seven Years of Algorithmic Fairness".

Fast postprocessing of any score-based predictor to meet fairness criteria.

The error-parity package can enforce a fairness constraint either strictly or up to a chosen tolerance, which is useful for comparing ML models at equal fairness levels.

Package documentation available here.

Installing

Install package from PyPI:

pip install error-parity

Or, for development, you can clone the repo and install from local sources:

git clone https://github.com/socialfoundations/error-parity.git
pip install ./error-parity

Getting started

See detailed example notebooks under the examples folder and in the package documentation.

from error_parity import RelaxedThresholdOptimizer

# Given any trained model that outputs real-valued scores
fair_clf = RelaxedThresholdOptimizer(
    predictor=lambda X: model.predict_proba(X)[:, -1],   # for sklearn API
    # predictor=model,            # use this for a callable model
    constraint="equalized_odds",  # other constraints are available
    tolerance=0.05,               # fairness constraint tolerance
)

# Fit the fairness adjustment on some data
# This will find the optimal _fair classifier_
fair_clf.fit(X=X, y=y, group=group)

# Now you can use `fair_clf` as any other classifier
# You have to provide group information to compute fair predictions
y_pred_test = fair_clf(X=X_test, group=group_test)

How it works

Given a callable score-based predictor (i.e., y_pred = predictor(X)), and some (X, Y, S) data to fit (features, labels, and sensitive group membership), RelaxedThresholdOptimizer will:

  1. Compute group-specific ROC curves and their convex hulls;
  2. Compute the $r$-relaxed optimal solution for the chosen fairness criterion (using cvxpy);
  3. Find the set of group-specific binary classifiers that match the optimal solution found.
    • each group-specific classifier is made up of (possibly randomized) group-specific thresholds over the given predictor;
    • if a group's ROC point is in the interior of its ROC curve, partial randomization of its predictions may be necessary.
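
Step 1 and the randomization mentioned in step 3 can be illustrated with a minimal pure-Python sketch. This is not the package's implementation: the helper names `roc_points`, `upper_hull`, and `randomized_point` and the toy data are assumptions made for illustration. The sketch computes one group's ROC points over all thresholds, keeps the upper convex hull, and shows that randomizing between two classifiers yields a ROC point on the segment between theirs.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (FPR, TPR)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """ROC points for the rule `score >= t`, for every distinct threshold t."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: predict all negative
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def upper_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull of ROC points, ordered by increasing FPR."""
    hull: List[Point] = []
    for p in sorted(set(points)):
        # Drop the last hull point while it lies on or below the segment
        # joining the point before it to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def randomized_point(p: Point, q: Point, prob_q: float) -> Point:
    """Expected ROC point when using q's classifier with probability prob_q, else p's."""
    return (
        (1 - prob_q) * p[0] + prob_q * q[0],
        (1 - prob_q) * p[1] + prob_q * q[1],
    )


# Toy data for a single group (hypothetical values).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 0, 0, 1, 1, 0]

roc = roc_points(scores, labels)
hull = upper_hull(roc)
print("ROC points:", roc)
print("Convex hull:", hull)
```

Any point on a hull segment is reachable by mixing the two thresholds at its endpoints, which is why the optimal solution may require randomized thresholds.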

Fairness constraints

You can choose specific fairness constraints via the constraint keyword argument to the RelaxedThresholdOptimizer constructor. The equation under each constraint details how it is evaluated, where $r$ is the relaxation (or tolerance) and $\mathcal{S}$ is the set of sensitive groups.

Currently implemented fairness constraints:

  • equalized odds (Hardt et al., 2016) [default];
    • i.e., equal group-specific TPR and FPR;
    • use constraint="equalized_odds";
    • $\max_{a, b \in \mathcal{S}} \max_{y \in \{0, 1\}} \left( \mathbb{P}[\hat{Y}=1 | S=a, Y=y] - \mathbb{P}[\hat{Y}=1 | S=b, Y=y] \right) \leq r$
    • other relaxations available by changing the l_p_norm parameter;
  • equal opportunity;
    • i.e., equal group-specific TPR;
    • use constraint="true_positive_rate_parity";
    • $\max_{a, b \in \mathcal{S}} \left( \mathbb{P}[\hat{Y}=1 | S=a, Y=1] - \mathbb{P}[\hat{Y}=1 | S=b, Y=1] \right) \leq r$
  • predictive equality;
    • i.e., equal group-specific FPR;
    • use constraint="false_positive_rate_parity";
    • $\max_{a, b \in \mathcal{S}} \left( \mathbb{P}[\hat{Y}=1 | S=a, Y=0] - \mathbb{P}[\hat{Y}=1 | S=b, Y=0] \right) \leq r$
  • demographic parity;
    • i.e., equal group-specific predicted prevalence;
    • use constraint="demographic_parity";
    • $\max_{a, b \in \mathcal{S}} \left( \mathbb{P}[\hat{Y}=1 | S=a] - \mathbb{P}[\hat{Y}=1 | S=b] \right) \leq r$
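
To make the equations above concrete, here is a minimal pure-Python sketch that evaluates the left-hand side of each constraint on binary predictions. It is an illustration only, not the package's evaluation code: the helpers `group_rates`, `max_gap`, and `constraint_violation` and the toy data are assumptions. Because the maximum runs over all ordered pairs of groups, each expression reduces to the largest group rate minus the smallest.

```python
from typing import Dict, Sequence


def group_rates(y_true: Sequence[int], y_pred: Sequence[int],
                group: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Per-group TPR, FPR, and positive prediction rate (PPR)."""
    rates = {}
    for g in sorted(set(group)):
        rows = [(y, p) for y, p, s in zip(y_true, y_pred, group) if s == g]
        pos = [p for y, p in rows if y == 1]  # predictions on positives
        neg = [p for y, p in rows if y == 0]  # predictions on negatives
        rates[g] = {
            "tpr": sum(pos) / len(pos),
            "fpr": sum(neg) / len(neg),
            "ppr": sum(p for _, p in rows) / len(rows),
        }
    return rates


def max_gap(rates: Dict[str, Dict[str, float]], key: str) -> float:
    """Largest difference in one rate over all ordered pairs of groups."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


def constraint_violation(y_true, y_pred, group, constraint: str) -> float:
    """Left-hand side of the chosen constraint; compare it against r."""
    rates = group_rates(y_true, y_pred, group)
    if constraint == "equalized_odds":  # l-infinity version shown above
        return max(max_gap(rates, "tpr"), max_gap(rates, "fpr"))
    if constraint == "true_positive_rate_parity":
        return max_gap(rates, "tpr")
    if constraint == "false_positive_rate_parity":
        return max_gap(rates, "fpr")
    if constraint == "demographic_parity":
        return max_gap(rates, "ppr")
    raise ValueError(f"Unknown constraint: {constraint}")


# Toy data with two groups (hypothetical values).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 1, 0]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]

for name in ("equalized_odds", "true_positive_rate_parity",
             "false_positive_rate_parity", "demographic_parity"):
    print(name, constraint_violation(y_true, y_pred, group, name))
```

A classifier satisfies a constraint at tolerance $r$ when the printed value is at most $r$.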

We welcome community contributions for cvxpy implementations of other fairness constraints.

Equalized odds relaxations

When using constraint="equalized_odds", different relaxations can be chosen by altering the l_p_norm parameter (which dictates how to compute the distance between group-specific ROC points).

A few useful values:

  • l_p_norm=np.inf [default] evaluates equalized-odds as the maximum between group-wise TPR and FPR differences (as shown above);
  • l_p_norm=1 evaluates equalized-odds as the sum of absolute differences in group-wise TPR and FPR;
    • corresponds to twice the "average absolute odds" metric;
    • accordingly, use twice the tolerance target to constrain the average_abs_odds_difference.

The actual equalized odds constraint implemented is:

$\max_{a, b \in \mathcal{S}} \left\lVert ROC_a - ROC_b \right\rVert_p \leq r,$ where $ROC_a$ is the ROC point of group $S=a$ and $ROC_b$ is the ROC point of group $S=b$.
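
The effect of the norm choice can be checked with a short pure-Python sketch. The helpers `roc_distance` and `equalized_odds_violation` and the ROC points are hypothetical, chosen only to illustrate the formula above; they are not part of the package. The sketch also shows that the $l_1$ distance equals twice the average absolute odds difference.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (FPR, TPR)


def roc_distance(roc_a: Point, roc_b: Point, p: float = math.inf) -> float:
    """l_p distance between two group-specific ROC points."""
    diffs = [abs(a - b) for a, b in zip(roc_a, roc_b)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)


def equalized_odds_violation(roc: Dict[str, Point], p: float = math.inf) -> float:
    """Maximum l_p distance over all pairs of groups."""
    pts = list(roc.values())
    return max(roc_distance(a, b, p) for a in pts for b in pts)


# Hypothetical ROC points for two groups.
roc_by_group = {"A": (0.25, 0.75), "B": (0.5, 0.25)}

linf = equalized_odds_violation(roc_by_group, p=math.inf)
l1 = equalized_odds_violation(roc_by_group, p=1)

# Average absolute odds: mean of the absolute FPR and TPR differences.
fpr_gap = abs(roc_by_group["A"][0] - roc_by_group["B"][0])
tpr_gap = abs(roc_by_group["A"][1] - roc_by_group["B"][1])
avg_abs_odds = (fpr_gap + tpr_gap) / 2

print("l_inf violation:", linf)
print("l_1 violation:", l1)
print("average absolute odds:", avg_abs_odds)
```

Under these assumptions, constraining the $l_1$ distance to at most $2r$ bounds the average absolute odds difference by $r$.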

Citing

@inproceedings{
  cruz2024unprocessing,
  title={Unprocessing Seven Years of Algorithmic Fairness},
  author={Andr{\'e} Cruz and Moritz Hardt},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=jr03SfWsBS}
}
