scikit-fair
Fairness-aware machine learning toolkit with a scikit-learn compatible API.
scikit-fair (skfair) is a Python library for fairness-aware binary classification. It covers the full pipeline — preprocessing, evaluation, auditing, comparison, and experimentation — and integrates seamlessly with scikit-learn and imbalanced-learn workflows.
Documentation: https://jmcfig.github.io/scikit-fair/
Installation
pip install scikit-fair
Or install from source:
git clone https://github.com/jmcfig/scikit-fair.git
cd scikit-fair
pip install -e .
Requirements: Python >= 3.9, numpy >= 1.22, pandas >= 1.5, scikit-learn >= 1.3, imbalanced-learn >= 0.12, cvxpy >= 1.3.
Quick start
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from skfair.datasets import load_adult
from skfair.preprocessing import Massaging
from skfair.metrics import accuracy, disparate_impact, statistical_parity_difference
# 1. Load data
X, y = load_adult(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# 2. Baseline — no fairness preprocessing
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
sens = X_test["sex"].values
print(f"Baseline — Accuracy: {accuracy(y_test.values, y_pred):.3f} "
f"DI: {disparate_impact(y_test.values, y_pred, sens):.3f} "
f"SPD: {statistical_parity_difference(y_test.values, y_pred, sens):.3f}")
# 3. Apply Massaging to reduce label bias
sampler = Massaging(sens_attr="sex", priv_group=1)
X_fair, y_fair = sampler.fit_resample(X_train, y_train)
clf_fair = LogisticRegression(max_iter=1000)
clf_fair.fit(X_fair, y_fair)
y_pred_fair = clf_fair.predict(X_test)
print(f"Fair — Accuracy: {accuracy(y_test.values, y_pred_fair):.3f} "
f"DI: {disparate_impact(y_test.values, y_pred_fair, sens):.3f} "
f"SPD: {statistical_parity_difference(y_test.values, y_pred_fair, sens):.3f}")
Algorithms
| Class | Family | Reference |
|---|---|---|
| Reweighing | Weighting | Kamiran & Calders (2012) |
| FairBalance | Weighting | Yu et al. (2024) |
| ReweighingClassifier | Meta-estimator | — |
| FairBalanceClassifier | Meta-estimator | — |
| Massaging | Label modification | Kamiran & Calders (2012) |
| FairwayRemover | Label modification | Fairway (2019) |
| FairOversampling | Oversampling | Dablan et al. |
| FairSmote | Oversampling | Chakraborty et al. (2021) |
| FAWOS | Oversampling | Salazar et al. (2021) |
| HeterogeneousFOS | Oversampling | Sonoda et al. (2023) |
| DisparateImpactRemover | Feature transformation | Feldman et al. (2015) |
| OptimizedPreprocessing | Feature transformation | Calmon et al. (2017) |
| LearningFairRepresentations | Feature transformation | Zemel et al. (2013) |
| FairMask | Meta-estimator | Peng et al. (2021) |
| IntersectionalBinarizer | Utility | — |
| DropColumns | Utility | — |
Usage patterns
Each family of algorithms has its own API contract.
Samplers — fit_resample(X, y)
Label-modification and oversampling methods return a resampled dataset. They extend imblearn.BaseSampler and work directly inside an imblearn.Pipeline.
from skfair.preprocessing import FairSmote
sampler = FairSmote(sens_attr="sex", random_state=0)
X_resampled, y_resampled = sampler.fit_resample(X_train, y_train)
Weighting methods — fit_transform(X, y)
Reweighing and FairBalance return the original X unchanged alongside a weight Series. Pass the weights to your classifier via sample_weight.
from skfair.preprocessing import Reweighing
rw = Reweighing(sens_attr="sex", priv_group=1)
X_unchanged, weights = rw.fit_transform(X_train, y_train)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_unchanged, y_train, sample_weight=weights)
Classifier wrappers — standard fit / predict
ReweighingClassifier and FairBalanceClassifier encapsulate the weighting step inside a full sklearn-compatible classifier, including sample_weight handling.
from skfair.preprocessing import ReweighingClassifier
clf = ReweighingClassifier(
    estimator=LogisticRegression(max_iter=1000),
    sens_attr="sex",
    priv_group=1,
)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
Feature transformers — fit_transform(X)
DisparateImpactRemover, OptimizedPreprocessing, and LearningFairRepresentations transform X directly and slot into sklearn.Pipeline as standard transformers.
from skfair.preprocessing import DisparateImpactRemover
repair = DisparateImpactRemover(
    sensitive_attribute="sex",
    repair_columns=["age", "hours-per-week"],
    lambda_param=1.0,
)
X_repaired = repair.fit_transform(X_train)
Example Pipeline
Combine preprocessing with downstream estimators, optionally using DropColumns to remove the sensitive attribute just before the classifier.
from imblearn.pipeline import Pipeline
from skfair.preprocessing import FairSmote, DropColumns
pipe = Pipeline([
    ("fair_smote", FairSmote(sens_attr="sex", random_state=42)),
    ("drop_sens", DropColumns("sex")),  # optional
    ("classifier", LogisticRegression(solver="liblinear", max_iter=1000, random_state=42)),
])
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
Tip: We recommend always using imblearn.pipeline.Pipeline — it extends sklearn's Pipeline with fit_resample support, so it works with all scikit-fair methods (transformers, samplers, and meta-estimators) without needing to switch imports.
Intersectional privilege
Define complex, multi-column privilege criteria with IntersectionalBinarizer.
from skfair.preprocessing import IntersectionalBinarizer
binarizer = IntersectionalBinarizer(
    privileged_definition={"race": "White", "sex": "Male"},
    group_col_name="_is_privileged",
)
X_with_group = binarizer.fit_transform(X_train)
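The derived column can then serve as the sensitive attribute for any method above. A minimal sketch, assuming downstream methods accept the new column by the name set in group_col_name:
from skfair.preprocessing import Reweighing
# Reweigh on the intersectional group instead of a single raw attribute
rw = Reweighing(sens_attr="_is_privileged", priv_group=1)
X_unchanged, weights = rw.fit_transform(X_with_group, y_train)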
Metrics
Nine group-fairness metrics share the unified signature metric(y_true, y_pred, sensitive_attr); the nine performance metrics follow the standard metric(y_true, y_pred) form, as in the examples below.
Fairness metrics
| Function | Definition | Perfect value |
|---|---|---|
| disparate_impact | P(Y=1\|S=0) / P(Y=1\|S=1) | 1.0 |
| statistical_parity_difference | P(Y=1\|S=0) - P(Y=1\|S=1) | 0.0 |
| equal_opportunity_difference | TPR(S=0) - TPR(S=1) | 0.0 |
| equal_opportunity_ratio | TPR(S=0) / TPR(S=1) | 1.0 |
| average_odds_difference | 0.5 × [(FPR diff) + (TPR diff)] | 0.0 |
| true_negative_rate_difference | TNR(S=0) - TNR(S=1) | 0.0 |
| false_negative_rate_difference | FNR(S=0) - FNR(S=1) | 0.0 |
| predictive_equality | FPR(S=0) / FPR(S=1) | 1.0 |
| accuracy_parity | Acc(S=0) / Acc(S=1) | 1.0 |
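As a sanity check on the first two definitions, they can be reproduced by hand with numpy (the arrays below are illustrative placeholders, not library output):
import numpy as np

y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])  # hypothetical predictions
sens   = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 1 = privileged group

p_unpriv = y_pred[sens == 0].mean()  # P(Y=1 | S=0) = 0.75
p_priv   = y_pred[sens == 1].mean()  # P(Y=1 | S=1) = 0.25

print(p_unpriv / p_priv)   # disparate impact = 3.0
print(p_unpriv - p_priv)   # statistical parity difference = 0.5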
Performance metrics
accuracy, true_positive_rate, false_positive_rate, true_negative_rate, false_negative_rate, balanced_accuracy, precision, recall, f1_score.
from skfair.metrics import (
    disparate_impact,
    statistical_parity_difference,
    equal_opportunity_difference,
    predictive_equality,
    accuracy,
    balanced_accuracy,
    precision,
    recall,
    f1_score,
)
sens = X_test["sex"].values
print(f"Accuracy: {accuracy(y_test.values, y_pred):.3f}")
print(f"Balanced accuracy: {balanced_accuracy(y_test.values, y_pred):.3f}")
print(f"Precision: {precision(y_test.values, y_pred):.3f}")
print(f"Recall: {recall(y_test.values, y_pred):.3f}")
print(f"F1 score: {f1_score(y_test.values, y_pred):.3f}")
print(f"Disparate impact: {disparate_impact(y_test.values, y_pred, sens):.3f}")
print(f"Stat. parity diff: {statistical_parity_difference(y_test.values, y_pred, sens):.3f}")
print(f"Equal opp. diff: {equal_opportunity_difference(y_test.values, y_pred, sens):.3f}")
print(f"Pred. equality: {predictive_equality(y_test.values, y_pred, sens):.3f}")
Datasets
Five standard fairness benchmarks are bundled.
| Loader | Samples | Features | Sensitive attribute | Label |
|---|---|---|---|---|
| load_adult | 48 842 | 14 | sex (1 = male) | income > 50k |
| load_german | 1 000 | 20 | sex | credit risk |
| load_heart_disease | 740 | 13 | sex | heart disease |
| load_compas | ~7 214 | 11 | sex, race | two-year recidivism |
| load_ricci | 118 | 5 | Race | promotion eligibility |
from skfair.datasets import load_adult, load_german, load_heart_disease, load_compas, load_ricci
X, y = load_adult(preprocessed=True)
X, y = load_german()
X, y = load_heart_disease()
X, y = load_compas()
X, y = load_ricci()
Audit
The audit module provides data-level and prediction-level fairness analysis.
BiasAuditor — pre-model data analysis
Examines sensitive-group proportions, target rates, and feature distributions before training.
from skfair.audit import BiasAuditor
auditor = BiasAuditor(X_train, y_train, sens_attr="sex")
print(auditor.group_proportions())
print(auditor.target_rate_by_group())
auditor.plot_summary()
FairnessAuditor — post-model prediction analysis
Evaluates how fair a model's predictions are across groups.
from skfair.audit import FairnessAuditor
fa = FairnessAuditor(y_test.values, y_pred, X_test["sex"].values)
print(fa.performance_by_group())
print(fa.fairness_metrics())
fa.plot_fairness_radar()
Comparison
The comparison module provides a ComparisonReport for comparing multiple preprocessing methods across datasets and classifiers.
ComparisonReport expects a DataFrame with the following columns:
| Column | Required | Description |
|---|---|---|
| dataset | yes | Dataset name (e.g. "adult", "compas") |
| method | yes | Preprocessing method name (e.g. "Massaging", "FairSmote") |
| classifier | yes | Classifier name (e.g. "LogReg") |
| {metric} | yes (at least one) | Value for each metric (e.g. accuracy, spd) |
| {metric}_std | no | Standard deviation — included when Experiment(std=True), not used by plots |
This is the format returned by Experiment.run(), but you can also build it manually.
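For instance, a minimal frame built by hand (the metric values here are placeholders, not real benchmark results):
import pandas as pd

results_df = pd.DataFrame({
    "dataset":    ["adult", "adult", "compas", "compas"],
    "method":     ["Massaging", "FairSmote", "Massaging", "FairSmote"],
    "classifier": ["LogReg", "LogReg", "LogReg", "LogReg"],
    "accuracy":   [0.84, 0.83, 0.66, 0.65],      # placeholder values
    "spd":        [-0.08, -0.05, -0.12, -0.09],  # placeholder values
})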
from skfair.comparison import ComparisonReport
report = ComparisonReport(results_df)
# Summary tables — pivot of metric means per method, averaged over classifiers
tables = report.summary_tables()
# Performance bar charts
report.plot_metric_bar(metric="accuracy")
# Fairness bar chart for a single metric
report.plot_metric_bar(metric="spd")
# Accuracy vs |fairness| scatter — ideally a method sits in the top-right corner
report.plot_tradeoff(fairness_metric="spd", performance_metric="accuracy")
# Heatmap ranking methods per dataset across all metrics
report.plot_ranking()
# Or generate all plots at once
report.plot_all(fairness_metric="spd")
# Export a self-contained HTML report
report.to_html("report.html")
Experimentation
The experimentation module automates dataset × method × classifier experiments with cross-validation.
from skfair.experimentation import Experiment
exp = Experiment(
    datasets=["adult", "compas"],
    methods=["Massaging", "FairSmote", "ReweighingClassifier"],
    n_splits=5,
)
results = exp.run()
# Generate a ComparisonReport
report = exp.to_report()
report.plot_metric_bar(metric="accuracy")
Experiments can also be configured via YAML files:
exp = Experiment.from_config("config.yaml")
results = exp.run()
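A plausible config.yaml mirroring the constructor arguments above might look like this (a hypothetical sketch; see the documentation for the exact schema):
# config.yaml (keys assumed to mirror the Experiment constructor)
datasets:
  - adult
  - compas
methods:
  - Massaging
  - FairSmote
  - ReweighingClassifier
n_splits: 5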
Example notebooks
The examples/ folder contains step-by-step Jupyter notebooks that walk through every module:
| Notebook | Description |
|---|---|
| 01_datasets | Loading, exploring, and preprocessing the bundled datasets |
| 02_methods | Using fairness methods — transformers, samplers, and meta-estimators |
| 03_audit | Pre-model bias analysis and post-model fairness auditing |
| 04_comparison | Comparing methods side-by-side with ComparisonReport |
| 05_experiment | Running cross-validated experiments with Experiment |
| 05a_experiment_config | Configuring experiments from Python and YAML |
| 05b_custom_datasets | Using custom (user-provided) datasets in experiments |
| 06_benchmark | Full-scale benchmark driven by a YAML config file |
License
BSD 3-Clause. See LICENSE for details.