
FairRankTune: a Python package for fairness-aware data generation, metrics, and algorithms.

Project description


📍 Introduction

FairRankTune is an open-source Python toolkit supporting end-to-end fair ranking workflows, analysis, auditing, and experimentation. It provides researchers, practitioners, and educators with a self-contained module for generating ranked data, applying ranking strategies, and computing popular ranking-based fairness metrics.

For a quick overview, follow the Usage section.

For an in-depth overview, follow the Examples section.

✨ Features

🎨 Fairness-Aware Ranked Data Generation

RankTune is a pseudo-stochastic data generation method for creating fairness-aware ranked lists based on the fairness concept of statistical parity. Included in the RankTune module, it creates ranking(s) controlled by the representativeness parameter phi. When phi = 0, the generated ranked list(s) do not represent groups fairly; as phi increases, groups are represented more and more fairly, until at phi = 1 groups are represented fairly. RankTune can generate ranked data from user-provided group sizes or from existing datasets, and can also produce relevance scores accompanying the ranked list(s).

Please refer to the documentation for additional information.
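To make the role of phi concrete, here is a toy sketch of a phi-controlled generator. This is an illustration of the idea only, not RankTune's actual algorithm: at each rank position it takes a "fair" step (sample a group proportionally to its remaining size) with probability phi, and otherwise takes a "biased" step (always draw from the first group). The `toy_phi_ranking` helper and its item naming are hypothetical.

```python
import random

def toy_phi_ranking(group_counts, phi, seed=None):
    """Illustrative sketch (NOT RankTune's algorithm): phi = 0 places
    groups in blocks (unfair); phi = 1 draws items proportionally to
    remaining group sizes (statistical parity)."""
    rng = random.Random(seed)
    # One pool of item labels per group, e.g. "g0_item3".
    pools = {g: [f"g{g}_item{i}" for i in range(n)]
             for g, n in enumerate(group_counts)}
    remaining = dict(enumerate(group_counts))
    ranking = []
    while any(remaining.values()):
        groups = [g for g, n in remaining.items() if n > 0]
        if rng.random() < phi:
            # Fair step: sample a group proportionally to remaining size.
            g = rng.choices(groups, weights=[remaining[g] for g in groups], k=1)[0]
        else:
            # Biased step: always take from the lowest-indexed group first.
            g = groups[0]
        ranking.append(pools[g].pop())
        remaining[g] -= 1
    return ranking
```

With phi = 0 this yields all of group 0 above all of group 1; with phi = 1 the groups are interleaved in expectation.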

📏 Metrics

FairRankTune provides several metrics for evaluating the fairness of ranked lists in the Metrics module. The table below provides a high-level overview of each metric. These metrics encompass a variety of fair ranking metrics, including both group and individual fairness, along with both score-based and statistical parity metrics.

| Metric | Abbreviation | Fairness (Group or Individual) | Score-based | Statistical Parity | Reference |
| --- | --- | --- | --- | --- | --- |
| Group Exposure | EXP | Group | No | Yes | Singh et al. |
| Exposure Utility | EXPU | Group | Yes | No | Singh et al. |
| Exposure Realized Utility | EXPRU | Group | Yes | No | Singh et al. |
| Attention Weighted Rank Fairness | AWRF | Group | No | Yes | Sapiezynski et al. |
| Exposure Rank Biased Precision Equality | ERBE | Group | No | No | Kirnap et al. |
| Exposure Rank Biased Precision Proportionality | ERBP | Group | No | Yes | Kirnap et al. |
| Exposure Rank Biased Precision Proportional to Relevance | ERBR | Group | Yes | No | Kirnap et al. |
| Attribute Rank Parity | ARP | Group | No | Yes | Cachel et al. |
| Normalized Discounted KL-Divergence | NDKL | Group | No | Yes | Geyik et al. |
| Inequity of Amortized Attention | IAA | Individual | Yes | No | Biega et al. |

Please refer to the Metrics documentation for further details.
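As a concrete example of the exposure-style metrics in the table, a common formulation (following Singh et al.) gives the item at rank i (1-indexed) an attention weight of 1/log2(i + 1), averages these weights per group, and then compares the group averages, e.g. via a min/max ratio (1 = perfectly equal) or a max-min difference (0 = perfectly equal). The hand computation below assumes this standard logarithmic discount; FairRankTune's exact weighting may differ, and `group_exposure` is an illustrative helper, not part of the library.

```python
import math

def group_exposure(ranking, item_group):
    """Average position-discounted exposure per group, using the
    common logarithmic discount 1/log2(rank + 1), rank starting at 1."""
    totals, counts = {}, {}
    for rank, item in enumerate(ranking, start=1):
        g = item_group[item]
        totals[g] = totals.get(g, 0.0) + 1.0 / math.log2(rank + 1)
        counts[g] = counts.get(g, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}

ranking = ["a", "b", "c", "d"]
item_group = {"a": "M", "b": "M", "c": "W", "d": "W"}
avg = group_exposure(ranking, item_group)
min_max_ratio = min(avg.values()) / max(avg.values())  # 1.0 = perfectly fair
max_min_diff = max(avg.values()) - min(avg.values())   # 0.0 = perfectly fair
```

Here group "M" holds the top two positions, so its average exposure exceeds group "W"'s and the min/max ratio falls below 1.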

⚖️ Fair Ranking Methods

FairRankTune provides several fair ranking algorithms in the Rankers module. The DetConstSort and Epsilon-Greedy fair ranking algorithms can be used to re-rank a given ranking with the objective of making the resulting ranking fair.

Please refer to the documentation for further details.
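The Epsilon-Greedy idea can be sketched simply: when filling each position from the top, with probability epsilon take a uniformly random item from the remaining pool instead of the best-scored one, which pushes lower-ranked (often under-exposed) items upward. The following is a minimal conceptual sketch, not the library's implementation; `epsilon_greedy_rerank` is a hypothetical helper.

```python
import random

def epsilon_greedy_rerank(items, scores, epsilon, seed=None):
    """Sketch of epsilon-greedy re-ranking: at each position, with
    probability epsilon pick a random remaining item (exploration),
    otherwise pick the remaining item with the highest score (greedy)."""
    rng = random.Random(seed)
    remaining = list(items)
    reranked = []
    while remaining:
        if rng.random() < epsilon:
            pick = rng.choice(remaining)                       # exploration
        else:
            pick = max(remaining, key=lambda it: scores[it])   # greedy
        remaining.remove(pick)
        reranked.append(pick)
    return reranked
```

With epsilon = 0 this reproduces the score ordering exactly; with epsilon = 1 it is a random shuffle of the input.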

🔌 Requirements

python>=3.8

As of v.0.0.6, FairRankTune requires python>=3.8.

💾 Installation

pip install FairRankTune

💡 Usage

🎨 Fairness-Aware Ranked Data Generation

RankTune can generate ranking(s) from group_proportions, a NumPy array giving each group's proportion of the num_items total items, using the GenFromGroups() function.

import FairRankTune as frt
import numpy as np
import pandas as pd
from FairRankTune import RankTune, Metrics

#Generate a biased (phi = 0.1) ranking of 1000 items, with four groups of 100, 200, 300, and 400 items.
group_proportions = np.asarray([.1, .2, .3, .4]) #Array of group proportions
num_items = 1000 #1000 items to be in the generated ranking
phi = 0.1
r_cnt = 1 #Generate 1 ranking
seed = 10 #For reproducibility
ranking_df, item_group_dict = frt.RankTune.GenFromGroups(group_proportions, num_items, phi, r_cnt, seed)

#Calculate EXP with a MinMaxRatio
EXP_minmax, avg_exposures_minmax = frt.Metrics.EXP(ranking_df, item_group_dict, 'MinMaxRatio')
print("EXP of generated ranking: ", EXP_minmax, "avg_exposures: ", avg_exposures_minmax)

Output:

EXP of generated ranking:  0.511665941043515 avg_exposures:  {0: 0.20498798214669187, 1: 0.13126425437156242, 2: 0.11461912123646827, 3: 0.10488536878769836}

We can confirm this is an unfair ranking from the low EXP value.

RankTune can also generate ranking(s) from item_group_dict, a dictionary mapping each item to its group, using the GenFromItems() function.

import FairRankTune as frt
import numpy as np
import pandas as pd
from FairRankTune import RankTune, Metrics

#Generate a biased (phi = 0.1) ranking
item_group_dict = dict(Joe= "M",  David= "M", Bella= "W", Heidi= "W", Amy = "W", Jill= "W", Jane= "W", Dave= "M", Nancy= "W", Nick= "M")
phi = 0.1
r_cnt = 1 #Generate 1 ranking
seed = 10 #For reproducibility
ranking_df, item_group_dict = frt.RankTune.GenFromItems(item_group_dict, phi, r_cnt, seed)

#Calculate EXP with a MinMaxRatio
EXP_minmax, avg_exposures_minmax = frt.Metrics.EXP(ranking_df, item_group_dict, 'MinMaxRatio')
print("EXP of generated ranking: ", EXP_minmax, "avg_exposures: ", avg_exposures_minmax)

Output:

EXP of generated ranking:  0.5158099476966725 avg_exposures:  {'M': 0.6404015779112127, 'W': 0.33032550440724917}

We can confirm this is a biased ranking based on the low EXP score and the large difference in average exposure between the 'M' and 'W' groups.

For further detail on how to use RankTune to generate relevance scores see the RankTune documentation.

📏 Metrics

import FairRankTune as frt
import pandas as pd
import numpy as np
ranking_df = pd.DataFrame(["Joe", "Jack", "Nick", "David", "Mark", "Josh", "Dave",
                          "Bella", "Heidi", "Amy"])
item_group_dict = dict(Joe= "M",  David= "M", Bella= "W", Heidi= "W", Amy = "W", Mark= "M", Josh= "M", Dave= "M", Jack= "M", Nick= "M")
#Calculate EXP with a MaxMinDiff
EXP, avg_exposures = frt.Metrics.EXP(ranking_df, item_group_dict, 'MaxMinDiff')
print("EXP: ", EXP, "avg_exposures: ", avg_exposures)

Output:

EXP:  0.21786100126614577 avg_exposures:  {'M': 0.5197142341886783, 'W': 0.3018532329225326}

⚖️ Fair Ranking Algorithms

import FairRankTune as frt
import numpy as np
import pandas as pd
from FairRankTune import RankTune, Metrics
import random

#Generate a biased (phi = 0) ranking of 1000 items, with two groups of 100 and 900 items.
group_proportions = np.asarray([.1, .9]) #Array of group proportions
num_items = 1000 #1000 items to be in the generated ranking
phi = 0 #Biased ranking
r_cnt = 1 #Generate 1 ranking
seed = 10 #For reproducibility
ranking_df, item_group_dict, scores_df = frt.RankTune.ScoredGenFromGroups(group_proportions, num_items, phi, r_cnt, 'uniform', seed)

#Calculate EXP with a MinMaxRatio
EXP_minmax, avg_exposures_minmax = frt.Metrics.EXP(ranking_df, item_group_dict, 'MinMaxRatio')
print("EXP before Epsilon-Greedy: ", EXP_minmax, "avg_exposures before Epsilon-Greedy: ", avg_exposures_minmax)


#Rerank using Epsilon-Greedy
seed = 2 #For reproducibility
epsilon = .6 
reranking_df, item_group_d, reranking_scores = frt.Rankers.EPSILONGREEDY(ranking_df, item_group_dict, scores_df, epsilon, seed)

#Calculate EXP with a MinMaxRatio post Epsilon-Greedy
EXP, avg_exposures = frt.Metrics.EXP(reranking_df, item_group_d, 'MinMaxRatio')
print("EXP after Epsilon-Greedy: ", EXP, "avg_exposures after Epsilon-Greedy: ", avg_exposures)

Output:

EXP before Epsilon-Greedy:  0.5420744267551784 avg_exposures before Epsilon-Greedy:  {0: 0.2093867087428094, 1: 0.11350318011191189}
EXP after Epsilon-Greedy:  0.7689042373241246 avg_exposures after Epsilon-Greedy:  {0: 0.15541589156986096, 1: 0.1194999375755728}

We can see that the EXP fairness score improved from running Epsilon-Greedy. For more usage examples please see the documentation.

📖 Examples

| Topic | Link |
| --- | --- |
| Quickstart | Open In Colab |
| RankTune Overview | Open In Colab |
| RankTune Augmenting Datasets | Open In Colab |
| Statistical Parity Metrics | Open In Colab |
| Score-based (Group & Individual) Metrics | Open In Colab |
| Using Fair Ranking Algorithms | Open In Colab |

📚 Documentation

Check out the documentation for more details and example notebooks.

🎓 Citation

If you end up using FairRankTune in your work, please consider citing it:

BibTeX
@misc{CachelFRT,
  author       = {Kathleen Cachel},
  title        = {FairRankTune: A Python Library for Fair Ranking},
  year         = {2023},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/KCachel/fairranktune}}
}

⁉️ Feature Requests

We believe in open-source, community-driven software. Would you like to see other functionality implemented? Please open a feature request. Found a bug or issue? Please open a GitHub issue.

👋 Want to contribute?

Would you like to contribute? Please send me an e-mail.

📄 License

FairRankTune is open-source software licensed under the BSD-3-Clause license.

Project details


Download files

Download the file for your platform.

Source Distribution

fairranktune-0.0.7.tar.gz (1.1 MB)

Uploaded Source

Built Distribution


fairranktune-0.0.7-py3-none-any.whl (21.8 kB)

Uploaded Python 3

File details

Details for the file fairranktune-0.0.7.tar.gz.

File metadata

  • Download URL: fairranktune-0.0.7.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.4

File hashes

Hashes for fairranktune-0.0.7.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 928473c11c4c157da124aede6b9c88f52322aaa6e1ff5a0434e777367662000c |
| MD5 | ffa5130b0ace4e0e1acd34111d8a4bf3 |
| BLAKE2b-256 | df765846234b92a974660d9430f2d3ef02fe082131f81b201a701340b231c7d3 |


File details

Details for the file fairranktune-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: fairranktune-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 21.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.4

File hashes

Hashes for fairranktune-0.0.7-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 6a46ec0dcd5a9d5e4271a4108747656036af44fa998f638623266369a5e9a28c |
| MD5 | 4682c9b6e92d2e827a9f0192b013dafc |
| BLAKE2b-256 | 0f977e0b930b9f711b463dbe41698eef89fb03d4ed881f6fbfd69a4906124f97 |

