# Fair search core for Python

[![image](https://img.shields.io/pypi/status/fairsearchcore.svg)](https://pypi.org/project/fairsearchcore/)
[![image](https://img.shields.io/pypi/v/fairsearchcore.svg)](https://pypi.org/project/fairsearchcore/)
[![image](https://img.shields.io/pypi/pyversions/fairsearchcore.svg)](https://pypi.org/project/fairsearchcore/)
[![image](https://img.shields.io/pypi/l/fairsearchcore.svg)](https://pypi.org/project/fairsearchcore/)
[![image](https://img.shields.io/pypi/implementation/fairsearchcore.svg)](https://pypi.org/project/fairsearchcore/)

This is the Python library with the core algorithms used to do [FA*IR](https://arxiv.org/abs/1706.06368) ranking.

## Installation
To install `fairsearchcore`, simply use `pip` (or `pipenv`):
```bash
pip install fairsearchcore
```
And that's it!

## Using it in your code
You need to import the package first:
```{.sourceCode .python}
import fairsearchcore as fsc
```
Creating and analyzing mtables:
```{.sourceCode .python}
k = 20 # number of topK elements returned (value should be between 10 and 400)
p = 0.25 # proportion of protected candidates in the topK elements (value should be between 0.02 and 0.98)
alpha = 0.1 # significance level (value should be between 0.01 and 0.15)

# create the Fair object
fair = fsc.Fair(k, p, alpha)

# create an mtable using the unadjusted alpha
mtable = fair.create_unadjusted_mtable()
# >> [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3]

# analytically calculate the fail probability
analytical = fair.compute_fail_probability(mtable)
# >> 0.11517506930977106

# create an mtable using the adjusted alpha
mtable = fair.create_adjusted_mtable()
# >> [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

# again, analytically calculate the fail probability
analytical = fair.compute_fail_probability(mtable)
# >> 0.13421772800000065

```
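For intuition about what the unadjusted mtable holds: in the FA*IR paper, the entry for position `i` is the smallest number of protected candidates that the top `i` elements must contain to pass a one-sided binomial test at level `alpha`. The sketch below illustrates that definition in plain Python. It is not the library's implementation, and `binomial_cdf` and `sketch_unadjusted_mtable` are hypothetical helper names introduced only for this example.

```python
from math import comb


def binomial_cdf(m, n, p):
    """P(X <= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m + 1))


def sketch_unadjusted_mtable(k, p, alpha):
    """Hypothetical helper (not part of fairsearchcore).

    Entry i-1 is the smallest m with P(X <= m) >= alpha for
    X ~ Binomial(i, p), i.e. the inverse binomial CDF at alpha.
    """
    mtable = []
    for i in range(1, k + 1):
        m = 0
        while binomial_cdf(m, i, p) < alpha:
            m += 1
        mtable.append(m)
    return mtable


print(sketch_unadjusted_mtable(20, 0.25, 0.1))
```

With `k = 20`, `p = 0.25` and `alpha = 0.1`, this prints the same list as the unadjusted mtable shown above.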
Generate random rankings and analyze them:
```{.sourceCode .python}
M = 10000 # number of rankings you want to generate (works better with big numbers)

# generate rankings using the simulator (M lists of k objects of class fairsearchcore.models.FairScoreDoc)
rankings = fsc.generate_rankings(M, k, p)
# >> [[<FairScoreDoc [Protected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Protected]>,
# <FairScoreDoc [Protected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Protected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Protected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Protected]>],...]

# experimentally calculate the fail probability
experimental = fsc.compute_fail_probability(mtable, rankings)
# >> 0.1076
```
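The idea behind the experimental estimate can be sketched without the library: a ranking "fails" if, at some position, the number of protected candidates seen so far is below the mtable entry for that position, and the fail probability is the fraction of random rankings that fail. The code below is a minimal sketch under that assumption, using plain booleans instead of `FairScoreDoc` objects. `violates_mtable` and `estimate_fail_probability` are hypothetical names, and the value printed is not expected to match the library's output, since the library's indexing and sampling details are not reproduced here.

```python
import random


def violates_mtable(flags, mtable):
    """Return True if some prefix has fewer protected items than required.

    flags[i] is True when the item at position i+1 is protected;
    mtable[i] is the minimum number of protected items in the top i+1.
    """
    protected_so_far = 0
    for is_protected, required in zip(flags, mtable):
        protected_so_far += is_protected
        if protected_so_far < required:
            return True
    return False


def estimate_fail_probability(mtable, p, trials, seed=0):
    """Hypothetical Monte Carlo estimate: share of random rankings that fail."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        # each position is protected independently with probability p
        flags = [rng.random() < p for _ in mtable]
        failures += violates_mtable(flags, mtable)
    return failures / trials


unadjusted = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3]
print(estimate_fail_probability(unadjusted, p=0.25, trials=10000))
```

As with the simulator, larger values of `trials` give a more stable estimate.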
Apply a fair re-ranking to a given ranking:
```{.sourceCode .python}
# import the FairScoreDoc class
from fairsearchcore.models import FairScoreDoc

# let's manually create an unfair ranking (False -> unprotected, True -> protected)
unfair_ranking = [FairScoreDoc(20, 20, False), FairScoreDoc(19, 19, False), FairScoreDoc(18, 18, False),
                  FairScoreDoc(17, 17, False), FairScoreDoc(16, 16, False), FairScoreDoc(15, 15, False),
                  FairScoreDoc(14, 14, False), FairScoreDoc(13, 13, False), FairScoreDoc(12, 12, False),
                  FairScoreDoc(11, 11, False), FairScoreDoc(10, 10, False), FairScoreDoc(9, 9, False),
                  FairScoreDoc(8, 8, False), FairScoreDoc(7, 7, False), FairScoreDoc(6, 6, True),
                  FairScoreDoc(5, 5, True), FairScoreDoc(4, 4, True), FairScoreDoc(3, 3, True),
                  FairScoreDoc(2, 2, True), FairScoreDoc(1, 1, True)]

# now re-rank the unfair ranking
fair.re_rank(unfair_ranking)
# >> [<FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Protected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>, <FairScoreDoc [Nonprotected]>,
# <FairScoreDoc [Protected]>, <FairScoreDoc [Protected]>, <FairScoreDoc [Protected]>,
# <FairScoreDoc [Protected]>, <FairScoreDoc [Protected]>]
```
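The re-ranking step can be pictured as a greedy merge of two score-sorted queues, as described in the FA*IR paper: at each position, if the mtable requires more protected candidates than have been placed so far, the best remaining protected candidate is taken; otherwise the best remaining candidate overall is taken. The sketch below shows that idea on plain `(score, is_protected)` tuples. It is an illustration under these assumptions rather than the library's code, and `sketch_fair_rerank` is a hypothetical name.

```python
from collections import deque


def sketch_fair_rerank(ranking, mtable):
    """Hypothetical helper (not part of fairsearchcore).

    ranking: list of (score, is_protected) tuples.
    mtable[i]: minimum number of protected items required in the top i+1.
    """
    ordered = sorted(ranking, key=lambda item: item[0], reverse=True)
    protected = deque(item for item in ordered if item[1])
    nonprotected = deque(item for item in ordered if not item[1])

    result = []
    protected_count = 0
    for required in mtable[:len(ordered)]:
        need_protected = protected_count < required
        take_protected = protected and (
            need_protected
            or not nonprotected
            or protected[0][0] >= nonprotected[0][0]
        )
        if take_protected:
            result.append(protected.popleft())
            protected_count += 1
        else:
            # also reached when a protected item is required but none is left
            result.append(nonprotected.popleft())
    return result


# scores 20..1; the six lowest-scored items are protected, as in the example above
unfair = [(score, score <= 6) for score in range(20, 0, -1)]
adjusted = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
reranked = sketch_fair_rerank(unfair, adjusted)
print([is_protected for _, is_protected in reranked])
```

Using the adjusted mtable listed earlier, the sketch places the first protected item at position 9, which is the same protected/non-protected pattern as the `re_rank` output above.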

Every function in the library is documented in the source code.

## Development

1. Clone this repository: `git clone https://github.com/fair-search/fairsearchcore-python.git`
2. Change to the directory where you cloned the repository: `cd WHERE_ITS_DOWNLOADED/fairsearchcore-python`
3. Use any IDE to work with the code

## Testing

Just run:
```bash
python setup.py test
```
*Note*: The simulator tests take a *looong* time to execute.

## Credits

The FA*IR algorithm is described in this paper:

* Meike Zehlike, Francesco Bonchi, Carlos Castillo, Sara Hajian, Mohamed Megahed, Ricardo Baeza-Yates: "[FA*IR: A Fair Top-k Ranking Algorithm](https://doi.org/10.1145/3132847.3132938)". Proc. of the 2017 ACM on Conference on Information and Knowledge Management (CIKM).

This code was developed by [Ivan Kitanovski](http://ivankitanovski.com/) based on the paper. See the [license](https://github.com/fair-search/fairsearch-core/blob/master/python/LICENSE) file for more information.

## See also

You can also see the [FA*IR plug-in for ElasticSearch](https://github.com/fair-search/fairsearch-elasticsearch-plugin).
