A toolkit for understanding and researching algorithmic bias


Project description

README

EthicML exists to combat the problems we've found with off-the-shelf fairness comparison packages.

These other packages are useful, but given that we primarily do research, a lot of the work we do doesn't fit into some nice box. For example, we might want to use a 'fair' pre-processing method on the data before training a classifier on it. We may still be experimenting and only want part of the framework to execute, or we may want to do hyper-parameter optimization. Whilst other frameworks can be modified to do these tasks, you end up with hacked-together approaches that don't lend themselves to being built on in the future. Because of this, we're drawing a line in the sand with some of the other frameworks we've used and building our own.

Why not use XXX?

There is an increasing number of other options: IBM's fair-360, Aequitas, EthicalML/XAI, Fairness-Comparison and others. They're all great at what they do; they're just not right for us. We will, however, be influenced by them.

Design Principles

The Triplet

Given that we're considering fairness, the base of the toolbox is the triplet {x, s, y}:

X - Features
S - Sensitive Label
Y - Class Label

All methods must assume S and Y are multi-class.

We use a named tuple to contain the triplet:

triplet = DataTuple(x=dataframe, s=dataframe, y=dataframe)

The dataframe may be a little inefficient, but given the amount of splicing on conditions that we're doing, it feels worth it.
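To make the triplet concrete, here is a minimal sketch. It assumes DataTuple behaves like a typing.NamedTuple with three pandas.DataFrame fields; the class below is a hypothetical stand-in, not the library's own definition, and the toy data and column names ("age", "hours", "group", "label") are invented for illustration. It also shows the kind of conditional selection that dataframes make convenient.

```python
from typing import NamedTuple

import pandas as pd


class DataTuple(NamedTuple):
    """Hypothetical stand-in for the triplet: features, sensitive label, class label."""

    x: pd.DataFrame
    s: pd.DataFrame
    y: pd.DataFrame


# Toy data, invented for illustration only.
x = pd.DataFrame({"age": [25, 40, 31, 58], "hours": [40, 35, 50, 20]})
s = pd.DataFrame({"group": [0, 1, 0, 1]})
y = pd.DataFrame({"label": [1, 0, 1, 1]})
triplet = DataTuple(x=x, s=s, y=y)

# Keep only the rows where the sensitive label is 0 and the class label is 1.
mask = (triplet.s["group"] == 0) & (triplet.y["label"] == 1)
subset = DataTuple(x=triplet.x[mask], s=triplet.s[mask], y=triplet.y[mask])
```

Because the three frames share an index, one boolean mask selects matching rows from all of them, and the original triplet is left untouched.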

Separation of Methods

We purposefully keep pre, during and post algorithm methods separate. This is because they have different return types.

pre-algorithm.run(train: DataTuple, test: DataTuple) -> Tuple[pandas.DataFrame, pandas.DataFrame]
in-algorithm.run(train: DataTuple, test: DataTuple) -> pandas.DataFrame
post-algorithm.run(preds: pandas.DataFrame, test: DataTuple) -> pandas.DataFrame

where preds is a one-column dataframe with the column name 'preds'.
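The three return types can be sketched as plain functions. Everything below is hypothetical: the function names, the stand-in DataTuple, and the toy logic (mean-centring, a majority-class predictor, an identity post-step) are assumptions chosen only to show the shapes of the inputs and outputs. None of them is a fairness method or the library's implementation.

```python
from typing import NamedTuple, Tuple

import pandas as pd


class DataTuple(NamedTuple):
    """Hypothetical stand-in for the triplet."""

    x: pd.DataFrame
    s: pd.DataFrame
    y: pd.DataFrame


def pre_run(train: DataTuple, test: DataTuple) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Pre-algorithm shape: return transformed train and test features."""
    mean = train.x.mean()  # column means computed on the training set only
    return train.x - mean, test.x - mean


def in_run(train: DataTuple, test: DataTuple) -> pd.DataFrame:
    """In-algorithm shape: return a one-column 'preds' dataframe for the test set."""
    majority = train.y.iloc[:, 0].mode().iloc[0]  # most frequent training label
    return pd.DataFrame({"preds": [majority] * len(test.x)}, index=test.x.index)


def post_run(preds: pd.DataFrame, test: DataTuple) -> pd.DataFrame:
    """Post-algorithm shape: take predictions, return (possibly adjusted) predictions."""
    return preds.copy()  # placeholder: a real method would adjust the predictions


train = DataTuple(
    x=pd.DataFrame({"f": [1.0, 2.0, 3.0]}),
    s=pd.DataFrame({"group": [0, 1, 0]}),
    y=pd.DataFrame({"label": [1, 1, 0]}),
)
test = DataTuple(
    x=pd.DataFrame({"f": [2.0, 4.0]}),
    s=pd.DataFrame({"group": [1, 0]}),
    y=pd.DataFrame({"label": [1, 0]}),
)

new_train_x, new_test_x = pre_run(train, test)
preds = in_run(train, test)
adjusted = post_run(preds, test)
```

Keeping the three kinds separate means a caller always knows whether it is getting back new features (a pair of dataframes) or predictions (a single dataframe).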

General Rules of Thumb

Mutable data structures are bad.
At the very least, functions should be typed.
Readability > Efficiency
Don't get around warnings by just turning them off...
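As a small illustration of the first two rules, the sketch below uses only the standard library. The Split record and the holdout function are hypothetical names invented for this example; they show a typed function that returns a new immutable value instead of modifying its input.

```python
from typing import NamedTuple, Tuple


class Split(NamedTuple):
    """Hypothetical immutable record of which row ids go where."""

    train_ids: Tuple[int, ...]
    test_ids: Tuple[int, ...]


def holdout(ids: Tuple[int, ...], n_test: int) -> Split:
    """Return a new Split with the last n_test ids held out; ids is not modified."""
    cut = len(ids) - n_test
    return Split(train_ids=ids[:cut], test_ids=ids[cut:])


split = holdout((1, 2, 3, 4, 5), n_test=2)

# Fields of a NamedTuple cannot be reassigned.
try:
    split.test_ids = (9,)  # type: ignore[misc]
    mutated = True
except AttributeError:
    mutated = False

# To "change" a field, build a new value; the old one is left as it was.
bigger = split._replace(test_ids=split.test_ids + (6,))
```

With type annotations in place, a checker such as mypy can flag the attempted assignment before the code ever runs.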

Future Plans

Hopefully EthicML becomes a super easy way to look at the biases in different datasets and get a comparison of different models.

Release history

1.3.0 (Nov 11, 2023)
1.2.3 (Sep 29, 2023)
1.2.2 (Sep 20, 2023)
1.2.1 (May 13, 2023)
1.2.0 (Mar 28, 2023)
1.1.0 (Jan 16, 2023)
1.0.2 (Dec 16, 2022)
1.0.1 (Oct 9, 2022)
1.0.0 (Sep 28, 2022)
1.0.0b2 pre-release (Sep 2, 2022)
1.0.0b1 pre-release (Aug 18, 2022)
1.0.0a4 pre-release (Jun 28, 2022)
1.0.0a3 pre-release (Jun 13, 2022)
1.0.0a2 pre-release (Jun 6, 2022)
1.0.0a1 pre-release (May 28, 2022)
1.0.0a0 pre-release (May 27, 2022)
0.7.3 (Jul 21, 2022)
0.7.2 (Jun 28, 2022)
0.7.1 (Mar 18, 2022)
0.7.0 (Mar 17, 2022)
0.6.0 (Mar 11, 2022)
0.5.1 (Dec 23, 2021)
0.5.0 (Dec 22, 2021)
0.4.1 (Dec 21, 2021)
0.4.0 (Dec 10, 2021)
0.3.5 (Nov 12, 2021)
0.3.4 (Oct 13, 2021)
0.3.3 (Sep 27, 2021)
0.3.2 (Aug 24, 2021)
0.3.1 (Jul 30, 2021)
0.3.0 (Jul 30, 2021)
0.2.7 (Jul 5, 2021)
0.2.6 (Jun 29, 2021)
0.2.5 (Jun 14, 2021)
0.2.4 (Jun 14, 2021)
0.2.3 (Jun 3, 2021)
0.2.2 (Mar 14, 2021)
0.2.1 (Mar 13, 2021)
0.2.0 (Jan 25, 2021)
0.1.0a10 pre-release (Sep 7, 2020)
0.1.0a9 pre-release (Aug 19, 2020)
0.1.0a8 pre-release (May 30, 2020)
0.1.0a7 pre-release (Apr 17, 2020)
0.1.0a6 pre-release (Jan 17, 2020)
0.1.0a5 pre-release (Jan 6, 2020)
0.1.0a3 pre-release (Aug 22, 2019)
0.1.0a2 pre-release (Aug 6, 2019), this version
0.1.0a1 pre-release (Jul 1, 2019)
0.1.0a0 pre-release (Jun 27, 2019)

Download files

Source Distribution

EthicML-0.1.0a2.tar.gz (2.5 MB)

Uploaded Aug 6, 2019 Source

Built Distribution

EthicML-0.1.0a2-py3-none-any.whl (2.9 MB)

Uploaded Aug 6, 2019 Python 3

Hashes for EthicML-0.1.0a2.tar.gz
Algorithm	Hash digest
SHA256	`511a6a3bedaaf1e84606c6eaa75aae1f34d53e3f4792e8d9571fb3f7398d6b51`
MD5	`5205a88a2a9250a4211d144d3d0d8613`
BLAKE2b-256	`2bc745732014431f8f78bc7845ebb8e5bdc33043795dd4e590ece9c5672634ca`

Hashes for EthicML-0.1.0a2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a8794c057e3df147ab2ae5215a9cc45e18bd9c48a162d750afa7054ee25cc716`
MD5	`c6c38ffdc14d8bd32f3a2c6cb6ff2f63`
BLAKE2b-256	`9abf733d8352cc2c96d1576ba2a71034f262066c0fb0923bd9b0e57043bf9b64`