fairness-datasets

PyTorch dataset wrappers for several popular datasets from fair machine learning research.

The following datasets are wrapped:

Installation

pip install fairness-datasets

Basic Usage

from fairnessdatasets import Adult

# load (and, if necessary, download) the Adult training set
train_set = Adult(root="datasets", download=True)
# load the test set
test_set = Adult(root="datasets", train=False, download=True)

inputs, target = train_set[0]  # retrieve the first sample of the training set

# iterate over the training set
for inputs, target in train_set:
    ...  # Do something with a single sample

# use a PyTorch data loader
from torch.utils.data import DataLoader

loader = DataLoader(test_set, batch_size=32, shuffle=True)
for epoch in range(100):
    for inputs, targets in loader:
        ...  # Do something with a batch of samples

You can use Adult(..., raw=True) to turn off the one-hot encoding and z-score normalization applied by the Adult class by default.
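
To make the two default transforms concrete, here is a minimal pure-Python sketch of one-hot encoding and z-score normalization on toy data. It is not the library's implementation: the helper names `one_hot` and `z_score`, the toy columns, and the choice of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


def z_score(values):
    """Rescale numeric values to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]


# Toy columns invented for this example (not taken from the Adult dataset)
ages = [20.0, 30.0, 40.0]
sectors = ["private", "public", "private"]

encoded_sectors = one_hot(sectors)  # one indicator column per category
scaled_ages = z_score(ages)         # centered at 0, spread of 1
```

With raw=True, values would presumably stay in their original form (such as 20.0 or "private") instead of being transformed this way.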

The remaining dataset classes can be used in the same way as Adult. However, these datasets don't come with a fixed train/test split, so each dataset instance always contains all of the data. To create a train/test split, use random_split:

import torch
from fairnessdatasets import Default
from torch.utils.data import random_split

dataset = Default(root="datasets", download=True)

rng = torch.Generator().manual_seed(42)  # for reproducible results
# fractional lengths require PyTorch 1.13 or newer
train_set, test_set = random_split(dataset, [0.7, 0.3], generator=rng)
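
The idea behind a seeded split can be sketched without PyTorch. The hypothetical helper `split_indices` below shuffles sample indices with a seeded generator and cuts them 70/30. It only illustrates why a fixed seed makes the split reproducible and does not reproduce the internals of random_split.

```python
import random


def split_indices(n_samples, train_fraction, seed):
    """Shuffle range(n_samples) with a seeded RNG and cut it into two parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # same seed, same permutation
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(10, 0.7, seed=42)
again_train, again_test = split_indices(10, 0.7, seed=42)

# The two calls agree because both used seed 42
print(train_idx == again_train and test_idx == again_test)
```

Every index lands in exactly one of the two parts, so no sample appears in both the training and the test set.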

Advanced Usage

Turn off status messages while downloading the dataset:

Adult(root=..., output_fn=None)

Send the download status messages to the logging module instead of writing them to sys.stdout:

import logging

Adult(root=..., output_fn=logging.info)
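
The callback pattern behind output_fn can be sketched with a stand-in function. `fake_download` and its messages are hypothetical and are not part of fairness-datasets. The sketch assumes that output_fn is either None or a callable that takes one string, which is what the two examples above suggest.

```python
import logging


def fake_download(output_fn=print):
    """Hypothetical stand-in for a download that reports its progress."""
    steps = ["Downloading...", "Checking integrity...", "Done."]
    for message in steps:
        if output_fn is not None:  # None silences all status messages
            output_fn(message)
    return len(steps)


collected = []
fake_download(output_fn=collected.append)  # capture the messages in a list
fake_download(output_fn=None)              # emit nothing

# The root logger hides INFO messages unless logging is configured
logging.basicConfig(level=logging.INFO)
fake_download(output_fn=logging.info)
```

When passing logging.info, remember to configure logging at INFO level or lower. Otherwise the default WARNING threshold of the root logger discards the status messages.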
